mdaeron / D47crunch

Python library for processing and standardizing carbonate clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48, and Δ49 values with fully propagated analytical error estimates.
Other
8 stars 3 forks source link

decide on how to make plotting interfaces consistent #10

Open japhir opened 2 years ago

japhir commented 2 years ago

Continuing the discussion started in #9:

I need to go back to the various plotting functions, but you're right that they lack consistency. The thing to decide first is what should be the default behavior?

* return a matplotlib `Figure` object by defaut, which means that most aspects of the figure may be modified before displaying/saving it; return `None` and save to disk with some optional parameter (e.g. `output = 'savefig'`)

* return `None` and save to disk by default; return a `Figure` object with some optional parameter (e.g. `output = 'fig'`)

In both cases, output = 'ax' could be used to ask for a matplotlib Axes object, which allows including the plot in a larger figure.

Any opinion on the best default behavior? Any change needs to be deliberate because it will (slightly) break backward compatibility.

I had the same problem with clumpedr's plotting functions, where I had to decide on whether plots were generated by default, whether they were returned as plot objects or printed or saved. In the end I decided to remove plotting from all the processing functions themselves, and to let the plots always return the plot object for further tweaking, letting the user decide how to print/save for themselves.

I'm not sure what is best though: are the users of D47crunch going to be python users? In that case it's probably nicest to return the plot objects for futher tweaking. If they're going to be python novices (like me) it might be nicer to save PDFs by default, so that you don't need to do more in python but can just look at the output. I think in this case, the second option may actually be better! Where the advanced users can specify that they want the plotting function to return the plot object in stead. What are your thoughts?

mdaeron commented 2 years ago

I think the default target user should be be someone who doesn't know/care about Python code, so these functions should by default create a new figure and save it to a pdf file. There should be an option no not save it to a file but return the new figure (for those wishing to further modify the figure before saving) and an option to not create a new figure but instead return the newly created Axes() instance (for those wishing to include the plot in a larger, more complex figure).

Something like this should fit the bill (needs more code to create enclosing directories if they don't exist yet)

import numpy as np
from matplotlib import pyplot as ppl

def foo(output = 'fig', savefig = 'foo.pdf'):
    if output == 'fig':
        fig = ppl.figure()
    ax = ppl.subplot(111)
    x = np.linspace(0,10)
    y = np.sin(x)
    ppl.plot(x, y, 'r-', lw = 2)
    if output == 'fig':
        if savefig:
            ppl.savefig(savefig)
        return fig
    elif output == 'ax':
        return ax

If that seems suitable, the next thing to do is to check that all plotting functions follow this template.

japhir commented 2 years ago

(needs more code to create enclosing directories if they don't exist yet)

Do you mean if you would specify savefig = "bar/baz/foo.pdf" ? Or should it create the figure in a subdirectory (parameter?) by default, as is currently done for e.g. plot_sessions:


    def plot_sessions(self, dir = 'output', figsize = (8,8)):
        '''
        Generate session plots and save them to disk.

        **Parameters**

        + `dir`: the directory in which to save the plots
        + `figsize`: the width and height (in inches) of each plot
        '''
        if not os.path.exists(dir):
            os.makedirs(dir)

I also thought the axes creation should happen within the elif, right?

import numpy as np
from matplotlib import pyplot as ppl

def foo(output = 'fig', savefig = 'foo.pdf'):
    if output == 'fig':
        fig = ppl.figure()
    x = np.linspace(0,10)
    y = np.sin(x)
    ppl.plot(x, y, 'r-', lw = 2)
    if output == 'fig':
        if savefig:
            ppl.savefig(savefig)
        return fig
    elif output == 'ax':
        ax = ppl.subplot(111)  # moved this down here
        return ax
mdaeron commented 2 years ago

The path issues can be simplified:

from os import makedirs
from pathlib import Path
makedirs(Path('foo/bar/baz.txt').parent)
# creates ./foo and/or ./foo/bar if they do not exist yet

The ax should be created before plotting anything (calling it again is likely to delete previously existing plots). Alternatively, don't define ax and simply return ppl.gca() ("get current axes").

The implementation is straightforward; what matters is to define the optimal user-facing behavior. It seems to me that a single parameter covering both save directory and filename is simpler than the existing parameters.

japhir commented 2 years ago

Hmm I'm still undecided, but I do think splitting the filename from the directory arguments might be more intuitive to some users. Or maybe even set one image_directory variable or something so that all plots are put there? This would allow you to use the same output directory for multiple figures without re-typing but still allow you to use different filenames...

Perhaps the directory argument should be nil and default to the current working directory. It would be nice if it would also support absolute paths (e.g. /home/user/japhir/Pictures/nicefigurefromD47crunch.pdf should work?)

timpol commented 10 months ago

Sorry to stick my oar in, but for what it's worth, my preference would be for something like this:

from matplotlib import pyplot as ppl

def foo(ax=None, figname=None):

    # data generated by foo
    x = (1, 2, 3)
    y = (1, 4, 9)

    if ax is not None:
        # Plot data on user supplied ax; do not return anything.
        ax.plot(x, y)
    else:
        fig, ax = ppl.subplot(111)
        ax.plot(x, y)
        if figname is not None:
            ppl.savefig(figname)
        # Always return fig. If user only wants it saved to disk, then they
        # don't have to assign the function output to anything when calling foo.
        return fig