Open drammock opened 1 year ago
Does this also include plots appearing twice (if show=True
) or once (if show=False
) due to Jupyter (or any interactive Python interpreter) automatically outputting any object's repr?
Does this also include plots appearing twice (if
show=True
) or once (ifshow=False
) due to Jupyter (or any interactive Python interpreter) automatically outputting any object's repr?
fig = plt.figure()
print(1)
still shows the figure fig
even though the last form is not the figure.
plt.show
is the matplotlib-equivalent (and what we effectively call under the hood) so I actually still prefer our current name / -1 on changing it
Does this also include plots appearing twice (if
show=True
) or once (ifshow=False
) due to Jupyter (or any interactive Python interpreter) automatically outputting any object's repr?
underlyingly these are related but distinct problems I think.
In the looping-over-axes case, a figure has been created, then after each subplot is filled in with data it is asked to be shown via plt.show()
, so show=False
prevents showing the intermediate/incomplete figures. Then at the end of the cell, all figures generated in that cell are shown by Jupyter/ipympl, which in the background is keeping track of all the figures and showing them by calling their _repr_html_
method (technically the _repr_html_
method of the figure canvas, IIUC). That is a core behavior of notebooks and is the thing we can't change. So show=False
yields 1 figure.
In the plots-appearing-twice-because-repr case, one plot is the normal Python/IPython behavior of showing the repr
(or here, _repr_html_
) of each expression (or in the notebook: only the last expression of the cell); assigning to a variable or adding a trailing semicolon will supress that repr
(but passing show=False
doesn't actually fix this problem, see example). The second fig is again the notebook keeping track of all figures and showing all of them.
If you put this in a notebook cell you still get 2 figures:
import mne
sample_data_folder = mne.datasets.sample.data_path()
sample_data_raw_file = (sample_data_folder / 'MEG' / 'sample' /
'sample_audvis_raw.fif')
raw = mne.io.read_raw_fif(sample_data_raw_file, verbose=False, preload=False)
raw.crop(tmax=10).pick('eeg').load_data()
raw.plot_psd(show=False) # <- doesn't work for the double-repr case! must use `;` or assign to var
plt.show
is the matplotlib-equivalent (and what we effectively call under the hood) so I actually still prefer our current name / -1 on changing it
Keeping things as-is / closing as "not broken, won't fix" seems like a non-starter to me. It requires that users be familiar with details of how matplotlib works. From a dev perspective it makes sense --- show
parameter means "should I call plt.show
?" But if a user comes to MNE-Python without already knowing matplotlib, it's reasonable that they'd think "show=False
should suppress the figure". Right now the documentation for the show
parameter for plot_psd
literally just says "Show the figure if True" so I think there's a pretty big explanatory gap to overcome.
Improving those docs would help, but it seems likely that no matter how clearly we document that "show=False means show 0 figs in interactive console sessions and 1 fig in notebook for-loops and 2 figs in notebooks if it's the last line in the cell and you don't assign to a var or use a semicolon", there will always be some users asking questions because they didn't understand it (or didn't read it at all). One way to forestall that is to make the param name a bit more reflective of what it's doing --- people probably have fewer preconceptions about what delay_plotting
should do compared to just show
(hopefully they'll think: "Delay until when? I better read this param description to find out..."). Probably there's an even better name that I haven't thought of, or some other approach besides parameter migration that would work better? I'm all ears, but I think we have to do something.
Keeping things as-is / closing as "not broken, won't fix" seems like a non-starter to me. It requires that users be familiar with details of how matplotlib works...
I think that's true of our plotting functions, though, and it's a reasonable requirement / part of us using matplotlib that users will need to understand (or learn) a bit about it and how it behaves. As far as the show
kwarg is concerned, we are basically using matplotlib how it's designed AFAICT, so to me this is a matplotlib problem and should be fixed there. My preferred option is thus to document better that show=True
means "call show at the end", at least to start.
To illustrate this part of the issue, in #11513 I just used the code fig = plt.subplots()
, and this produced the behavior that the user did not expect (i.e., that the inline plot is shown). So I don't think we should fight this battle at the MNE end... maybe mpl needs a context manager to delay such showing or something, not sure.
Improving those docs would help, but it seems likely that no matter how clearly we document that "show=False means show 0 figs in interactive console sessions and 1 fig in notebook for-loops and 2 figs in notebooks if it's the last line in the cell and you don't assign to a var or use a semicolon", there will always be some users asking questions because they didn't understand it (or didn't read it at all).
From a practical standpoint, if we document these behaviors nicely in some tutorial or doc then we can quickly point them there. It seems like no matter what we have to to document the various potentially odd-seeming behaviors of matplotlib, and to me avoiding them should happen with some matplotlib rather than MNE magic/context manager etc.
But if we absolutely need to change some behavior in MNE, I'm not sure we gain too much by changing the argument name. We could add show='delayed'
(to magically fix the plt.subplots()
always being shown?) or show='never'
(to magically fix the _repr_html_
issue?) or whatever if we needed to, rather than adding a new param. This does ultimately have to do with how/when the figure is displayed/shown, so the name isn't so bad.
From a practical standpoint, if we document these behaviors nicely in some tutorial or doc then we can quickly point them there. It seems like no matter what we have to to document the various potentially odd-seeming behaviors of matplotlib, and to me avoiding them should happen with some matplotlib rather than MNE magic/context manager etc.
But if we absolutely need to change some behavior in MNE, I'm not sure we gain too much by changing the argument name. We could add
show='delayed'
(to magically fix theplt.subplots()
always being shown?) orshow='never'
(to magically fix the_repr_html_
issue?) or whatever if we needed to, rather than adding a new param. This does ultimately have to do with how/when the figure is displayed/shown, so the name isn't so bad.
This is starting to sound like a better proposal than what I came up with. I'll stew on it for a few days and let others chime in too.
I think we should not abstract away how other packages work. We should expect users to have at least a very basic understanding of Matplotlib, and the same is true for notebooks. If someone wants to work in a notebook, it is not our task to make everything work "magically". For example, values are printed automatically (as HTML), which we cannot change - that's just how it is. Instead, I think we should improve our onboarding docs and maybe include very short Matplotlib and notebook primers (and then refer to the corresponding docs).
Having said that, I like adding new behavior to an existing parameter - this seems to be a very nice solution which doesn't break any old code and hopefully makes things a bit clearer for future code.
@drammock
The way I remember the problem with looping over axes when show=True
(the default of plot_topomap for example) is that in environment like Spyder with matplotlib inline mode the user will see a figure with all axes blank except the first one. This can be especially puzzling for new users because they would have to know about show
argument to get their intended figure, and before they learn this (or switch to the interactive mode) the could spend considerable time trying to figure out what they are doing wrong. So, my general perspective is that the default of show=True
causes more problems than benefits and I would prefer show=False
was the default, at least for axis-level figures. If what @larsoner suggest is similar to this, then I'm +1 for the change!
Many times (most recently in #11513) users complain / get confused about plot behavior in Jupyter Notebooks, specifically the problem of figures getting "shown" too early and having empty subplot axes (typically when pre-generating a figure w/ subplots and looping over the subplot axes). Most of our plotting functions have a
show=False
option which will prevent the problem, but the parameter name is really confusing and I don't think it's very well documented how it should be used/interpreted in a notebook context.One possible option would be to change the parameter to something like
delay_plotting=True
so that its name better reflects what is common to notebooks and interactive console sessions --- namely that the figure won't be shown immediately when it's created. Obviously, removingshow
will break a lot of user code, so I'd be inclined to retainshow
as a@legacy
alias for the new param (and warn or error if both params are provided?)I invite other suggestions of how to address this problem.
cross-ref to #11165 where a lot of related discussion can be found
cc @sam-s