Converting notebook to .py script fails when using %matplotlib inline - known issue

JamiesHQ commented 7 years ago

This is a known Jupyter issue: https://github.com/jupyter/nbconvert/issues/503, https://github.com/jupyter/nbconvert/pull/507

Cell magics are not supported by Python, only by IPython. The Jupyter team is working on the machinery to make this work.

@hlapp

hlapp commented 7 years ago

Just to clarify for those coming across this issue, the failure isn't with the conversion itself (which does not fail, and converts cell magics to python function calls), but with using python to execute the result of the conversion.

mpacer commented 7 years ago

All ipython magics that are in a notebook are converted to a call to get_ipython() which will be executable using ipython and not python. There is no plan to change that as far as I can tell because it's behaving as expected.

The problem with %matplotlib inline is that it is explicitly a backend that allows you to view images in the notebook, so it has no way to be interpreted if run as part of a script.

It should be mentioned that if students are using matplotlib 2.0+ they shouldn't need to explicitly invoke a backend when executed from a notebook (I believe it defaults to matplotlib) and so this shouldn't cause any problems.

hlapp commented 7 years ago

All ipython magics that are in a notebook are converted to a call to get_ipython() which will be executable using ipython and not python.

For the record, that didn't work either for the workshop we just held. (Using the Anaconda installer etc.)

There is no plan to change that as far as I can tell because it's behaving as expected.

I humbly submit that this be reconsidered. An extracted Python script should run as any Python script would - under Python. It would be very confusing to have to teach students that yes, the Python extractor extracts Python, but it's not 'real' Python but 'iPython'-Python.

Perhaps this is totally fine for the developers. However, as a teacher I would then simply not teach Python extraction, and refer students who ask about it anyway (which they'll almost certainly do) to the NBconvert developers.

The problem with %matplotlib inline is that it is explicitly a backend that allows you to view images in the notebook, so it has no way to be interpreted if run as part of a script.

That's not entirely true. Obviously, when converting to either Markdown or HTML the images do get created just fine (using default choices that would work just as well for a Python script), so what to do with them if there is no backend at the time of viewing has already been solved.

It should be mentioned that if students are using matplotlib 2.0+ they shouldn't need to explicitly invoke a backend when executed from a notebook (I believe it defaults to matplotlib) and so this shouldn't cause any problems.

I can't even tell myself which version of matplotlib I'm using, let alone controlling which one I'm using. So again, this may be fine for developers, but it's a non-starter when teaching unless it's guaranteed to work through the installer.

choldgraf commented 7 years ago

@hlapp my intuition is that these shortcuts etc are meant more for developers as shortcuts to make things faster for doing interactive computing. I agree totally that it's really annoying when you have to call matplotlib inline in each notebook, which will then fail if you try to run it in a non-interactive setting.

@mpacer I just tried opening a new notebook and running a simple viz plot, and it doesn't seem like inline is active by default.

mpacer commented 7 years ago

@choldgraf I was basing that off of: https://github.com/jupyter/nbconvert/pull/507#issuecomment-271321734

from @takluyver

Once matplotlib 2 comes out (which will be any day now, as it has been for a year or so ;-), it should no longer be necessary to invoke %matplotlib inline, because it will pick a suitable default backend when you load matplotlib in the kernel. Of course, lots of people will still do it because they're used to it, but it lowers the priority a little.

Interesting. @carreau & @NelleV, could you explain where my confusion lies?

choldgraf commented 7 years ago

I've just been going through that thread - going to take discussion of this issue over there because I think it'll be more actionable there!

takluyver commented 7 years ago

I think I had overlooked a detail: with mpl 2, it detects that it's in an IPython kernel and uses the inline backend by default, but it doesn't set up our post-execute hook which displays any unclosed mpl figures. So it works if you display a figure object by normal means, but it doesn't automatically show up whenever you call plotting functions.

choldgraf commented 7 years ago

ah I see...so what is the preferred user behavior in this case? To just put fig or call plt.show at the end of any cells with plots? This seems very clunky...

takluyver commented 7 years ago

(That was informed guesswork, btw - don't take it as gospel)

We kind of subverted our normal display model to make using mpl more convenient, and what you're seeing now is more like the general display system.

I'm not really sure what recommended behaviour is in this case. For now, I'd put fig or plt.show() at the end. Maybe mpl's ion() function will work? You might get multiple renderings, though, I'm not sure. We may need some new functionality to support it more cleanly.

mpacer commented 7 years ago

@hlapp There are a number of different issues here, I'll try to tackle them in different responses to keep them well contained.

All ipython magics that are in a notebook are converted to a call to get_ipython() which will be executable using ipython and not python.

For the record, that didn't work either for the workshop we just held. (Using the Anaconda installer etc.)

The statement "All ipython magics that are in a notebook are converted to a call to get_ipython(), which will be executable using ipython and not python" is true, but it is not true that every call to get_ipython() will work when executed through ipython; the case in point is the %matplotlib inline magic. It fails when executed with either python or ipython, but for different reasons. Under python, get_ipython is not in the namespace, so the call itself raises a NameError. Under ipython, get_ipython does execute, but the 'inline' backend declaration is rejected.

See the following abbreviated tracebacks (on a notebook with only one cell with %matplotlib inline).

For python

$ python Untitled1.py
Traceback (most recent call last):
  File "Untitled1.py", line 6, in <module>
    get_ipython().magic('matplotlib inline')
NameError: name 'get_ipython' is not defined

For ipython

$ ipython Untitled1.py
---------------------------------------------------------------------------
UnknownBackend                            Traceback (most recent call last)
/~/Downloads/Untitled1.py in <module>()
      4 # In[1]:
      5
----> 6 get_ipython().magic('matplotlib inline')
      7
      8

… 

UnknownBackend: No event loop integration for 'inline'. Supported event loops are: qt, qt4, qt5, gtk, gtk2, gtk3, tk, wx, pyglet, glut, osx
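A minimal sketch of a guard that would let an exported script survive both failures above. The helper name `try_line_magic` is hypothetical, and it assumes that silently skipping the magic is acceptable when it cannot run. It calls `run_line_magic` on the shell object returned by `get_ipython()`, not the `.magic(...)` form shown in the tracebacks.

```python
def try_line_magic(name, line=""):
    """Run an IPython line magic if possible; return True only if it ran."""
    try:
        shell = get_ipython()  # only defined when IPython is running the code
    except NameError:
        return False  # plain `python`: no IPython machinery at all
    if shell is None:
        return False
    try:
        shell.run_line_magic(name, line)
    except Exception:
        # e.g. UnknownBackend for 'inline' when run as `ipython script.py`
        return False
    return True


if not try_line_magic("matplotlib", "inline"):
    print("inline backend not available; continuing without it")
```

Inside a notebook the magic would run as usual; under `python` or `ipython` on the command line the script just carries on without it.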

mpacer commented 7 years ago

@hlapp

There is no plan to change that as far as I can tell because it's behaving as expected.

I humbly submit that this be reconsidered. An extracted Python script should run as any Python script would - under Python. It would be very confusing to have to teach students that yes, the Python extractor extracts Python, but it's not 'real' Python but 'iPython'-Python.

Magics have no meaning outside of ipython, so if you are using them, the only way to convert them to a script is through the get_ipython call, which requires using ipython to execute the script. We can change the exporter name to "ipython" (instead of "python") if you would prefer; I see why the current name could cause confusion.

@takluyver @carreau @minrk Why is it called a python exporter when (if you use ipython magics) it really is an ipython exporter?

choldgraf commented 7 years ago

ok so I can confirm that calling plt.ion when using MPL 2.0 causes the same behavior as calling matplotlib inline:

@hlapp what I think we should do in the short term is ensure that students are using MPL 2.0 for the classes (this should be the default for new anaconda / enthought installs anyway, and is up on pip, so if we tell students to do pip install --upgrade matplotlib it should work). Then, we should make sure that plt.ion() gets called at the beginning of the notebook. This will cause plots to be auto-shown without needing any of the ipython-specific syntax like %matplotlib inline.

If for some reason a student can't use MPL 2.0, then we'd just need to tell them to use %matplotlib inline and comment this out before exporting to a python script.

@takluyver or @mpacer do you see a problem with this strategy?

takluyver commented 7 years ago

Why is it called a python exporter when (if you use ipython magics) it really is an ipython exporter?

Because many notebooks don't use magics, in which case it is, for practical purposes, a Python exporter. And if you're exporting to a script, you're probably doing so to run it with python. We don't have a good answer for what to do with magics - the code may not make sense with all magics removed, but equally there are some that make no sense outside of an interactive environment (e.g. %load). So I'd advise people to remove/replace them manually if they want to convert a notebook to a .py script.
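As a rough sketch of what "remove/replace them manually" could look like when done on the notebook before export: the function below comments out line magics in a notebook that has been loaded as a plain dict. The name `comment_out_line_magics` is hypothetical, the nbformat v4 layout (`cells`, `cell_type`, `source`) is assumed, and cells containing `%%` cell magics are deliberately left alone because they need a human decision. It is a text-level heuristic, so a continuation line that happens to start with the `%` operator would be commented out too.

```python
import json


def comment_out_line_magics(notebook):
    """Return a copy of a notebook dict with %line magics commented out."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # nbformat allows `source` to be one string or a list of lines.
        as_string = isinstance(source, str)
        lines = source.splitlines(keepends=True) if as_string else list(source)
        stripped = [line.lstrip() for line in lines]
        if any(s.startswith("%%") for s in stripped):
            continue  # cell magic: needs a manual rewrite, leave untouched
        lines = ["# " + line if s.startswith("%") else line
                 for line, s in zip(lines, stripped)]
        cell["source"] = "".join(lines) if as_string else lines
    return cleaned


demo = {"cells": [{"cell_type": "code",
                   "source": ["%matplotlib inline\n", "x = 1\n"]}]}
print(comment_out_line_magics(demo)["cells"][0]["source"])
```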

ok so I can confirm that calling plt.ion when using MPL 2.0 causes the same behavior as calling matplotlib inline:

Have you checked with multiple plotting commands in one cell (e.g. add a legend)? That's where I'm concerned you may see the plot multiple times.

choldgraf commented 7 years ago

you get multiple plots shown if there are figures built up in the queue, so to speak, before ion is called.

takluyver commented 7 years ago

OK, that looks like a workable approach at present, then. :-)

choldgraf commented 7 years ago

ok, I'm opening an issue in another repository that actually has ipynb files in it :)

mpacer commented 7 years ago

It would be very confusing to have to teach students that yes, the Python extractor extracts Python, but it's not 'real' Python but '[I]Python'-Python.

Perhaps this is totally fine for the developers. However, as a teacher I would then simply not teach Python extraction, and refer students who ask about it anyway (which they'll almost certainly do) to the NBconvert developers.

Magic commands require the IPython environment because they act as simpler interfaces to much more complicated commands. So it is not that nbconvert doesn't extract "classic" Python (which I prefer to "real", feel free to substitute while reading), but that in using an IPython magic, you are writing code that needs IPython to function. Put simply: if students are not writing "classic" Python, there is no way to extract "classic" Python scripts.

I don't fully understand why the solution would not be to teach students that if they use commands that require the ipython machinery to run in the notebook that they need to use ipython to execute the resulting script. If they do not use any specific ipython magics, then the python extraction should work as you expect and produce a "classic" Python script.

For getting around this problem, you could instead teach them to use one of the other methods for specifying a matplotlib backend (see this link). However, that still would not get around the fact that inline backend is not valid when running a command as a script.

mpacer commented 7 years ago

The problem with %matplotlib inline is that it is explicitly a backend that allows you to view images in the notebook, so it has no way to be interpreted if run as part of a script.

That's not entirely true. Obviously, when converting to either Markdown or HTML the images do get created just fine (using default choices that would work just as well for a Python script), so what to do with them if there is no backend at the time of viewing has already been solved.

I'll try to address different parts about this separately.

Obviously, when converting to either Markdown or HTML the images do get created just fine

Unless you pass --execute (or some other indication that the ExecutePreprocessor should be used), the images are not created at the time when you run jupyter nbconvert. Since converting to Markdown or HTML does not itself create the images, I would say it is also not obvious that images are created when converting.

The images that I think you are referring to are the ones visible in the notebook itself as outputs. These are stored as encoded data in the notebook itself (which is why many people clear outputs automatically when using notebooks with version control).

Because they are already available in the outputs, the HTML and Markdown converters do not need to execute the code that would have generated those outputs. They just collect the data from the notebook itself.

The HTML exporter (by default) embeds these images as dataURIs that can be directly passed and do not need to create any side-effect files (see Extracting Figures using the HTML Exporter for a description).

The Markdown exporter has no way (as far as I know) of embedding images as dataURIs and so it creates an accessory folder to store the output figures and includes links to those figures using standard operating system file paths.

In the case where you do want to execute the code and rerun your analyses before exporting, the code would be run through the kernel that the notebook has access to, thereby avoiding any of the issues around backends that are compatible only in an interactive context because they will be run in a (headless) interactive context.

At the time of exporting to a Python script, you could export all of the current outputs using the ExtractOutputPreprocessor that the Markdown and RST exporters use. That would give you access to all of the figures that are currently in the notebook.
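To make the "figures are already stored in the notebook" point concrete, here is a small sketch that pulls the PNG outputs out of a notebook using only the standard library. This is not the ExtractOutputPreprocessor itself, just an illustration of the same idea; the function name `extract_png_outputs` and the file names in the commented usage are hypothetical, and the nbformat v4 layout (base64 text under `outputs[*].data["image/png"]`) is assumed.

```python
import base64


def extract_png_outputs(notebook):
    """Return the decoded image/png payloads stored in a notebook dict."""
    images = []
    for cell in notebook.get("cells", []):
        for output in cell.get("outputs", []):
            payload = output.get("data", {}).get("image/png")
            if payload is None:
                continue
            if isinstance(payload, list):  # multi-line strings may be split
                payload = "".join(payload)
            images.append(base64.b64decode(payload))
    return images


# Hypothetical usage with a notebook saved on disk:
# import json
# with open("analysis.ipynb", encoding="utf-8") as handle:
#     for i, png in enumerate(extract_png_outputs(json.load(handle))):
#         with open(f"figure_{i}.png", "wb") as out:
#             out.write(png)
```

Note that nothing is executed here: if the outputs were cleared before saving, the list simply comes back empty.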

default choices that would work just as well for a Python script

However, from the conversation (and from discussions with @JamiesHQ) I think that that is not what you wanted. What you wanted is the ability to extract a python script that would write out images as files when it is run.

In that context, there are no "default choices that would work just as well for a Python script", because the HTML and Markdown exporters never make that kind of choice (see above), so there is nothing to reuse as a default.

If you want a script to write out files, you need to write that explicitly and that holds whether you do this in the notebook or in a script.
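A minimal sketch of what "write that explicitly" could look like, assuming matplotlib is installed. The Agg backend is chosen here so the script renders straight to a file without needing a display; the file name `squares.png` and the plotted data are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders to files, no window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="y = x^2")
ax.legend()
fig.savefig("squares.png")  # hypothetical output file name
plt.close(fig)
```

The `fig.savefig(...)` call works the same way inside a notebook cell, so the figure-writing code does not have to change when the notebook is exported to a script.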

All nbconvert does is ask whether the notebook has access to the files. If the ExecutePreprocessor is used, it simply executes the code as though it were run in the notebook, which creates the images in question and stores them in the notebook. Then nbconvert finds the images in the notebook and extracts them.

so what to do with them if there is no backend at the time of viewing has already been solved.

The issue is not that there is no backend, but that the declared backend is not available when run from a python script. Matplotlib has default values when no backend is declared, but I believe it will always have a backend (even if just writing files to the file system). However, that is a matplotlib issue so I'm a little fuzzy on the details.

mpacer commented 7 years ago

It should be mentioned that if students are using matplotlib 2.0+ they shouldn't need to explicitly invoke a backend when executed from a notebook (I believe it defaults to matplotlib) and so this shouldn't cause any problems.

I can't even tell myself which version of matplotlib I'm using, let alone controlling which one I'm using. So again, this may be fine for developers, but it's a non-starter when teaching unless it's guaranteed to work through the installer.

If you use

import matplotlib
matplotlib.__version__

that is the best way to track down which version of matplotlib you are using.

If you have used Conda, you can also run conda list. If you use pip you can also run pip list. These are somewhat less reliable from the perspective of the notebook, because the kernel may be tied to a different environment from the one in which you run conda|pip list.
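Since the kernel's environment is the thing that matters, one option is to ask the running interpreter directly. The sketch below assumes Python 3.8+ (for `importlib.metadata`); the helper name `installed_version` is hypothetical. Running it in a notebook cell shows which interpreter the kernel uses and which matplotlib version that interpreter sees, without importing matplotlib.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("interpreter:", sys.executable)
print("matplotlib:", installed_version("matplotlib"))
```

If the interpreter path printed here is not in the environment where you ran conda list or pip list, the two can legitimately disagree.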

If you need to install a new version of a python package you can use pip install -U <package_name> or conda update <package_name>. You can also qualify which version when installing packages (and using conda update) by appending ==, !=, or >= followed by a version number. NB: you will need to put the package spec in quotes if you are going to use >= so that the shell doesn't interpret the > as an output redirection.

So, for example, conda install "matplotlib>=2.0" or pip install -U "matplotlib>=2.0" would both work.

shett044 commented 6 years ago

@JamiesHQ When converting from IPYNB to Python, use the following directions:

Create a template file named "simplepython.tpl". Copy the statements below into it.

{% extends 'python.tpl'%}
## Comments magic statement
{% block codecell %}
{{  super().replace('get_ipython','#get_ipython') if "get_ipython" in super() else super() }}
{% endblock codecell %}

Save simplepython.tpl.
Type in command line:

jupyter nbconvert --to python 'IPY Notebook' --template=simplepython.tpl --stdout
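For anyone who would rather post-process the exported script than maintain a template, here is a rough Python equivalent of the same idea. The function name `comment_out_magic_calls` is hypothetical. Unlike the template's plain string replace, this sketch only touches lines that begin with a `get_ipython()` call, so an assignment such as `x = get_ipython()...` is left as-is, and a magic that was the only statement inside an indented block would still need a manual fix.

```python
import re

# A line whose first non-blank text is a get_ipython() call.
MAGIC_CALL = re.compile(r"^(\s*)(get_ipython\(\))")


def comment_out_magic_calls(script_text):
    """Comment out lines that begin with a get_ipython() call."""
    lines = script_text.splitlines(keepends=True)
    return "".join(MAGIC_CALL.sub(r"\1# \2", line) for line in lines)


exported = (
    "import os\n"
    "get_ipython().magic('matplotlib inline')\n"
    "print(os.getcwd())\n"
)
print(comment_out_magic_calls(exported))
```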

hlapp commented 6 years ago

@shett044 if you choose Export->Python from the Jupyter menu (which I think invokes nbconvert under the hood?), will this kind of template be employed by default?

choldgraf commented 6 years ago

@hlapp nope, this would require us adding the template described above to this repository, then instructing people to convert to python with the command given above. I think a workable solution here is to use plt.ion instead of %matplotlib inline

hlapp commented 6 years ago

I think a workable solution here is to use plt.ion instead of %matplotlib inline

Interesting. If you know how best to work this into the lesson, don't hesitate 😊

choldgraf commented 6 years ago

I suggested how best to do this in this comment above. Can change the notebooks to do this in a couple of weeks, but for now I am on my honeymoon so shouldn't be checking github in the first place :-)

hlapp commented 6 years ago

I suggested how best to do this in this comment above.

Ah yes. Sorry I missed that.

for now I am on my honeymoon so shouldn't be checking github in the first place

Exactly!!! What are you doing here 😃😏

Reproducible-Science-Curriculum / publication-RR-Jupyter

Converting notebook to .py script fails when using %matplotlib inline - known issue #31