jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Running a list of notebooks via CLI doesn't find the correct kernel for each notebook #1235

Open sgibson91 opened 4 years ago

sgibson91 commented 4 years ago

Hi 👋

I'm trying to set up nbconvert in my CI pipeline to detect when my notebooks break. I have a .tests subdirectory containing two notebooks: one uses the Python kernel and the other uses the R kernel.

The command I'd like to execute in my CI is the following:

jupyter nbconvert --execute .tests/*.ipynb

However, the bug I'm seeing is that nbconvert tries to execute both notebooks with the same kernel. Which kernel it chooses seems to depend on which notebook it converts first.

For example:

# nbconvert tries to run both notebooks with the python3 kernel
jupyter nbconvert --execute .tests/test-python-notebook.ipynb .tests/test-r-notebook.ipynb
[NbConvertApp] Converting notebook .tests/test-python-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] Writing 319395 bytes to .tests/test-python-notebook.html
[NbConvertApp] Converting notebook .tests/test-r-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] ERROR | Error while converting '.tests/test-r-notebook.ipynb'
Traceback (most recent call last):
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 410, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 179, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 197, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/html.py", line 95, in from_notebook_node
    return super(HTMLExporter, self).from_notebook_node(nb, resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/templateexporter.py", line 307, in from_notebook_node
    nb_copy, resources = super(TemplateExporter, self).from_notebook_node(nb, resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 139, in from_notebook_node
    nb_copy, resources = self._preprocess(nb_copy, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 316, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 448, in preprocess_cell
    raise CellExecutionError.from_cell_and_msg(cell, out)
nbconvert.preprocessors.execute.CellExecutionError: An error occurred while executing the following cell:
------------------
version
------------------

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-605b5d1778ad> in <module>
----> 1 version

NameError: name 'version' is not defined
NameError: name 'version' is not defined

# nbconvert tries to run both notebooks with the ir kernel
jupyter nbconvert --execute .tests/test-r-notebook.ipynb .tests/test-python-notebook.ipynb
[NbConvertApp] Converting notebook .tests/test-r-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: ir
[NbConvertApp] Writing 273969 bytes to .tests/test-r-notebook.html
[NbConvertApp] Converting notebook .tests/test-python-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: ir
[NbConvertApp] ERROR | Error while converting '.tests/test-python-notebook.ipynb'
Traceback (most recent call last):
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 410, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 179, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 197, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/html.py", line 95, in from_notebook_node
    return super(HTMLExporter, self).from_notebook_node(nb, resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/templateexporter.py", line 307, in from_notebook_node
    nb_copy, resources = super(TemplateExporter, self).from_notebook_node(nb, resources, **kw)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 139, in from_notebook_node
    nb_copy, resources = self._preprocess(nb_copy, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 316, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/Users/sgibson/anaconda3/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 448, in preprocess_cell
    raise CellExecutionError.from_cell_and_msg(cell, out)
nbconvert.preprocessors.execute.CellExecutionError: An error occurred while executing the following cell:
------------------
# Import packages
import numpy as np
import pandas as pd
import scipy as sp
import matplotlib
import matplotlib.pyplot as plt
from scipy.stats import linregress
------------------

Error in parse(text = x, srcfile = src): <text>:2:8: unexpected symbol
1: # Import packages
2: import numpy
          ^
Traceback:

ERROR: Error in parse(text = x, srcfile = src): <text>:2:8: unexpected symbol
1: # Import packages
2: import numpy
          ^

However when the notebooks are run separately, everything works as expected.

# ir kernel
jupyter nbconvert --execute .tests/test-r-notebook.ipynb
[NbConvertApp] Converting notebook .tests/test-r-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: ir
[NbConvertApp] Writing 273969 bytes to .tests/test-r-notebook.html

# python3 kernel
jupyter nbconvert --execute .tests/test-python-notebook.ipynb
[NbConvertApp] Converting notebook .tests/test-python-notebook.ipynb to html
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] Writing 319395 bytes to .tests/test-python-notebook.html
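
As a possible stopgap for CI, each notebook could be given its own nbconvert process, mirroring the separate runs above. This is only a sketch: `build_commands` and `run_all` are hypothetical helper names, and it assumes `jupyter` is on the PATH and that nbconvert exits with a non-zero status when execution fails.

```python
import subprocess
from pathlib import Path


def build_commands(directory):
    """Return one nbconvert command per notebook found directly in `directory`."""
    notebooks = sorted(Path(directory).glob("*.ipynb"))
    return [["jupyter", "nbconvert", "--execute", str(nb)] for nb in notebooks]


def run_all(directory):
    """Run each notebook in its own nbconvert process; return the commands that failed."""
    failures = []
    for cmd in build_commands(directory):
        # A separate process per notebook, so the kernel is looked up
        # from that notebook's own metadata each time.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Assumption: a non-zero exit status signals a failed conversion.
            failures.append(cmd)
    return failures


# Example use in a CI step (hypothetical):
# failed = run_all(".tests")
# raise SystemExit(1 if failed else 0)
```

Collecting failures instead of stopping at the first one means a single CI run reports every broken notebook.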

So nbconvert clearly identifies the correct kernel from each notebook's metadata when the notebooks are converted individually. I expected that, when passed a list of notebooks, nbconvert would identify and use the kernel requested by each notebook in turn, rather than sticking with the kernel of the first notebook.
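
To make the per-notebook kernel lookup concrete, here is a minimal sketch that reads the kernel name a notebook requests from its `metadata.kernelspec.name` field and groups notebooks by that name. The helpers `kernel_name` and `group_by_kernel` are hypothetical names, not part of nbconvert, and the sketch reads the notebook JSON directly rather than going through nbformat.

```python
import json
from collections import defaultdict


def kernel_name(notebook_path):
    """Return metadata.kernelspec.name from a notebook file, or None if it is absent."""
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)
    return nb.get("metadata", {}).get("kernelspec", {}).get("name")


def group_by_kernel(notebook_paths):
    """Map each requested kernel name to the list of notebooks that request it."""
    groups = defaultdict(list)
    for path in notebook_paths:
        groups[kernel_name(path)].append(path)
    return dict(groups)
```

Each group shares a single kernel, so passing one group at a time to nbconvert should avoid mixing Python and R notebooks in the same invocation.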

Any help you can offer on this topic would be really appreciated! ✨

My CI config: https://github.com/alan-turing-institute/bridge-data-environment/blob/master/.github/workflows/repo2docker-pull-requests.yml
PR with broken CI due to this behaviour: https://github.com/alan-turing-institute/bridge-data-environment/pull/25

Nbconvert version: 5.6.1

MSeal commented 4 years ago

This might be a good issue to move over to nbclient, since the notebook execution logic now resides there for the upcoming nbconvert release. However, even when proxying to nbclient, nbconvert should either raise an exception or launch a new kernel for each notebook. I'll mark this as a bug to look into. Thanks for raising it.