nteract / papermill

📚 Parameterize, execute, and analyze notebooks
http://papermill.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
nbformat 5.1.2 and 5.1.3 cause AttributeError: 'NoneType' object has no attribute 'cells' #769

Closed. kks32 closed this issue 10 months ago

kks32 commented 10 months ago

🐛 Upgrading the notebook to v4 before writing causes AttributeError: 'NoneType' object has no attribute 'cells'

The fix for issue #691, in commit https://github.com/nteract/papermill/commit/586876597c77fa685f88345ce336b0dadfbea787, upgrades the notebook to nbformat v4. Using git blame, I narrowed the regression down to this commit: it causes the "no attribute 'cells'" error with the older nbformat versions 5.1.2 and 5.1.3, both of which satisfy the constraint in requirements.txt (nbformat >= 5.1.2).

Input: pm.execute_notebook("a.ipynb", "b.ipynb", parameters=dict(a="Jayu"))

Error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [4], in <module>
----> 1 pm.execute_notebook("a.ipynb", "b.ipynb", parameters=dict(a="Jayu"))

File /opt/conda/lib/python3.9/site-packages/papermill/execute.py:99, in execute_notebook(input_path, output_path, parameters, engine_name, request_save_on_cell_execute, prepare_only, kernel_name, language, progress_bar, log_output, stdout_file, stderr_file, start_timeout, report_mode, cwd, **engine_kwargs)
     97         if p not in parameter_predefined:
     98             logger.warning(f"Passed unknown parameter: {p}")
---> 99     nb = parameterize_notebook(
    100         nb,
    101         parameters,
    102         report_mode,
    103         kernel_name=kernel_name,
    104         language=language,
    105         engine_name=engine_name,
    106     )
    108 nb = prepare_notebook_metadata(nb, input_path, output_path, report_mode)
    109 # clear out any existing error markers from previous papermill runs

File /opt/conda/lib/python3.9/site-packages/papermill/parameterize.py:102, in parameterize_notebook(nb, parameters, report_mode, comment, kernel_name, language, engine_name)
     99     newcell.metadata['jupyter'] = newcell.get('jupyter', {})
    100     newcell.metadata['jupyter']['source_hidden'] = True
--> 102 param_cell_index = find_first_tagged_cell_index(nb, 'parameters')
    103 injected_cell_index = find_first_tagged_cell_index(nb, 'injected-parameters')
    104 if injected_cell_index >= 0:
    105     # Replace the injected cell with a new version

File /opt/conda/lib/python3.9/site-packages/papermill/utils.py:102, in find_first_tagged_cell_index(nb, tag)
     87 """Find the first tagged cell ``tag`` in the notebook.
     88 
     89 Parameters
   (...)
     99     Whether the notebook contains a cell tagged ``tag``?
    100 """
    101 parameters_indices = []
--> 102 for idx, cell in enumerate(nb.cells):
    103     if tag in cell.metadata.tags:
    104         parameters_indices.append(idx)

AttributeError: 'NoneType' object has no attribute 'cells'
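
The traceback shows that nb is already None by the time parameterize_notebook receives it. As a rough illustration only, and assuming the v4 upgrade step is what hands back None on the affected nbformat versions (the report above points at that commit but does not show the upgrade call itself), a defensive wrapper could keep the original notebook whenever the upgrade returns nothing. The names upgrade_or_keep and in_place_upgrade are hypothetical and are not part of papermill or nbformat.

```python
from types import SimpleNamespace


def upgrade_or_keep(nb, upgrade_fn):
    """Apply upgrade_fn to nb, keeping the original object if it returns None."""
    upgraded = upgrade_fn(nb)
    return nb if upgraded is None else upgraded


def in_place_upgrade(nb):
    # Hypothetical stand-in for an upgrade routine that mutates the
    # notebook in place and returns None instead of the notebook.
    nb.nbformat_minor = 5
    return None


# A minimal notebook-like object with the attributes the traceback touches.
nb = SimpleNamespace(cells=[], nbformat=4, nbformat_minor=4)

# A plain `nb = in_place_upgrade(nb)` would rebind nb to None here;
# the wrapper returns the original (now mutated) object instead.
result = upgrade_or_keep(nb, in_place_upgrade)
print(len(result.cells))
```

With a guard like this, code that later reads the notebook's cells would still receive a notebook object even when the upgrade function returns None.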

Suggested fix:

Raise the minimum nbformat version to 5.2.0.

nbformat 5.1.2 has been yanked on PyPI (https://pypi.org/project/nbformat/5.1.2/) with the reason "Name generation process created inappropriate id values". nbformat 5.1.3 also has the same issue. From nbformat 5.2.0 onwards, the change that upgrades the notebook to v4 works.
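
To check whether an environment falls below the suggested minimum, something like the following sketch may help. It uses only the standard library; the helper names (release_tuple, below_minimum, installed_nbformat_below_minimum) are made up for this example, and the parser deliberately ignores pre-release suffixes such as "rc1", so treat it as an approximation rather than a full version comparison.

```python
from importlib.metadata import PackageNotFoundError, version

# First nbformat release reported to work in this issue.
MIN_WORKING = (5, 2, 0)


def release_tuple(version_string):
    """Parse the leading numeric 'X.Y.Z' part of a version string."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as "5.2" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def below_minimum(version_string):
    """Return True if the version is older than the suggested minimum."""
    return release_tuple(version_string) < MIN_WORKING


def installed_nbformat_below_minimum():
    """Check the installed nbformat; return None if it is not installed."""
    try:
        return below_minimum(version("nbformat"))
    except PackageNotFoundError:
        return None


for candidate in ("5.1.2", "5.1.3", "5.2.0"):
    print(candidate, below_minimum(candidate))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "5.10.0" would sort before "5.2.0".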

With nbformat 5.2.0 or later, the same call produces the expected output: Executing: 100% 4/4 [00:01<00:00, 4.37cell/s]