jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

[NbConvertApp] WARNING | Config option `kernel_spec_manager_class` not recognized by `NbConvertApp` or How to version control notebooks #1503

Open falconair opened 3 years ago

falconair commented 3 years ago

When trying to clear output of notebooks (to keep git from yelling at me), I get this error: [NbConvertApp] WARNING | Config option kernel_spec_manager_class not recognized by NbConvertApp.

Here is the command I run:

jupyter nbconvert *.ipynb --to notebook --ClearOutputPreprocessor.enabled=True --inplace

Nbconvert version: 6.0.7

This command used to work. A week or two ago, I upgraded my installation to JupyterLab 3.0. I also tried xeus-python, but don't actually use it. As far as I can tell, nbconvert is not related to JL3 or xeus.

ottobricks commented 3 years ago

@falconair, I had the same issue, but then I realized that my command was also trying to convert the .ipynb files in .ipynb_checkpoints/. Excluding this directory solved my problem. I hope it helps.
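
For a recursive cleanup that skips the checkpoint directories, a minimal sketch in Python might look like the following. It is not the nbconvert command from this thread: it edits the notebook JSON directly with the standard library, assumes nbformat v4 notebooks, and the function names clear_outputs and clear_tree are made up for illustration.

```python
import json
import os
from pathlib import Path


def clear_outputs(notebook_path):
    """Strip outputs and execution counts from one notebook file, in place."""
    nb = json.loads(notebook_path.read_text(encoding="utf-8"))
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    text = json.dumps(nb, indent=1, ensure_ascii=False) + "\n"
    notebook_path.write_text(text, encoding="utf-8")


def clear_tree(root="."):
    """Clear every notebook under root, skipping .ipynb_checkpoints."""
    cleaned = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d != ".ipynb_checkpoints"]
        for name in sorted(filenames):
            if name.endswith(".ipynb"):
                path = Path(dirpath) / name
                clear_outputs(path)
                cleaned.append(path)
    return cleaned
```

Calling clear_tree(".") from the top of a lecture directory would rewrite each notebook it finds and return the list of files it touched, leaving anything under .ipynb_checkpoints/ alone.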

falconair commented 3 years ago

@ottok92 How did you get nbconvert to exclude .ipynb_checkpoints? I am trying to recursively clean out the notebooks in my directories after I deliver lectures on a weekly basis. I'd like to avoid doing this one notebook at a time.

ottobricks commented 3 years ago

@falconair, since then I have found a better way to clear outputs before saving a notebook, one that Jupyter itself provides. You will find a configuration file called jupyter_notebook_config.py in your .jupyter/ folder. This folder is usually located in your home directory (~/.jupyter on Linux).

Once you find the file, look for the line with c.ContentsManager.pre_save_hook. Then replace it with the following code block:

def scrub_output_pre_save(model, **kwargs):
    """scrub output before saving notebooks"""
    # only run on notebooks
    if model["type"] != "notebook":
        return
    # only run on nbformat v4
    if model["content"]["nbformat"] != 4:
        return

    for cell in model["content"]["cells"]:
        if cell["cell_type"] != "code":
            continue
        cell["outputs"] = []
        cell["execution_count"] = None

c.ContentsManager.pre_save_hook = scrub_output_pre_save

Let me know if it worked for you. It certainly did for me.

falconair commented 3 years ago

@ottok92, I didn't find the file jupyter_notebook_config.py on my Windows machine, but this URL shows how to generate it: https://jupyter-notebook.readthedocs.io/en/stable/config.html . The command is simply jupyter notebook --generate-config.

There are a couple of issues:

  1. I'd really like to remove output before committing to Git, not necessarily after every save.
  2. I'd like to limit this functionality to certain projects. Putting this in the ~/.jupyter folder seems to imply output will be cleared from all notebooks on my system.

Any ideas on these issues?
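
For the second point, one possible direction (a sketch, not something confirmed in this thread) is to keep the hook in ~/.jupyter but make it check where the notebook is being saved. It assumes Jupyter calls the hook with a path keyword holding the notebook's path relative to the server root. SCRUB_PREFIXES and its example entries are hypothetical.

```python
# Hypothetical path prefixes (relative to the Jupyter root directory)
# whose notebooks should be scrubbed; adjust to your own layout.
SCRUB_PREFIXES = ("lectures/", "projects/course-a/")


def scrub_output_pre_save(model, path="", **kwargs):
    """Scrub outputs before saving, only for notebooks under SCRUB_PREFIXES."""
    # only run on notebooks
    if model["type"] != "notebook":
        return
    # only run on nbformat v4
    if model["content"]["nbformat"] != 4:
        return
    # Assumption: path is an API path that uses forward slashes.
    if not path.lstrip("/").startswith(SCRUB_PREFIXES):
        return

    for cell in model["content"]["cells"]:
        if cell["cell_type"] != "code":
            continue
        cell["outputs"] = []
        cell["execution_count"] = None


# In jupyter_notebook_config.py, where `c` is provided by Jupyter:
# c.ContentsManager.pre_save_hook = scrub_output_pre_save
```

Notebooks saved outside the listed prefixes pass through the hook untouched, so the rest of the system keeps its outputs.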

ottobricks commented 3 years ago

"I'd really like to remove output before committing to Git, not necessarily after every save."

I see. Well, this method does not mean that you will lose the output in a running notebook every time you save. As long as you are connected to the notebook, the session will keep your outputs.

"I'd like to limit this functionality to certain projects. Putting this in the ~/.jupyter folder seems to imply output will be cleared from all notebooks on my system."

Another solution that comes to mind is to use Git hooks to transform files before committing (a pre-commit hook). Honestly, I have started converting all notebooks to Python scripts before committing, which clears the output and is much more useful because you can actually run a Git diff on your code. I also configure a Git hook that runs after a pull (post-merge) to convert all Python scripts under a notebooks/ folder back to notebooks. If you feel like implementing this method, here is what your hooks should be. Note that I have been using Jupytext instead of nbconvert, as it allows for conversions that are portable to other IDEs such as VS Code:

pre-commit:

#!/bin/bash

# Convert every notebook under a notebooks/ directory to a percent-format script.
find . -type f -name "*.ipynb" -path "*/notebooks/*" -exec jupytext --to py:percent {} +

exit 0

post-merge:

#!/bin/bash

# Convert every script under a notebooks/ directory back to a notebook.
find . -type f -name "*.py" -path "*/notebooks/*" -exec jupytext --to notebook {} +

I suggest adding your hooks to a dedicated directory inside your project (instead of the default .git dir) so that it is shared with anybody that clones your repo. Here is more information on how to configure that.
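
As a rough sketch of that setup in Python: it assumes the hooks live in a directory named .githooks (a hypothetical name) and that Git 2.9 or later is installed, since that is when the core.hooksPath setting was introduced. The two helper functions are made up for illustration.

```python
import stat
import subprocess
from pathlib import Path

HOOKS_DIR = ".githooks"  # hypothetical directory name inside the repository


def make_hooks_executable(repo_root=".", hooks_dir=HOOKS_DIR):
    """Mark every file in the shared hooks directory as executable."""
    changed = []
    for hook in sorted((Path(repo_root) / hooks_dir).iterdir()):
        if hook.is_file():
            mode = hook.stat().st_mode
            hook.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
            changed.append(hook.name)
    return changed


def use_shared_hooks(repo_root=".", hooks_dir=HOOKS_DIR):
    """Point this clone at hooks_dir instead of the default .git/hooks."""
    subprocess.run(
        ["git", "config", "core.hooksPath", hooks_dir],
        cwd=repo_root,
        check=True,
    )
```

The hook files travel with the repository, but the core.hooksPath setting is local configuration, so each person who clones the repo would still need to run use_shared_hooks() (or the equivalent git config command) once.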

I understand that this solution goes a little deeper into Git than most people are willing to dive, but once you do it the possibilities are limitless. I hope you take a little time to research any tangential questions that arise, and if they are still unsolved, feel free to share them and I'll try to help you.

ottobricks commented 3 years ago

By the way, I suggest changing the title of this thread so that it can help others. Maybe something like: "How to integrate notebooks with Git" or "How to version control notebooks".

TheaBehrens commented 2 years ago

"@ottok92 How did you get nbconvert to exclude .ipynb_checkpoints? I am trying to recursively clean out the notebooks in my directories after I deliver lectures on a weekly basis. I'd like to avoid doing this one notebook at a time."

Excluding checkpoints worked for me with the following line in my .gitattributes:

.+?(?<!checkpoint)\.ipynb filter=clearoutput

This solves the issue for me.