jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Optionally share a single kernel when converting multiple notebooks #597

Open ivanov opened 7 years ago

ivanov commented 7 years ago

This would be a way to directly solve issues like #574, and ipython/ipython#6018.

At first, I thought the way to do this would be to have nbconvert support something like the -c flag, but that feels a little too Python-centric. (In case, dear reader, you were not already aware, one can do `ipython -c "x=10" -i` to get an interactive IPython shell with x=10 in the namespace from the get-go.)

Here's the kind of use case I think this would support:

```bash
nbconvert --execute --share-kernel DataQueryStep.ipynb Analysis.ipynb Plotting.ipynb --to notebook
```

or

```bash
nbconvert --execute --share-kernel ConfigureSomeStateMachine.ipynb RunMonthlyReport.ipynb
```

There are various partial implementations of this kind of functionality (nbrun, nbparametrize, runipy), but I see no reason why this couldn't be supported in nbconvert itself.

Alternative names for --share-kernel might be --keep-kernel, --reuse-kernel, or something else.

minrk commented 7 years ago

Sounds like a good idea.

mpacer commented 7 years ago

Had a conversation with @ivanov that was super illuminating.

I have a couple of reservations about implementing it as is.

Right now, notebooks are intended to be executed as complete, self-contained files. They may not always run successfully, but (generally speaking) there is no way to write a notebook file that executes properly in one context and fails in another. In particular, you cannot have a notebook that only works from the command line but would surely fail in an interactive session.

Once you create these notebooks (especially Analysis.ipynb in the example above), it's going to be hard to edit them, or at least hard to edit them while also executing your code. You might be able to get around some of this with something like the %run -i magic, as sketched below, but then that becomes part of your notebook, and it could break the chain of execution if the parent notebook (e.g., DataQueryStep.ipynb) has side-effects.
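For example, such a workaround might put a cell like this at the top of Analysis.ipynb (a sketch, assuming an IPython version whose %run magic accepts .ipynb files):

```python
# Hypothetical first cell of Analysis.ipynb.
# %run -i executes the target in the *current* namespace, so names
# defined in DataQueryStep.ipynb become available here. Note that this
# re-executes the parent notebook, so any side-effects it has (file
# writes, network calls, ...) happen again on every run.
%run -i DataQueryStep.ipynb
```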

Beyond editing these non-self-standing notebooks after they're created, creating them in the first place would be challenging. One approach would be to start with one giant notebook (so that the state is already shared) and then manually "split" it by cutting and pasting cells into separate files. But that's a pretty tedious process, especially since it leaves you with notebooks that are that much more difficult to edit.

I don't think this is a bad idea; I just think we're going to need to be careful about how exactly we implement it. Specifically, I think we need a story for how this kind of feature bridges the interactive use case and the CLI use case.

ivanov commented 7 years ago

Before I forget: the way one can work with these split-up workflow notebooks by themselves is by switching the kernel that a notebook is hooked up to. In JupyterLab, at least, I can open DataQueryStep.ipynb and run all cells, then open Analysis.ipynb and switch its kernel to the one already running for DataQueryStep.ipynb.
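(The same kernel-sharing trick is available outside the notebook UI as well; for instance, a console can attach to the most recently started kernel:)

```bash
# Attach an interactive console to the most recently started kernel;
# --existing can also be given a specific connection file.
jupyter console --existing
```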

itcarroll commented 5 years ago

Is there a way to do this currently using the nbconvert python API, i.e. some configuration for nbconvert.preprocessors.ExecutePreprocessor? Given the support in Jupyter for sharing kernels, I am surprised this is not similarly easy through nbconvert.
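For concreteness, the API in question is normally driven one notebook at a time, with a fresh kernel per call (a minimal sketch; file names are illustrative):

```python
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Standard usage today: each preprocess() call launches its own kernel.
nb = nbformat.read("Analysis.ipynb", as_version=4)
ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
ep.preprocess(nb, {"metadata": {"path": "."}})
nbformat.write(nb, "Analysis.executed.ipynb")
```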

MSeal commented 5 years ago

Late response, but no, there's currently no support for this in nbconvert. The library launches an isolated kernel process and manages it itself. Keeping a live kernel around and hitting it repeatedly for partial or complete executions starts to look a lot like a Jupyter server if you extend the capability, which is one of the reasons it hasn't been explored much for nbconvert. It would be possible to give nbconvert a different preprocessor that takes in live kernel information instead of launching a kernel, but that doesn't exist today.
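A rough sketch of that direction, assuming a recent nbconvert whose ExecutePreprocessor.preprocess() accepts an externally managed kernel via its km argument (untested, and teardown behavior varies across versions):

```python
import nbformat
from jupyter_client import KernelManager
from nbconvert.preprocessors import ExecutePreprocessor

# One kernel, owned by this script rather than by nbconvert.
km = KernelManager(kernel_name="python3")

ep = ExecutePreprocessor(timeout=600)
for path in ["DataQueryStep.ipynb", "Analysis.ipynb", "Plotting.ipynb"]:
    nb = nbformat.read(path, as_version=4)
    # Passing km= makes the preprocessor execute against our kernel
    # instead of launching (and tearing down) a fresh one per notebook.
    ep.preprocess(nb, {"metadata": {"path": "."}}, km=km)
    nbformat.write(nb, path.replace(".ipynb", ".executed.ipynb"))

km.shutdown_kernel()
```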

marius311 commented 3 years ago

Obviously this issue is super old, but I just want to add that something like this could be pretty invaluable for Julia notebooks, where startup is often really costly due to JIT overhead; subsequent executions could be greatly sped up by reusing the same kernel.

MSeal commented 3 years ago

Take a look at https://github.com/nteract/papermill/issues/583, where someone used a custom engine to reuse a kernel across executions. It might fit what you're trying to do, @marius311.

alonme commented 2 months ago

@MSeal Hey, just wanted to ping and check whether there has been any progress in this area, since a couple of years have passed.

Thanks!