jupyter / notebook

Jupyter Interactive Notebook
https://jupyter-notebook.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

"Execution Chains" feature #1343

Open yrevar opened 8 years ago

yrevar commented 8 years ago

Wouldn't it be great if Jupyter could suggest a few common patterns of cell execution sequences? I think it would be very useful, because "run all above/below" often isn't sufficient. For instance, to achieve multiple objectives from the same notebook, I need to toggle various cells manually by commenting them out (another feature request, explained in [1]). I think this could be greatly simplified if we extracted patterns of cell execution from the past usage of the notebook, which I believe should be trivial. Once one or a few patterns are collected, it's just a matter of providing a simple, user-friendly interface to save, select and edit them. For example, it could show these patterns by highlighting the cells involved, or in a separate view with cells connected in chains (an acyclic stack of cells) that the user can play with.

[1] Enabling/disabling cells. This is a very basic feature, and probably a simple one too, so I expect it will be supported in a future release.

Carreau commented 8 years ago

I think we already went down that road on the mailing list a few years back. The answer was, IIRC:

Yes it exists, it's called a programming language.

If you really want that, you should probably use functions and control flow, then loop over the results/parameters you like.

Commenting out cells is a simple if condition; out-of-order execution is simply writing and then calling a function.
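A minimal sketch of that suggestion, assuming a notebook whose cells load some data, optionally normalise it, and then summarise it. The names `load_data`, `normalise`, `summarise` and `run` are hypothetical and only serve as illustration:

```python
# Hypothetical steps that would otherwise live in separate notebook cells.
def load_data(n):
    """Stand-in for a data-loading cell: returns the integers 1..n."""
    return list(range(1, n + 1))


def normalise(values):
    """Stand-in for an optional cell: scale the values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def summarise(values):
    """Stand-in for a reporting cell: return (min, max) of the values."""
    return min(values), max(values)


def run(n, use_normalise):
    """One execution chain; the `if` replaces commenting a cell in or out."""
    values = load_data(n)
    if use_normalise:
        values = normalise(values)
    return summarise(values)


# Loop over the parameters of interest instead of re-running cells by hand.
results = {(n, flag): run(n, flag) for n in (2, 4) for flag in (False, True)}
```

Each key of `results` identifies one combination of parameters, so every variant of the notebook runs in a single pass.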

If you want to create mass reports, you can use things like https://github.com/takluyver/nbparameterise, and/or write frontend extensions that make this easier.

yrevar commented 8 years ago

Thanks for your suggestions @Carreau 👍

I like your quote :) and I agree that one could use very basic functions and control flow to achieve similar results. However, there's a subtle difference between what is proposed and merely relying on the programming language. Many good programmers use smart editors, don't they? This is exactly what I am looking for. Extracting patterns of cell execution can be done with some simple machine learning technique which, if integrated with a nicer user interface, would be a helpful feature for users who are working on a multi-purpose notebook or running various experiments within a single notebook.

An example of a very basic interface would be something like this:

Set of exec. cells most frequently used
1-2-3-5-7-10 - [click here to execute this sequence] [click here to name this sequence]
1-2-3-5-7-11 - [click here to execute this sequence] [click here to name this sequence]
1-2-4-5-7-10 - [click here to execute this sequence] [click here to name this sequence]
1-2-4-5-7-11 - [click here to execute this sequence] [click here to name this sequence]

The final implementation could be way more elegant than this.
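As a rough sketch of how such a list could be derived, plain frequency counting already goes a long way. This assumes a log of past runs is available as tuples of cell numbers; the `run_log` data and the `most_common_chains` helper are hypothetical, and nothing in this thread says the notebook records such a log:

```python
from collections import Counter


def most_common_chains(run_log, top=4):
    """Return the `top` most frequent execution chains with their counts."""
    counts = Counter(tuple(run) for run in run_log)
    return counts.most_common(top)


# Hypothetical history: each entry is the ordered cell numbers of one run.
run_log = [
    (1, 2, 3, 5, 7, 10),
    (1, 2, 4, 5, 7, 11),
    (1, 2, 3, 5, 7, 10),
    (1, 2, 3, 5, 7, 11),
    (1, 2, 3, 5, 7, 10),
    (1, 2, 4, 5, 7, 11),
]

for chain, count in most_common_chains(run_log):
    print("-".join(map(str, chain)), f"(run {count} times)")
```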

If it's too much overhead or adds too much complexity, I'd also be inclined to use programming features instead, because essentially that only adds 2 more variables to the notebook. But I think it'd be nicer to see, cleaner and much easier to manage if this were learned automatically, as we would no longer need 2 if conditions in each of the cells that depend on them.
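To make the "2 more variables" point concrete, here is a small sketch showing that the four example sequences are the combinations of two binary choices. The names `PREPROCESS_CELLS`, `FINAL_CELLS` and `chain` are hypothetical:

```python
from itertools import product

# Hypothetical choices: cell 3 or 4 early on, cell 10 or 11 at the end.
PREPROCESS_CELLS = (3, 4)
FINAL_CELLS = (10, 11)


def chain(preprocess_cell, final_cell):
    """Cell sequence selected by one setting of the two variables."""
    return (1, 2, preprocess_cell, 5, 7, final_cell)


# Every combination of the two variables gives one execution sequence.
chains = [chain(p, f) for p, f in product(PREPROCESS_CELLS, FINAL_CELLS)]
labels = ["-".join(map(str, c)) for c in chains]
```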

I hope this clarifies the need further. Let me know what you think.

Carreau commented 8 years ago

Yes, I see what you mean. The issue is that this does not scale to offline execution of notebooks, and in the case where you repeat the 1-2-4-5 pattern, you (likely?) want to extract it into a step that you can reuse between notebooks. Which is ... a Python module.

I think there are much more basic steps that we could teach people: I would say roughly 1/8 of our users don't know that Shift-Enter executes a cell, and click the "play" button in the toolbar every time.

I don't have statistics, but how many know they can:

And the machine learning trick could be pushed much further: mylist[len(mylist)-3:len(mylist)] is a typical thing to improve.

mylist = f()
for x in things:
    mylist.append(x)

is also regularly seen. All this can be experimented with in notebook extensions, but has a low chance of making it into the core directly.

I would look at https://github.com/jupyter-incubator/contentmanagement, which tries to solve some of these issues.
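For reference, a sketch of the shorter forms a tool might suggest for the two patterns above. The sample values are made up, and the slice equivalence holds for lists with at least three items:

```python
mylist = [10, 20, 30, 40, 50]

# Negative indices count from the end, so both lines take the last three items.
last_three_verbose = mylist[len(mylist) - 3:len(mylist)]
last_three = mylist[-3:]

things = ("a", "b", "c")

# Appending one item at a time in a loop...
looped = [0]
for x in things:
    looped.append(x)

# ...gives the same result as a single extend() call.
extended = [0]
extended.extend(things)
```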