nlesc-sherlock / Rig

Apache License 2.0

Let iPython notebook run in the background #7

Open dafnevk opened 8 years ago

dafnevk commented 8 years ago

Instead of recreating and running the generated notebook each time, it would be better to have the kernel running in the background and to evaluate new cells on this kernel. The flow is then as follows:

Init:
• Read in all IPython notebooks
• Create a dict with the available blocks (key=name, val=cell)
• Initiate the workflow as an empty list
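The init step above could be sketched like this. Since `.ipynb` files are plain JSON, this avoids extra dependencies; the `# block: <name>` first-line comment used to name a cell is a hypothetical convention, not something specified in this issue.

```python
import glob
import json

def read_blocks(pattern="notebooks/*.ipynb"):
    """Scan notebooks and build a dict of available blocks.

    Assumes (hypothetically) that each reusable cell starts with a
    comment of the form "# block: <name>".
    """
    blocks = {}  # key = block name, value = the cell dict
    for path in glob.glob(pattern):
        with open(path) as f:
            nb = json.load(f)
        for cell in nb.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = "".join(cell.get("source", []))
            first = source.splitlines()[0] if source else ""
            if first.startswith("# block:"):
                blocks[first.split(":", 1)[1].strip()] = cell
    return blocks

workflow = []  # the workflow starts out as an empty list
```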

Load data:
• Add a LoadData block to the flow
• Add a sample-data block to the flow
• Generate a cache block and add it to the flow
• Generate the notebook with the blocks
• Start up the Spark docker
• Start up the IPython notebook
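"Generate the notebook with the blocks" could look something like the sketch below: the workflow (a list of cell dicts) is assembled into a minimal v4 notebook on disk. The function name and notebook metadata are illustrative, not from the repo.

```python
import json

def write_notebook(workflow, path):
    """Assemble the workflow's cells into a minimal v4 notebook.

    `workflow` is the ordered list of cell dicts, e.g. the LoadData,
    sample and cache blocks followed by any user-selected blocks.
    """
    nb = {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": list(workflow),
    }
    with open(path, "w") as f:
        json.dump(nb, f, indent=1)
```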

If the user selects a block:
• Can we get argument values from the selected columns etc.? Otherwise, ask the user.
• A JSON message is generated with the block name and arguments
• The backend picks up this JSON and retrieves the corresponding cell
• Regenerate the notebook again
• Run the new cell (headless)
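A sketch of those last steps, assuming the frontend sends JSON like `{"blockname": ..., "arguments": {...}}` and that cells use `{placeholder}` templating for their arguments (both are assumptions, not from this issue). Headless execution uses `jupyter_client`, per the links below.

```python
import json

def render_block(blocks, message):
    """Turn a frontend JSON message into executable source.

    Hypothetical message shape: {"blockname": str, "arguments": dict}.
    Looks the cell up in the blocks dict and fills in its arguments.
    """
    req = json.loads(message)
    cell = blocks[req["blockname"]]
    source = "".join(cell["source"])
    return source.format(**req.get("arguments", {}))

def run_headless(kernel_client, source, timeout=60):
    """Evaluate one cell on an already-running kernel (no UI).

    `kernel_client` would be a jupyter_client BlockingKernelClient
    connected to the background kernel.
    """
    msg_id = kernel_client.execute(source)
    # Wait for the execute_reply that matches our request.
    while True:
        reply = kernel_client.get_shell_msg(timeout=timeout)
        if reply["parent_header"].get("msg_id") == msg_id:
            return reply["content"]["status"]  # "ok" or "error"
```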

If the user removes a block:
• Regenerate from cell 3 onwards (after sampling/caching)

If the user selects generate:
• (Remove the sample block.) Generate and output the notebook

Some useful links:
http://jupyter-client.readthedocs.io/en/latest/
https://gist.github.com/minrk/2620735

dafnevk commented 8 years ago

We can use the spark-flask docker to run a Spark/Flask server on which to run the notebook. Some considerations: