twosigma / beakerx

Beaker Extensions for Jupyter Notebook
http://BeakerX.com
Apache License 2.0

notebook namespace #89

Closed scottdraves closed 10 years ago

scottdraves commented 10 years ago

We need a way to communicate between cells from different evaluators. I propose this be done with a notebook namespace: a collection of variables that are part of the document model. Their values would be JSON.

Access to the variables would be through a module named "beaker", written for each plugin and loaded automatically. As a baseline, variables would be named by strings with a basic get/set API: beaker.get("x") and beaker.set("x", 1). Some languages, JavaScript in particular, might be able to do better, letting you write just beaker.x and beaker.x = 1. That would be nice, but we could live without it.
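To make the proposed API concrete, here is a minimal Python sketch of what a per-plugin module could look like. It is only an illustration: the class name `NotebookNamespace` is hypothetical, the values live in a local dict rather than in the document model, and nothing here is the actual Beaker implementation.

```python
import json


class NotebookNamespace:
    """Hypothetical stand-in for the proposed "beaker" module (not the real API)."""

    def __init__(self):
        # Bypass our own __setattr__ so the backing dict is a normal attribute.
        object.__setattr__(self, "_values", {})

    def set(self, name, value):
        # Round-trip through JSON so only JSON-representable values are stored.
        self._values[name] = json.loads(json.dumps(value))

    def get(self, name):
        return self._values[name]

    # Optional sugar for languages that support it: beaker.x and beaker.x = 1
    def __getattr__(self, name):
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self.set(name, value)


beaker = NotebookNamespace()
beaker.set("x", 1)                 # baseline string-keyed API
beaker.y = {"points": [1, 2, 3]}   # attribute-style sugar
```

The string-keyed `get`/`set` pair is the part every language can support; the attribute access is layered on top and can be dropped where a language makes it awkward.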

The big question is where the namespace is stored. Since the namespace is part of the document model (for example, it would be saved with the notebook), it should naturally live primarily there, in the JS in the web browser. But if code in plugin X accesses a value, it would have to make a REST call to Beaker Core, which would then use comet to get the value from the browser. This round trip could be expensive. So I propose keeping a copy of the namespace in Beaker Core and using comet to keep the two copies synchronized. REST calls are not as fast as in-memory access, but avoiding a trip over the WAN could be a huge win.

Mostly this synchronization can be done after the caller returns. The danger is that you write to one side and then evaluate a cell on the other side that reads the value before it has been updated. Without some kind of locking this is possible, because different channels are used for each plugin and the core. So when accessing the namespace you would have to block until you were guaranteed to get the right value. Along with the cell evaluation results, you could return a flag saying that there might be namespace changes coming. Similarly, the cell evaluation call could carry a flag saying that there might be changes coming. In both cases you only block if you need access.
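The blocking rule can be sketched as follows, with threads standing in for the separate channels. This is a rough illustration under assumptions: `MirroredNamespace`, `mark_pending`, and `apply_remote` are hypothetical names, and the comet and REST transports are replaced by direct method calls.

```python
import threading


class MirroredNamespace:
    """One side's copy of the namespace (hypothetical sketch, not Beaker code)."""

    def __init__(self):
        self._values = {}
        self._cond = threading.Condition()
        self._pending = False  # True while the other side may still send updates

    def mark_pending(self):
        # Called when an evaluation call or result carries the "changes coming" flag.
        with self._cond:
            self._pending = True

    def apply_remote(self, updates):
        # Called when the synchronization message from the other side arrives.
        with self._cond:
            self._values.update(updates)
            self._pending = False
            self._cond.notify_all()

    def set(self, name, value):
        with self._cond:
            self._values[name] = value

    def get(self, name, timeout=None):
        # Block only while updates may be in flight; otherwise return at once.
        with self._cond:
            if not self._cond.wait_for(lambda: not self._pending, timeout):
                raise TimeoutError("namespace updates did not arrive in time")
            return self._values[name]


ns = MirroredNamespace()
ns.set("x", 1)
ns.mark_pending()  # the flag says the other side wrote something
# Simulate the delayed synchronization message arriving on another channel.
threading.Timer(0.05, ns.apply_remote, args=({"x": 2},)).start()
value = ns.get("x", timeout=2)  # waits for the update instead of reading stale data
```

A cell that never touches the namespace never calls `get`, so it never waits, which matches the point that you only block if you need access.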

This is just a performance optimization, so a first pass at implementation should probably stick with keeping one copy in the JS.

It would be possible to keep the namespace just in the Core and not in the client. I think the variables are most likely to be accessed from JS, though (for application development and for saving the document). Another possibility is to keep a copy in every plugin, which would give very fast access but be expensive in memory.

The big alternative to this is using cell evaluation results as the bus, i.e., having one cell refer to the output of another cell. This has two problems. First, there is no obvious name to use. Second, because values are not explicitly shared, the optimization of duplicating the namespace becomes impractical.

scottdraves commented 10 years ago

On the client the minimum JS that completes this should be pretty straightforward, just allowing the extra information in the notebook model and responding to the cometd requests to access it.

But that ignores an important improvement that was not mentioned in the design above: visualization of the namespace. It seems obvious that the user would want to see a table with all the names and values they have defined. This could be a panel, like the output log, that can be hidden or shown. Or maybe it could be inline in the notebook, like an extra cell at the end (or beginning).
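As a rough idea of what such a table would contain, here is a small Python sketch that renders a namespace as aligned text. The function name `format_namespace_table` and the sample variables are made up for illustration; a real panel would be built in the client JS.

```python
import json


def format_namespace_table(namespace):
    """Render names and JSON values as a two-column plain-text table."""
    rows = [(name, json.dumps(value)) for name, value in sorted(namespace.items())]
    # Pad the name column to the longest name (or the header, if longer).
    width = max([len("name")] + [len(name) for name, _ in rows])
    lines = ["name".ljust(width) + "  value"]
    lines += [name.ljust(width) + "  " + value for name, value in rows]
    return "\n".join(lines)


# Hypothetical namespace contents, for illustration only.
table = format_namespace_table({"x": 1, "threshold": 0.5})
print(table)
```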

In addition to just seeing the values, one can imagine other actions you would want: changing a value, defining a new value, or being alerted when a value changes. Who knows.

tomlagatta commented 10 years ago

I would like a feature where I can easily share variables / values between cells of different programming languages. Here's my use case:

It would be great if I could either output that LaTeX report as a PDF, or output it as a ready-made TEX file (with images in directory), that I could then turn into a PDF using TeXShop.

scottdraves commented 10 years ago

Thanks Tom. The commit above has a branch with the beginnings of an implementation of this. Hopefully we'll complete it or something like it soon.

In the meantime, as a workaround, you can write the data to a temp file in one language, and read it in the other.
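A minimal sketch of that workaround, shown in Python for both sides and using JSON as the interchange format (an assumption; any format both languages can read would do). The helper names and the file naming scheme are hypothetical.

```python
import json
import os
import tempfile


def export_value(name, value, directory=None):
    """Write a JSON-serializable value to <directory>/<name>.json and return the path."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, name + ".json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(value, f)
    return path


def import_value(name, directory=None):
    """Read back a value written by export_value (or by another language)."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, name + ".json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The cell in the first language calls something like `export_value("results", data)`, and the cell in the second language opens the same path with its own JSON reader.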

tomlagatta commented 10 years ago

Thanks, Scott. I'll stay tuned for the implementation, and use the workaround if I need it in the meantime.

On Mon, May 5, 2014 at 4:17 PM, Scott Draves notifications@github.com wrote:

> Thanks Tom. The commit above has a branch with the beginnings of an implementation of this. Hopefully we'll complete it or something like it soon.
>
> In the meantime, as a workaround, you can write the data to a temp file in one language, and read it in the other.

scottdraves commented 10 years ago

implementation mostly done on this branch: https://github.com/twosigma/beaker-notebook/tree/namespace

scottdraves commented 10 years ago

done!