jupyter / jupyter_client

Jupyter protocol client APIs
https://jupyter-client.readthedocs.io
BSD 3-Clause "New" or "Revised" License

More specifications for raw data publications needed? #6

Open jankatins opened 9 years ago

jankatins commented 9 years ago

I'm currently thinking about how to get a string of code evaluated in a kernel and get the raw data back to the frontend. In my case the frontend is knitpy, a clone of knitr. knitr supports evaluating code chunk options in the context of the already evaluated code chunks (code chunks ~= cells in the notebook), and in knitpy the code chunks run in a kernel.

So my current thinking is to send the equivalent of

ret = eval({code})  # {code} is formatted with the right code string
publish_data({"ret":ret})

to the kernel and have a look at the data_pub message. But I want to support R and other kernels too, and I just found out that the implementation of publish_data in Python uses pickle, which is AFAIK not available in R.
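To make the pickle concern concrete, here is a minimal sketch using only the Python standard library. The `value` dict is a made-up example, not something publish_data itself produces; the point is only that pickle yields Python-specific bytes, while JSON yields text that other languages can read.

```python
import json
import pickle

# Hypothetical payload: the kind of simple result a chunk option might yield.
value = {"ret": [1, 2.5, True, "text"]}

# pickle produces a Python-specific byte stream. A kernel or frontend
# written in another language has no standard way to read or write it.
pickled = pickle.dumps(value)

# JSON is plain text that most languages, including R, can produce and parse.
as_json = json.dumps(value)

# Both round-trip inside Python; only the JSON form is portable.
assert pickle.loads(pickled) == value
assert json.loads(as_json) == value
print(type(pickled).__name__, type(as_json).__name__)  # prints "bytes str"
```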

So how would one implement the equivalent of publish_data in R (or other kernels), and how would one handle it reliably on the frontend side? E.g. do I have to keep track of what format the kernel sends? How would I handle it if the kernel switches the implementation?

The current message spec is rather vague on that part: https://ipython.org/ipython-doc/dev/development/messaging.html#raw-data-publication

CC: @flying-sheep as he is currently implementing display machinery in the R kernel

Carreau commented 9 years ago

From the docs:

No frontends directly handle data_pub messages at this time. It is currently only used by the client/engines in IPython.parallel, where engines may publish data to the Client, of which the Client can then publish representations via display_data to various frontends.

So aren't you looking for publish_display_data?

Carreau commented 9 years ago

(more precisely ipython_kernel/zmqshell.py:ZMQDisplayPublisher.publish)

jankatins commented 9 years ago

No, display_data sends a representation of the data (e.g. the string repr([...]) for a list), but I want the raw list, not a string representation. As far as I understand the message spec, knitpy would be a new frontend that does want to handle the `data_pub` message directly.

My problem is that the message spec does not mention which "serialization" methods are expected, or how to handle kernels that do not understand "pickle" (i.e. what should the R kernel implement?) :-)

minrk commented 9 years ago

Raw data publication is only for language-native data (e.g. pickled Python objects). It is only relevant for the R kernel to implement this message if you are also writing a native R client library that wants to receive native data.

jankatins commented 9 years ago

Ok :-(

Then how would I get a simple object (int, float, bool, string, or a list of such things) from a kernel to the frontend, across different kernels? Use a 'display_json' formatter, which sends a display_data message with a JSON mimetype?

My use case would be to do things like headline="the plot at i=" + i and get that evaluated to headline="the plot at i=5" if the kernel has a variable i=5. (I can also live with just the string, but sometimes it can be a list or an integer or ...)
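For a Python kernel, the idea could look roughly like the sketch below. The helper `build_eval_code`, the variable `_knit_value`, and the `"ret"` key are hypothetical names chosen for illustration, and the generated source assumes IPython's `display(..., raw=True)` is available inside the kernel. Other kernels (R, ...) would need their own template. Note that in Python the expression has to use `str(i)` to concatenate an integer.

```python
def build_eval_code(expression: str) -> str:
    """Return Python source that a frontend could send to a Python kernel.

    The generated code evaluates `expression` in the kernel's namespace and
    publishes the result as an application/json mimebundle via display_data.
    """
    return (
        "from IPython.display import display\n"
        # repr() embeds the expression as a valid Python string literal.
        f"_knit_value = eval({expression!r})\n"
        "display({'application/json': {'ret': _knit_value},\n"
        "         'text/plain': repr(_knit_value)}, raw=True)\n"
    )


code = build_eval_code('"the plot at i=" + str(i)')
# Syntax check only; actually running it requires a kernel with IPython.
compile(code, "<chunk-option>", "exec")
print(code)
```

This only works for values the kernel can serialize to JSON, which matches the simple types listed above.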

minrk commented 9 years ago

Yes, display_json would be appropriate for sending JSON data.
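On the frontend side, picking the JSON payload out of a display_data message could be sketched as follows. This assumes the message arrives as a plain dict with `header` and `content` keys, as described in the messaging spec; `extract_json` and the hand-built `sample` message are illustrative only and not part of any library.

```python
from typing import Any, Optional

JSON_MIME = "application/json"


def extract_json(msg: dict) -> Optional[Any]:
    """Return the application/json payload of a display_data message.

    Returns None for other message types or when no JSON payload is present.
    """
    if msg.get("header", {}).get("msg_type") != "display_data":
        return None
    return msg.get("content", {}).get("data", {}).get(JSON_MIME)


# Hand-built message for illustration; a real one would come from the
# kernel's IOPub channel.
sample = {
    "header": {"msg_type": "display_data"},
    "content": {
        "data": {
            "application/json": {"ret": "the plot at i=5"},
            "text/plain": "'the plot at i=5'",
        },
        "metadata": {},
    },
}

print(extract_json(sample))  # prints {'ret': 'the plot at i=5'}
```

Because the payload is keyed by mimetype, the frontend does not need to know which kernel language produced it.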

minrk commented 9 years ago

Or set up a comm, which lets you hook up whatever message structure you want.
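A rough sketch of what the comm route involves, using only the standard library. The message contents follow the `comm_open` / `comm_msg` shapes from the messaging spec (`comm_id`, `target_name`, `data`); the target name `knitpy.eval` and the `CommRouter` class are hypothetical, and a real frontend would send and receive these through its kernel client rather than building dicts by hand.

```python
import uuid
from typing import Any, Callable, Dict, Optional

TARGET_NAME = "knitpy.eval"  # hypothetical target name


def comm_open_content(target_name: str, data: Optional[dict] = None) -> dict:
    """Content of a comm_open message with a fresh comm_id."""
    return {"comm_id": uuid.uuid4().hex, "target_name": target_name, "data": data or {}}


def comm_msg_content(comm_id: str, data: dict) -> dict:
    """Content of a comm_msg message; `data` can be any JSON-able dict."""
    return {"comm_id": comm_id, "data": data}


class CommRouter:
    """Frontend-side table that maps a comm_id to a callback."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], None]] = {}

    def register(self, comm_id: str, callback: Callable[[Any], None]) -> None:
        self._handlers[comm_id] = callback

    def dispatch(self, msg_type: str, content: dict) -> bool:
        """Route a comm_msg to its callback; return True if it was handled."""
        if msg_type != "comm_msg":
            return False
        handler = self._handlers.get(content.get("comm_id"))
        if handler is None:
            return False
        handler(content["data"])
        return True


# Simulated exchange: the frontend opens a comm, the kernel replies on it.
opened = comm_open_content(TARGET_NAME)
router = CommRouter()
received = []
router.register(opened["comm_id"], received.append)
router.dispatch("comm_msg", comm_msg_content(opened["comm_id"], {"ret": [1, 2, 3]}))
print(received)  # prints [{'ret': [1, 2, 3]}]
```

The trade-off compared with display_json is that both sides must agree on the target name and the layout of `data`, so every supported kernel needs a matching handler.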