thierry-martinez / pyml

OCaml bindings for Python
BSD 2-Clause "Simplified" License

attach to a running kernel / make kernel accessible #23

Open nilsbecker opened 6 years ago

nilsbecker commented 6 years ago

a feature request: i would like to be able to have access to the same python process from ocaml and from an interactive jupyter python session (console, qtconsole, notebook). in this way i could visualize in python some bigarrays that i populate from ocaml.

can this be achieved by pyml? it seems jupyter provides an infrastructure for separating kernels from clients and this would just have to be used somehow...

UnixJunkie commented 2 years ago

@nilsbecker Can't you do it the other way around? I.e. use Py.Run.interactive to launch your jupyter thing once the ocaml side has finished populating your bigarrays?

Disclaimer: I don't use jupyter

nilsbecker commented 2 years ago

i'm not sure that will work. afaics Py.Run.interactive will launch a standard python toplevel. however what i'd need is something that can launch a jupyter python kernel. this is a wrapper around a python interpreter that knows how to send input/output via ZeroMQ and acts as a server. (fwiw such a kernel also exists for using ocaml in jupyter). then, a client process (such as ipython or the jupyter notebook) can attach to that kernel and send python statements to execute etc.

nilsbecker commented 2 years ago

see e.g. https://xeus.readthedocs.io/en/latest/dev.html

nilsbecker commented 2 years ago

the ultimate dream would be if ocaml (via pyml) and an ipython jupyter client could take turns interacting with the running python jupyter kernel. this is already possible for two python clients (notebook and console) interacting with a kernel. in my limited understanding, this would probably require the ocaml side to implement the jupyter client protocol, which seems fundamentally different from interacting with the python C API as pyml currently does. so it's not clear how or if something like this could work.
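
To give an idea of what "implementing the jupyter client protocol" would involve, here is a minimal sketch of how a client builds and signs one `execute_request` message, written in Python for brevity. It assumes the default `hmac-sha256` signature scheme and protocol version 5.3; the function name `build_execute_request` and the username are made up for illustration, and actually sending the frames over ZeroMQ is left out.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone


def build_execute_request(code, key, session):
    """Return the signed frames of an `execute_request` message.

    `key` is the "key" field of the kernel's connection file, as bytes.
    `session` is any unique string chosen by the client.
    """
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "ocaml-client",  # hypothetical name, any string works
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",  # assumed protocol version
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    # header, parent_header, metadata and content are serialized separately,
    # in this order; parent_header and metadata are empty for a new request.
    parts = [json.dumps(d).encode("utf-8") for d in (header, {}, {}, content)]
    # The signature is the hex HMAC digest over the concatenated parts.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    signature = mac.hexdigest().encode("ascii")
    # Frames for the shell socket: delimiter, signature, then the four parts.
    return [b"<IDS|MSG>", signature] + parts
```

An OCaml client would have to produce the same frames (JSON, HMAC, ZeroMQ), which is a rather different job from calling the Python C API.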

nilsbecker commented 2 years ago

i did some more research and found a way to launch a kernel from python by using some classes that ipython provides. https://jupyter-client.readthedocs.io/en/stable/wrapperkernels.html
this looks like it could be done from the ocaml side -- then a python jupyter kernel would be launched by pyml. i might try it with the simple 'echo kernel' example they provide when i get around to it.

nilsbecker commented 2 years ago

ok, a first experiment worked. the utop snippet below creates a simple jupyter kernel that just echoes the input and starts it from pyml. another jupyter client can then attach to it and hear the echo.

what remains to be done is to make a kernel that actually continues the running python interpreter on the fly -- can this be done? then one would have at least the following usage scenario: populate python state from ocaml using pyml; then launch the jupyter kernel with that state. use the state later from e.g. jupyter notebook.

what's not working (i think), is modifying the state again from ocaml.

#require "pyml";;

Py.initialize ();;

let m = Py.Import.add_module "ocaml";;

Py.Module.set m "example_value"
  (Py.List.of_list_map Py.Int.of_int [1;2;3]);;

let start = Py.File;;

Py.Run.eval ~start {|
from ocaml import example_value
print(example_value)|};;

let contents = {|
import _scproxy
from ipykernel.kernelbase import Kernel
class EchoKernel(Kernel):
    implementation = 'Echo'
    implementation_version = '1.0'
    language = 'no-op'
    language_version = '0.1'
    language_info = {
        'name': 'Any text',
        'mimetype': 'text/plain',
        'file_extension': '.txt',
    }
    banner = "Echo kernel - as useful as a parrot"
    def do_execute(self, code, silent, store_history=True, user_expressions=None,
                   allow_stdin=False):
        if not silent:
            stream_content = {'name': 'stdout', 'text': code}
            self.send_response(self.iopub_socket, 'stream', stream_content)
        return {'status': 'ok',
                # The base class increments the execution count
                'execution_count': self.execution_count,
                'payload': [],
                'user_expressions': {},
               }
import ocaml
print(ocaml.example_value)
|};;

Py.Run.eval ~start contents;;

Py.Run.eval ~start {|
from ipykernel.kernelapp import IPKernelApp
IPKernelApp.launch_instance(kernel_class=EchoKernel)
|};;

UnixJunkie commented 2 years ago

Another idea: if your bigarray is memory mapped to a named file, and if python can also memmap that file, the python process should be able to access the same bigarray as the OCaml side. Then, you are just left with the problem of synchronizing the ocaml process and the python process, so that when the bigarray is read/inspected, it is in a stable/consistent state (i.e. not being accessed for writing at the same time).
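
A minimal sketch of the python side of this idea, using only the standard library. It assumes the file holds a flat array of native-endian float64 values, which is what a C-layout float64 bigarray mapped with Unix.map_file should produce. `write_float64_array` only stands in for the OCaml writer so the example runs on its own; both function names and the file name are made up, and synchronization between the two processes is not handled here.

```python
import mmap
import os
import struct
import tempfile


def write_float64_array(path, values):
    """Stand-in for the OCaml side: dump native-endian float64 values."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"={len(values)}d", *values))


def read_float64_array(path):
    """Map `path` read-only and return its contents as a list of floats."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Reinterpret the mapped bytes as float64 without an extra copy;
            # both views are released before the mapping is closed.
            with memoryview(mm) as raw, raw.cast("d") as view:
                return view.tolist()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "shared.bin")
    write_float64_array(path, [1.0, 2.5, -3.0])
    print(read_float64_array(path))
```

With numpy installed, `numpy.memmap` would give an array view of the same file directly.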

nilsbecker commented 2 years ago

yes, i have used that approach in the past and it works. i found it a bit cumbersome though, that's why i'm looking for something that would eventually be simpler and would allow exchanging more data types

nilsbecker commented 2 years ago

I think I found a workable solution! The IPython embedding docs show that

from IPython import embed_kernel
embed_kernel()

gives you an IPython Jupyter kernel that inherits all the current state. I tried it with the following utop session:

#require "pyml";;

Py.initialize ();;

let m = Py.Import.add_module "ocaml";;

Py.Module.set m "example_value"
  (Py.List.of_list_map Py.Int.of_int [1;2;3]);;

let start = Py.File;;

Py.Run.eval ~start {|
from ocaml import example_value
print(example_value)|};;

let contents = {|
from IPython import embed_kernel
embed_kernel()
|};;

Py.Run.eval ~start contents;;

After the last call I get a kernel connection file name printed, to which i can connect with jupyter console --existing kernel-XXXXX.json and print the example_value.
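
For reference, the connection file is a small JSON document listing the ports and the signing key of the kernel. The sketch below turns such a file into the ZeroMQ endpoints a client would connect to. It assumes the `tcp` transport; the helper name `kernel_endpoints` and the port numbers in the example are made up for illustration.

```python
import json


def kernel_endpoints(connection_info):
    """Map channel names to ZeroMQ endpoint URLs (tcp transport assumed)."""
    transport = connection_info["transport"]
    ip = connection_info["ip"]
    channels = ("shell", "iopub", "stdin", "control", "hb")
    return {
        name: f"{transport}://{ip}:{connection_info[name + '_port']}"
        for name in channels
    }


if __name__ == "__main__":
    # Illustrative values only; a real file is written by the kernel.
    example = json.loads("""
    {
      "transport": "tcp",
      "ip": "127.0.0.1",
      "shell_port": 50001,
      "iopub_port": 50002,
      "stdin_port": 50003,
      "control_port": 50004,
      "hb_port": 50005,
      "key": "not-a-real-key",
      "signature_scheme": "hmac-sha256"
    }
    """)
    for name, url in kernel_endpoints(example).items():
        print(name, url)
```

Any client that knows these endpoints and the key can attach to the kernel, which is what jupyter console --existing does with the file.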

So this is basically what I wanted! The only shortcoming is that I currently see no way to access the kernel again from ocaml. Unlike IPython.embed, which lets you close the embedded IPython session to return to the plain Python process, IPython.embed_kernel does not appear to offer this. I also would not know how it could: we would not want to kill the kernel while other clients are still connected to it. Anyway, so far so good.