ouseful-testing / jupyterlite_remote_kernel

Can we connect to a remote Jupyter server or MyBinder kernel from JupyterLite?

Can we connect to a remote MyBinder service? #1

Open psychemedia opened 2 months ago

psychemedia commented 2 months ago

I'm not sure how responsive MyBinder is anymore. Here is claude.ai's suggestion, in part as a response to a possible MyBinder magic:

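// BaseKernel is the base class that @jupyterlite/kernel exports for custom kernels.
import { BaseKernel } from '@jupyterlite/kernel';
import { PromiseDelegate } from '@lumino/coreutils';
// loadPyodide comes from the pyodide npm package.
import { loadPyodide } from 'pyodide';

class BinderKernel extends BaseKernel {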
  constructor(options) {
    super(options);
    this._ready = new PromiseDelegate();
    this._websocket = null;
    this._executeRequests = new Map();
    this.githubRepo = options.githubRepo || 'jupyter/notebook';  // Default repo
  }

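  // Assumption: nothing in BaseKernel calls start(), so the code that creates
  // the kernel has to call it once.
  async start() {
    // Keep a single interpreter: a second loadPyodide() call would return a
    // fresh one in which binder_kernel is not defined.
    this._pyodide = await loadPyodide();
    // Hand the repo over as data instead of splicing it into Python source.
    this._pyodide.globals.set('github_repo', this.githubRepo);
    await this._pyodide.runPythonAsync(`
      import asyncio
      import codecs
      import json
      import uuid
      from datetime import datetime, timezone
      from js import WebSocket
      from pyodide.ffi import create_proxy
      from pyodide.http import pyfetch

      NEWLINE = chr(10)  # avoids a backslash escape inside the JS template literal

      class BinderConnection:
          def __init__(self, github_repo, ref='HEAD'):
              self.github_repo = github_repo
              self.ref = ref
              self.ws = None
              self.base_url = None
              self.token = None
              self.session = uuid.uuid4().hex
              self.inbox = asyncio.Queue()

          async def start_binder(self):
              binder_url = f'https://mybinder.org/build/gh/{self.github_repo}/{self.ref}'
              # pyfetch has no streaming iterator, so read the underlying
              # JavaScript Response body chunk by chunk.
              response = await pyfetch(binder_url)
              reader = response.js_response.body.getReader()
              decode = codecs.getincrementaldecoder('utf-8')().decode
              buffer = ''
              while True:
                  chunk = await reader.read()
                  if chunk.done:
                      raise RuntimeError('Binder stream ended before the server was ready')
                  buffer += decode(bytes(chunk.value.to_py()))
                  # Chunks do not align with lines; keep the unfinished tail.
                  *lines, buffer = buffer.split(NEWLINE)
                  for line in lines:
                      if not line.startswith('data:'):
                          continue
                      msg = json.loads(line[len('data:'):])
                      if msg.get('phase') == 'failed':
                          raise RuntimeError(msg.get('message', 'Binder build failed'))
                      if msg.get('phase') == 'ready':
                          self.token = msg['token']
                          # Server base URL, e.g. https://<hub>/user/<id>/
                          self.base_url = msg['url'].rstrip('/') + '/'
                          await reader.cancel()
                          return

          async def connect(self):
              await self.start_binder()
              kernel_id = await self._start_kernel()
              # Sockets opened by the websockets package are not available in
              # the browser, so use the browser's own WebSocket instead.
              ws_base = 'ws' + self.base_url[len('http'):]  # https:// -> wss://
              opened = asyncio.get_event_loop().create_future()
              self.ws = WebSocket.new(
                  f'{ws_base}api/kernels/{kernel_id}/channels?token={self.token}')
              self.ws.onopen = create_proxy(lambda event: opened.set_result(None))
              self.ws.onmessage = create_proxy(
                  lambda event: self.inbox.put_nowait(event.data))
              await opened

          async def _start_kernel(self):
              # Keep the /user/<id>/ prefix: the REST API lives under the base URL.
              url = f'{self.base_url}api/kernels?token={self.token}'
              response = await pyfetch(url, method='POST')
              data = await response.json()
              return data['id']

          async def send_execute_request(self, msg_id, code):
              # A complete message in the Jupyter messaging format, sent on
              # the shell channel.
              msg = {
                  'header': {
                      'msg_id': msg_id,
                      'msg_type': 'execute_request',
                      'session': self.session,
                      'username': '',
                      'version': '5.3',
                      'date': datetime.now(timezone.utc).isoformat(),
                  },
                  'parent_header': {},
                  'metadata': {},
                  'channel': 'shell',
                  'content': {
                      'code': code,
                      'silent': False,
                      'store_history': True,
                      'user_expressions': {},
                      'allow_stdin': False,
                      'stop_on_error': True,
                  },
              }
              self.ws.send(json.dumps(msg))

          async def receive_message(self):
              # Raw JSON text; it is parsed on the JavaScript side.
              return await self.inbox.get()

      binder_kernel = BinderConnection(github_repo)
      await binder_kernel.connect()
    `);

    this._ready.resolve();
  }

  async executeRequest(content) {
    await this._ready.promise;

    const msgId = crypto.randomUUID();
    const reply = new PromiseDelegate();
    this._executeRequests.set(msgId, reply);

    // Start one receive loop, before sending, so that no reply is missed.
    if (!this._loopStarted) {
      this._loopStarted = true;
      void this._startMessageLoop();
    }

    // The same msgId goes into the message header so the reply can be matched.
    await this._pyodide.runPythonAsync(`
      await binder_kernel.send_execute_request(${JSON.stringify(msgId)}, ${JSON.stringify(content.code)})
    `);

    return reply.promise;
  }

  async _startMessageLoop() {
    while (true) {
      // receive_message() returns the raw JSON text of the next message.
      const raw = await this._pyodide.runPythonAsync(`
        await binder_kernel.receive_message()
      `);
      const msg = JSON.parse(raw);
      const msgType = msg.header.msg_type;
      const parentId = msg.parent_header.msg_id;

      if (msgType === 'execute_reply') {
        const reply = this._executeRequests.get(parentId);
        if (reply) {
          // executeRequest() resolves to the content of the reply message.
          reply.resolve(msg.content);
          this._executeRequests.delete(parentId);
        }
      } else if (msgType === 'stream') {
        // Handle output
        this._stream(msg);
      } else if (msgType === 'display_data') {
        // Handle display data
        this._display(msg);
      }
      // Add more message type handlers as needed
    }
  }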

  _stream(msg) {
    const content = {
      name: msg.content.name,
      text: msg.content.text
    };
    this.stream(content);
  }

  _display(msg) {
    this.displayData(msg.content);
  }
}

// Export the kernel
export default BinderKernel;
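A remote kernel ignores a message that carries only msg_type and code, so it may help to see the full message shape. The following is a minimal sketch, assuming version 5 of the Jupyter messaging format and the "channel" field used on the Jupyter Server WebSocket. The helper names make_execute_request and is_reply_to are made up for illustration and are not part of any library.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session, msg_id=None):
    """Build a shell-channel execute_request in the Jupyter messaging format."""
    return {
        "header": {
            "msg_id": msg_id or uuid.uuid4().hex,
            "msg_type": "execute_request",
            "session": session,
            "username": "",
            "version": "5.3",
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        # The server WebSocket multiplexes shell, iopub, etc. via this field.
        "channel": "shell",
    }


def is_reply_to(reply, request):
    """True when reply answers request, matched through parent_header.msg_id."""
    return reply.get("parent_header", {}).get("msg_id") == request["header"]["msg_id"]


request = make_execute_request("print(1 + 1)", session=uuid.uuid4().hex)
wire_text = json.dumps(request)  # the text that would be sent over the WebSocket
```

Matching on parent_header.msg_id is what lets a client route an execute_reply back to the request that caused it, which is why the request's msg_id has to be chosen by the sender.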

This updated implementation does the following:

- It takes a githubRepo parameter in the constructor, which specifies the GitHub repository to use with MyBinder.
- The start method now includes logic to start a MyBinder instance based on the specified GitHub repo.
- HTTP requests made from within Pyodide go through pyfetch from pyodide.http, which is how the code talks to the MyBinder API.
- The BinderConnection class handles the process of starting a Binder instance, waiting for it to be ready, and then connecting to the Jupyter kernel it provides.
- Once connected, it behaves similarly to our previous remote kernel implementation, sending execute requests and handling responses.
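Waiting for the Binder instance to be ready means reading an event stream whose chunks need not end on line boundaries. Here is a rough sketch of that buffering, assuming each event arrives as a "data: {json}" line with "phase", "url" and "token" fields as in the code above. The class name, the sample chunks and the hub.example.org URL are placeholders, not real Binder output.

```python
import json


class BuildStreamParser:
    """Incrementally extract JSON payloads from 'data: ...' lines of a stream."""

    def __init__(self):
        self._buffer = ""

    def feed(self, text):
        """Add a decoded chunk and return the events it completed."""
        self._buffer += text
        # Everything before the last newline is complete; keep the tail.
        *lines, self._buffer = self._buffer.split("\n")
        events = []
        for line in lines:
            line = line.strip()
            if not line.startswith("data:"):
                continue  # blank separators, comments, keep-alives
            try:
                events.append(json.loads(line[len("data:"):]))
            except json.JSONDecodeError:
                continue  # skip only malformed payloads, not every error
        return events


def ready_event(events):
    """Return the first event whose phase is 'ready', or None."""
    return next((e for e in events if e.get("phase") == "ready"), None)


# Hypothetical chunks, split mid-line the way a network stream might deliver them.
parser = BuildStreamParser()
chunks = [
    'data: {"phase": "building", "message": "Step 1"}\n\nda',
    'ta: {"phase": "ready", "url": "https://hub.example.org/user/abc/", ',
    '"token": "s3cret"}\n\n',
]
seen = [event for chunk in chunks for event in parser.feed(chunk)]
ready = ready_event(seen)
```

Catching only json.JSONDecodeError, rather than using a bare except, keeps genuine failures such as a missing token visible.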

To use this kernel:

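- Save this code as binder-kernel.js in your JupyterLite project.
- Modify your JupyterLite configuration to include this custom kernel. JupyterLite kernels are normally registered from a JupyterLab extension through the IKernelSpecs service, so a static entry on its own may not be enough; as an untested starting point, add something like this to your jupyter-lite.json: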

{
  "jupyter-config-data": {
    "kernelSpecs": {
      "binder-kernel": {
        "name": "binder-kernel",
        "spec": {
          "display_name": "Binder Kernel",
          "language": "python",
          "argv": ["binder-kernel.js"]
        }
      }
    }
  }
}

- When creating a new notebook in JupyterLite, users can select the "Binder Kernel" option.
- To specify a different GitHub repo, you would need to modify the kernel creation process to accept parameters. This might involve creating a custom kernel selection UI in JupyterLite.
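Whatever UI ends up collecting the repo, the parameters reduce to a handful of URLs. The sketch below assumes the BinderHub build endpoint has the form /build/gh/<owner>/<repo>/<ref>, and that the launched server exposes the standard Jupyter Server routes api/kernels and api/kernels/<id>/channels under its base URL. The function names and the example host are hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def binder_build_url(repo, ref="HEAD", hub="https://mybinder.org"):
    """URL of the build event stream for a GitHub repo ('owner/name') and ref."""
    # The ref is fully encoded so that branch names containing '/' survive.
    return f"{hub}/build/gh/{quote(repo, safe='/')}/{quote(ref, safe='')}"


def kernel_urls(server_url, token, kernel_id=None):
    """REST and WebSocket endpoints under a running server's base URL."""
    base = server_url.rstrip("/") + "/"
    rest = f"{base}api/kernels?token={quote(token, safe='')}"
    if kernel_id is None:
        return rest, None
    parts = urlsplit(base)
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"{parts.path}api/kernels/{quote(kernel_id, safe='')}/channels"
    query = f"token={quote(token, safe='')}"
    return rest, urlunsplit((ws_scheme, parts.netloc, path, query, ""))


build_url = binder_build_url("jupyter/notebook")
rest_url, ws_url = kernel_urls("https://hub.example.org/user/abc/", "tok", "k1")
```

Both kernel URLs are built from the full base URL, including the /user/<id>/ prefix, because requests sent to the bare host would not reach the user's server.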

If it works, this implementation lets JupyterLite users run code on MyBinder's infrastructure, in environments specified by GitHub repositories, with the browser acting only as the client. Note that this approach has some limitations:

- It may take some time to start up, especially if the Binder environment needs to be built from scratch.
- It relies on the availability and responsiveness of the MyBinder service.
- The kernel will not persist after the browser session ends, as it's running on a temporary Binder instance.
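Because startup time and service availability are both uncertain, a client would probably want to bound how long it waits. A minimal sketch using asyncio.wait_for follows; fake_launch is a hypothetical stand-in for the real launch coroutine, and the timeouts are arbitrary.

```python
import asyncio


async def launch_with_timeout(launch, timeout_s):
    """Await the coroutine launch, giving up after timeout_s seconds."""
    try:
        return await asyncio.wait_for(launch, timeout=timeout_s)
    except asyncio.TimeoutError:
        raise RuntimeError(
            f"Binder launch did not finish within {timeout_s} s"
        ) from None


async def fake_launch(delay_s):
    """Hypothetical stand-in for the real Binder launch; it only sleeps."""
    await asyncio.sleep(delay_s)
    return "ready"


async def demo():
    fast = await launch_with_timeout(fake_launch(0.01), timeout_s=1.0)
    try:
        await launch_with_timeout(fake_launch(1.0), timeout_s=0.05)
        slow = "ready"
    except RuntimeError as err:
        slow = str(err)
    return fast, slow


fast_result, slow_result = asyncio.run(demo())
```

In the kernel above, the same idea would let start() reject its ready promise with a clear error instead of leaving every executeRequest call waiting forever.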

psychemedia commented 2 months ago

I've not tested the above, but I wonder - should I seed the conversation first with an example of the echo kernel?