marimo-team / marimo

A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
https://marimo.io
Apache License 2.0

[anywidget compat]: Nested binary data needs to be base64-decoded in frontend #2366

Closed manzt closed 2 months ago

manzt commented 2 months ago

Describe the bug

Kind of a tricky edge case. I tried to find and fix where the backend comm message is unpacked on the frontend, but got lost...

The binary data from a state dict is extracted prior to sending over the comms. marimo seems to handle this just fine when the binary data is a top-level piece of state (e.g., a binary traitlet):

class Widget(anywidget.AnyWidget):
   _esm = "index.js"
   bdata = traitlets.Bytes().tag(sync=True)

mo.ui.anywidget(Widget(bdata=b"Hello world")) # model.get("bdata") // DataView

However, the buffer packing/unpacking needs to be applied recursively to the state object. With nested data, the value arrives on the frontend as a base64-encoded string instead:

class Widget(anywidget.AnyWidget):
   _esm = "index.js"
   bdata = traitlets.Dict().tag(sync=True)

mo.ui.anywidget(Widget(bdata={ "value": b"Hello world" })) # model.get('bdata') // { value: <base64-encoded string> }
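
For illustration, here is a minimal sketch of what recursive packing/unpacking could look like. It is written in Python only to show the idea; the actual fix would live in marimo's frontend. The helper names `remove_buffers` and `put_buffers` are hypothetical, loosely modeled on the `buffer_paths` concept from the Jupyter widget messaging protocol, and are not marimo's or anywidget's real code. Using `None` as the placeholder for an extracted value is also an assumption of this sketch.

```python
import base64
from typing import Any


def remove_buffers(state: dict) -> tuple[dict, list[list], list[bytes]]:
    """Return a JSON-safe copy of `state`, plus the paths and contents of
    every binary value found at any nesting depth (hypothetical helper)."""
    buffer_paths: list[list] = []
    buffers: list[bytes] = []

    def walk(obj: Any, path: list) -> Any:
        if isinstance(obj, (bytes, bytearray, memoryview)):
            buffer_paths.append(path)
            buffers.append(bytes(obj))
            return None  # placeholder; the real bytes travel separately
        if isinstance(obj, dict):
            return {k: walk(v, path + [k]) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [walk(v, path + [i]) for i, v in enumerate(obj)]
        return obj

    return walk(state, []), buffer_paths, buffers


def put_buffers(state: dict, buffer_paths: list[list], buffers: list[bytes]) -> dict:
    """Write each buffer back at its recorded path (mutates and returns `state`)."""
    for path, buf in zip(buffer_paths, buffers):
        target = state
        for key in path[:-1]:
            target = target[key]
        target[path[-1]] = buf
    return state


# Round trip, with base64 standing in for a text-only transport.
original = {"bdata": {"value": b"Hello world"}}
stripped, paths, bufs = remove_buffers(original)
wire = [base64.b64encode(b).decode("ascii") for b in bufs]
restored = put_buffers(stripped, paths, [base64.b64decode(s) for s in wire])
assert restored == original
```

The point of recording a full path per buffer, rather than only a top-level key, is that the receiving side can decode `bdata.value` without guessing which strings in the state happen to be base64.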

This is kind of weird behavior, but I know some widgets that make use of it (e.g., for simple serialization of a numpy array).

Environment

{
  "marimo": "0.8.17",
  "OS": "Darwin",
  "OS Version": "23.6.0",
  "Processor": "arm",
  "Python Version": "3.12.5",
  "Binaries": {
    "Browser": "128.0.6613.138",
    "Node": "v22.1.0"
  },
  "Dependencies": {
    "click": "8.1.7",
    "importlib-resources": "missing",
    "jedi": "0.19.1",
    "markdown": "3.7",
    "pygments": "2.18.0",
    "pymdown-extensions": "10.9",
    "ruff": "0.6.5",
    "starlette": "0.38.5",
    "tomlkit": "0.13.2",
    "typing-extensions": "missing",
    "uvicorn": "0.30.6",
    "websockets": "12.0"
  },
  "Optional Dependencies": {}
}

Code to reproduce

A numpy example. In Jupyter this displays true; in marimo it displays false.

import anywidget
import traitlets
import numpy as np

class Widget(anywidget.AnyWidget):
    _esm = """
    function render({ model, el }) {
      let arr = model.get("arr");
      el.innerText = arr.bytes instanceof DataView;
    }
    export default { render };
    """
    arr = traitlets.Dict().tag(sync=True)

arr = np.array([1,2,3])
Widget(arr={ "bytes": arr.tobytes(), "shape": arr.shape, "dtype": str(arr.dtype) })

mscolnick commented 2 months ago

@manzt - are you on main or the latest release? I wonder if this got fixed by https://github.com/marimo-team/marimo/pull/2358

(i will test it right now)

EDIT: I don't think it was fixed, I'll look into this now.

manzt commented 2 months ago

latest release

manzt commented 2 months ago

btw, I'm pretty sure with this feature we can get most of jupyter-scatter working.