Jupyter Notebook Roadmap

The initial version of Jupyter support established in the Jupyter MVP is still extremely limited.

This is a tracking issue intended to plot out the longer-term goals for using notebooks with Rerun.

Much of this is motivated by the desire to support:

https://github.com/rerun-io/rerun/issues/1630

The Long-term Vision

Two Different Workflows

The general pattern of creating a view within a notebook always involves three distinct steps:

1. Create a Recording
2. Send data to the Recording
3. Combine the Recording with a Blueprint to create a View

Although step 1 always comes first, it's important to note that steps 2 and 3 can happen in either order.

When step 2 comes before step 3, we call this the "end-of-cell" workflow. The viewer is not emitted until the end of cell execution, which means it loads a single static recording payload. The same recording can still be used to create additional views of the data (either in the same cell or in subsequent cells) without rerunning the computation. In practice this resembles our current "save / open RRD" standalone modes, and it is the only mode supported by the current Jupyter MVP.

When step 3 comes before step 2, however, we call this the "incremental-cell" workflow. In this mode, the View context is created first, and data is then live-streamed into it incrementally. This could all happen within a single long-running cell, or multiple cells could incrementally update a viewer instance output by a previous cell. In practice this is closer to the standalone "connect to viewer" mode.
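To make the ordering difference concrete, here is a toy model of the two workflows. It is only a sketch: `ToyRecording` and `ToyView` are hypothetical stand-ins invented for illustration and are not part of the Rerun SDK. The static view copies the recording once, while the live view is created first and then receives each message as it is logged.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, object]  # (entity path, payload)


@dataclass
class ToyRecording:
    """Hypothetical stand-in for a Recording: stores messages, notifies listeners."""
    messages: List[Message] = field(default_factory=list)
    listeners: List[Callable[[Message], None]] = field(default_factory=list)

    def log(self, path: str, payload: object) -> None:
        msg = (path, payload)
        self.messages.append(msg)
        for notify in self.listeners:
            notify(msg)


@dataclass
class ToyView:
    """Hypothetical stand-in for a View: holds the messages it has been given."""
    shown: List[Message] = field(default_factory=list)

    @classmethod
    def static(cls, rec: ToyRecording) -> "ToyView":
        # End-of-cell: the view gets one static payload, copied at creation.
        return cls(shown=list(rec.messages))

    @classmethod
    def live(cls, rec: ToyRecording) -> "ToyView":
        # Incremental-cell: the view exists first and is fed as data arrives.
        view = cls(shown=list(rec.messages))
        rec.listeners.append(view.shown.append)
        return view


# End-of-cell workflow: step 1, then step 2, then step 3.
rec_a = ToyRecording()
rec_a.log("car/trajectory", [0, 1, 2])
static_view = ToyView.static(rec_a)
rec_a.log("car/trajectory", [3])        # logged after the view: not shown

# Incremental-cell workflow: step 1, then step 3, then step 2.
rec_b = ToyRecording()
live_view = ToyView.live(rec_b)
rec_b.log("car/trajectory", [0, 1, 2])  # streamed into the existing view
rec_b.log("car/trajectory", [3])
```

Calling `ToyView.static(rec_a)` a second time would produce another view over the same recording without logging anything again, which mirrors the "additional views" property of the end-of-cell workflow.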

Creating Blueprints for Views

Regardless of which workflow is being employed, ergonomic APIs for creating these blueprints are an essential part of the Jupyter experience. A user must be able to:

Choose what data exists in their view
Choose the type of view that will be created
Potentially lay out multiple views
Apply additional styling to that data
Filter the data in different ways

We suspect at least two ways that users might want to construct these blueprints:

An object-oriented builder-style API, such as:

view1 = rec.view3d(base='car', paths=['car/sensors/lidar', 'car/detections/*'])
view2 = rec.view2d(base='world', paths=['world/map', 'car/trajectory', 'car/detections/bbox'])
rr.horizontal_layout(view1, view2)

A "config-file" style JSON or YAML document.
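As a rough illustration of the config-file option, the two views from the builder-style example could be written as a JSON document and loaded with a small validator. The schema here is purely an assumption: the key names (`layout`, `views`, `type`, `base`, `paths`), the `vertical` layout value, and the `load_blueprint` helper are all hypothetical and do not describe an existing Rerun format.

```python
import json

# Hypothetical schema: every key name below is illustrative only.
BLUEPRINT_JSON = """
{
  "layout": "horizontal",
  "views": [
    {"type": "3d", "base": "car",
     "paths": ["car/sensors/lidar", "car/detections/*"]},
    {"type": "2d", "base": "world",
     "paths": ["world/map", "car/trajectory", "car/detections/bbox"]}
  ]
}
"""

ALLOWED_VIEW_TYPES = {"2d", "3d"}
ALLOWED_LAYOUTS = {"horizontal", "vertical"}  # "vertical" is an assumed option


def load_blueprint(text: str) -> dict:
    """Parse a blueprint document and apply a few minimal sanity checks."""
    doc = json.loads(text)
    if doc.get("layout") not in ALLOWED_LAYOUTS:
        raise ValueError(f"unknown layout: {doc.get('layout')!r}")
    for view in doc.get("views", []):
        if view.get("type") not in ALLOWED_VIEW_TYPES:
            raise ValueError(f"unknown view type: {view.get('type')!r}")
        if not view.get("paths"):
            raise ValueError("each view needs at least one path")
    return doc


blueprint = load_blueprint(BLUEPRINT_JSON)
```

A document like this is easy to store next to a notebook or to share between users, at the cost of losing the editor completion and type checking that a builder-style API would give.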

TBD(@jleibs) continue populating this section.

Tracking Backlog

- [x] https://github.com/rerun-io/rerun/issues/1808
- [ ] Investigate introducing a "RecordingHandle" to the Python SDK to simplify some of the global-context/state pieces. The handle would expose all the existing Python APIs and eventually allow creation of a blueprint, which would return a Jupyter-renderable object.
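```
import rerun as rr

rh = rr.start_recording()    # proposed: returns a RecordingHandle
rh.log_image(...)            # same logging APIs as the rr module
rh.log_points(...)
rh.blueprint().view('img/')  # proposed: returns a Jupyter-renderable object
```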
The existing rr APIs would just pass through to the default recording handle. (https://github.com/rerun-io/rerun/issues/1903)
- [ ] Long-term, optionally move Rerun from an iframe back to a single instance controlling multiple canvases. This would allow multiple views (from different blueprints) on top of the same data without duplicating memory. (Note: this won't work in Google Colab, so we will always want to support an iframe-isolated model as well.)
- [ ] The "right" way of outputting data in JupyterLab is ultimately a custom mime-type and renderer extension (https://jupyterlab.readthedocs.io/en/stable/user/file_formats.html). Ideally we would still support either inlined recordings or a reference to a recording-id on an existing server instance. NOTE: this might not be as portable or worth the effort.
- [ ] Port to ipywidgets. See: this guide. This seems like the best candidate for cross-platform support, including bidirectional sync for features like retrieving blueprint data back from the viewer or eventually supporting user callbacks.
- [ ] Rather than encoding the entire RRD as a blob, we should be able to use the IPython websocket to incrementally send (batched?) messages to the Rerun server. ipywidgets (above) is a good candidate for handling this kind of data flow.
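
For the last item, incremental sending with batching might look roughly like the sketch below. It is transport-agnostic on purpose: `BatchingSender` and its `send` callback are hypothetical names, and `send` stands in for whatever channel is eventually chosen (for example a widget comm). Nothing here is an existing Rerun or ipywidgets API.

```python
from typing import Callable, List


class BatchingSender:
    """Buffers encoded messages and forwards them in batches.

    `send` is a placeholder for the real transport; here it is simply a
    callable that accepts a list of byte strings.
    """

    def __init__(self, send: Callable[[List[bytes]], None], max_batch: int = 3):
        if max_batch < 1:
            raise ValueError("max_batch must be at least 1")
        self._send = send
        self._max_batch = max_batch
        self._pending: List[bytes] = []

    def push(self, message: bytes) -> None:
        # Queue one message and flush once the batch is full.
        self._pending.append(message)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        # Send whatever is queued; do nothing if the queue is empty.
        if self._pending:
            batch, self._pending = self._pending, []
            self._send(batch)


# Usage: collect the batches in a list instead of sending them anywhere.
sent_batches: List[List[bytes]] = []
sender = BatchingSender(sent_batches.append, max_batch=3)
for i in range(7):
    sender.push(f"msg-{i}".encode())
sender.flush()  # send the trailing partial batch
```

A size-based threshold is the simplest policy. A real implementation would probably also flush on a timer or at the end of cell execution so that a slow producer does not leave data sitting in the buffer.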
