educational-technology-collective / jupyterlab-pioneer

A JupyterLab extension for generating and exporting JupyterLab event telemetry data.
https://jupyterlab-pioneer.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Need to have an exporter which takes configuration from the notebook metadata #12

Closed cab938 closed 10 months ago

cab938 commented 11 months ago

The exporter API is currently a Jupyter config file which includes a function to do the exporting and a set of arguments which the function can process however it wants (e.g. URL endpoints, passwords, whatever it needs to do its job). The function itself takes in a JSON object which captures the event information.
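For illustration, a minimal sketch of that shape: an exporter function plus a free-form argument dict, with every event handed to every configured exporter. All names here (`list_exporter`, `dispatch`, the `func`/`args` keys, the `sink` argument) are hypothetical and are not the extension's actual config schema; the in-memory list stands in for a URL endpoint or similar.

```python
import json
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Exporter = Callable[[Event, Dict[str, Any]], Dict[str, Any]]


def list_exporter(event: Event, args: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical exporter: serialize the event and append it to args["sink"].

    The sink is a plain list standing in for a real destination
    (URL endpoint, S3 bucket, websocket, ...).
    """
    args["sink"].append(json.dumps(event, sort_keys=True))
    return {"exporter": "list_exporter", "exported": len(args["sink"])}


def dispatch(event: Event, exporters: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Send one event to every configured exporter.

    Note that nothing here depends on which notebook produced the event.
    """
    return [entry["func"](event, entry["args"]) for entry in exporters]
```

The point of the sketch is that the exporter only ever sees the event and its static arguments; it has no per-notebook context unless the event itself carries it.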

To log using the notebook metadata with this exporter approach, one would need to capture the OpenNotebook event (because all of the notebook metadata is passed with it) and then use that metadata to direct logging activity (e.g. it might identify an S3 URL, or a websocket, or whatever).

A significant issue with this, however, is that exporting is not currently set up on a per-notebook-model basis. All of the events from any notebook model go to the same set of export functions. This means that the "NotebookMetadataExporter" function would need to keep track of which notebook model parameters it has seen (e.g. by looking for Open Notebook events) and then match incoming event data against them to determine where to send the telemetry data. The issue is that this list will grow unbounded -- there is no guarantee of a "Close Notebook" event, so the function needs to keep the parameters for every notebook it has seen around forever.
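A rough sketch of the bookkeeping this implies, to make the memory problem concrete. The event fields (`name`, `notebook_path`, `metadata`) and the `telemetry.url` metadata key are assumptions made up for the example, not the real event schema.

```python
from typing import Any, Dict, List, Tuple


class NotebookMetadataExporter:
    """Hypothetical exporter that routes events using metadata seen at open time."""

    def __init__(self) -> None:
        # notebook path -> destination taken from that notebook's metadata.
        # Entries are only ever added: without a reliable close event,
        # there is no safe point at which to remove one.
        self.targets: Dict[str, str] = {}
        # Stand-in for actually sending data: (destination, event) pairs.
        self.sent: List[Tuple[str, Dict[str, Any]]] = []

    def __call__(self, event: Dict[str, Any]) -> None:
        path = event["notebook_path"]
        if event["name"] == "OpenNotebook":
            url = event.get("metadata", {}).get("telemetry", {}).get("url")
            if url is not None:
                self.targets[path] = url
        destination = self.targets.get(path)
        if destination is not None:
            self.sent.append((destination, event))
```

Every distinct notebook opened on the server adds an entry to `targets`, and on a multiuser system that dict is shared across all users for the lifetime of the server process.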

cab938 commented 11 months ago

The only solution which comes to my mind is to move the details of the exporter configuration into the router. The router is aware of the notebook model, because it is tracking it. Then all of the memory issues with an unbounded list sit in the browser, get wiped when the notebooks are closed, and do not affect other users if the system is multiuser.

Discussion? This adds much more complexity to the router. For instance, what do we do if the notebook has this special metadata, but there is no exporter configured in the configuration file to take advantage of it? Ignore it?

I do think there is a security risk in fully trusting the metadata in the notebook -- e.g. a behavior of "read the metadata for the exporter from the notebook and create the exporter automatically" doesn't feel right. Is there another solution? Maybe it does nothing (e.g. telemetry for that notebook is ignored) if there is no associated exporter? Does this mean we need some kind of shared secret between the "NotebookMetadataExporter" and the notebook.ipynb? Or do we consider this "experimental", have no secret, and just warn users when the NotebookMetadataExporter is active?
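One way the shared-secret idea could work, as a sketch only: the notebook metadata carries the exporter config plus an HMAC of that config, and the server-side exporter only honors configs whose signature matches a secret from its own configuration. The `telemetry`, `config`, and `signature` keys are hypothetical, and unsigned or tampered metadata is simply ignored, matching the "does nothing" option above.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, Optional


def sign(config: Dict[str, Any], secret: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the config."""
    payload = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def trusted_config(metadata: Dict[str, Any], secret: bytes) -> Optional[Dict[str, Any]]:
    """Return the exporter config from notebook metadata only if its signature verifies.

    Returns None (telemetry ignored) when the block is missing, malformed,
    or signed with a different secret.
    """
    block = metadata.get("telemetry")
    if not isinstance(block, dict):
        return None
    config = block.get("config")
    signature = block.get("signature")
    if not isinstance(config, dict) or not isinstance(signature, str):
        return None
    expected = sign(config, secret).encode("utf-8")
    # Constant-time comparison to avoid leaking how much of the signature matched.
    if not hmac.compare_digest(expected, signature.encode("utf-8")):
        return None
    return config
```

This only establishes that whoever wrote the metadata knew the secret; it does not address how the secret is distributed to notebook authors, which is still an open question here.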