Enable "Open in Notebook" for datasets on VEDA

batpad commented 5 months ago

For datasets in VEDA, we should have a button to "Open dataset in notebook" - this would open a notebook on the VEDA hub with basic starter code to read the dataset and visualize it on a map.

This would likely be roughly similar to Microsoft Planetary Computer's "Example Notebook" feature per dataset: https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2#Example-Notebook

We would need to figure out what good starting point code is for different dataset types, and how we want to do the basic visualization on a map.

We would need to think about how to implement this - from conversation with @yuvipanda - this is possibly best implemented as a Jupyter Server Extension, with rough architecture that looks something like:

We create a Jupyter Server Extension that will accept some query parameters with details of the dataset, and then use some templating mechanism to generate the .ipynb file and then open it in the Hub.
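As a rough illustration of the templating step only (not the extension itself), here is a minimal sketch that fills a starter cell from two values and wraps it in a notebook document. It uses only the standard library and assumes a single hard-coded cell template; the names `CELL_TEMPLATE` and `render_notebook`, and the `collection` / `stac_api` parameters, are hypothetical and not taken from any existing VEDA code.

```python
import json
from string import Template

# Hypothetical starter cell; $collection and $stac_api are filled per request.
CELL_TEMPLATE = Template(
    "collection = $collection\n"
    "stac_api = $stac_api\n"
    "print(f'Reading {collection} from {stac_api}')\n"
)


def render_notebook(collection: str, stac_api: str) -> str:
    """Return the JSON text of a minimal nbformat 4 notebook."""
    # json.dumps turns each value into a quoted string literal, so the value
    # cannot break out of the assignment it is substituted into.
    source = CELL_TEMPLATE.substitute(
        collection=json.dumps(collection),
        stac_api=json.dumps(stac_api),
    )
    notebook = {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            {
                "cell_type": "markdown",
                "metadata": {},
                "source": f"# Starter notebook for {collection}",
            },
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": source,
            },
        ],
    }
    return json.dumps(notebook, indent=1)
```

A real implementation would more likely load templates from files and might use a fuller templating engine; the point here is only that the output is plain JSON that the Hub can open as a notebook.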

There are likely a few different things to figure out here, and we should likely start with the desired user experience around this.

Currently, we do have a mechanism for users to manually create notebooks and associate them with datasets. We want to continue to give users the ability to do this, but also want a robust mechanism to generate "default" notebooks for all datasets without requiring manual intervention.

Not sure who to reach out to to better understand the current mechanics in the VEDA UI to link notebooks, and what the best way is to sketch out what would be appropriate example starter notebooks for the various types of datasets.

cc @wildintellect @aboydnw @yuvipanda

wildintellect commented 5 months ago

All of the MS notebooks are written by hand. It's a criterion for having the collection added to STAC. I know I wrote at least one. That said, on Landsat look we did a STAC template for Python that could inject params, so with common data types this might be possible by relying on STAC information.
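One way "relying on STAC information" could work is to pick a starter template from the media types of an item's assets. The sketch below is only an assumption about how that selection might look: the template file names and the `pick_template` helper are hypothetical, and the media-type strings are the ones commonly used for COG, Zarr and NetCDF assets, which a given catalog may or may not follow.

```python
# Hypothetical mapping from asset media type to a starter-template name.
TEMPLATE_BY_MEDIA_TYPE = {
    "image/tiff; application=geotiff; profile=cloud-optimized": "raster_cog.ipynb",
    "application/vnd+zarr": "zarr_cube.ipynb",
    "application/x-netcdf": "netcdf.ipynb",
}
DEFAULT_TEMPLATE = "generic_stac.ipynb"


def pick_template(item: dict) -> str:
    """Choose a template from the media types of a STAC item's assets."""
    for asset in item.get("assets", {}).values():
        media_type = asset.get("type", "").strip().lower()
        if media_type in TEMPLATE_BY_MEDIA_TYPE:
            # The first asset with a known media type decides the template.
            return TEMPLATE_BY_MEDIA_TYPE[media_type]
    # Nothing recognised: fall back to a template that only lists the assets.
    return DEFAULT_TEMPLATE
```

Anything that does not match falls through to a generic template, so hand-written notebooks could still override the default per dataset.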

aboydnw commented 5 months ago

Is this similar to the GHG Center functionality, except it would drop them right into a notebook for the dataset they were viewing?

image (7)

As for the actual design of the connection, my guess is @faustoperez and/or @j08lue would be best for that

wildintellect commented 5 months ago

I forgot to mention the STAC render extension, which the front end is already moving towards, plus stac_ipyleaflet.

j08lue commented 5 months ago

> Not sure who to reach out to to better understand the current mechanics in the VEDA UI to link notebooks, and what the best way is to sketch out what would be appropriate example starter notebooks for the various types of datasets.

Yeah, @faustoperez, @wildintellect and I can speak to existing user flows and meaningful improvements. We've had a couple of tickets about this bridge from UI to JupyterHub, but not gotten beyond the nbgitpuller links to static, hand-crafted notebooks yet.
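For reference, the static nbgitpuller bridge boils down to building a link like the one in this sketch. It assumes the standard nbgitpuller `git-pull` link format and that the repository is cloned into a directory named after the repo; the hub URL, repository and notebook path used with it are placeholders, and `nbgitpuller_link` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url: str, repo_url: str, branch: str, notebook_path: str) -> str:
    """Build a link that pulls a repo into the user's server and opens one notebook."""
    # Assumption: the clone directory is the last path segment of the repo URL.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    query = urlencode(
        {
            "repo": repo_url,
            "branch": branch,
            # Open the notebook in JupyterLab once the pull has finished.
            "urlpath": f"lab/tree/{repo_name}/{notebook_path}",
        }
    )
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"
```

The link only identifies a file in a repository, which is why this bridge cannot carry any UI state such as the current bounding box.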

As you can see from the first ticket, we have some stubs for integrating (Python, R) code snippet generation in STAC Browser, too.

https://github.com/radiantearth/stac-browser/pull/381

But none of these bridges are dynamic, i.e. none of them let users really pick up from where they left off. Would be great to work on a possible case for that.

Please pull us in any time if you want to bounce ideas around.

j08lue commented 5 months ago

A JupyterHub extension that loads arbitrary code (passed as URL or base64 string), actually developed for VEDA a good while ago, is this one: https://github.com/Navteca/jupyterlab-pasarela

It provides the JupyterHub receiving end, but what we needed was a templated code generator that would constrain what code actually gets passed into JupyterHub.

batpad commented 5 months ago

I think the idea for this would be roughly:

Build a Jupyter Server Extension that will provide a URL endpoint that accepts query parameters to:
- Specify a URL to a template file, likely restricted to pre-defined templates we create in a repository. This would be a template to generate a .ipynb file.
- Specify values to fill into the template: a bounding box, particular scenes to load, or whatever else is appropriate to fill in dynamically with values supplied by the VEDA UI.

On the frontend, allow the user to browse the map, operate filters, etc., and then construct the query parameters required to call the backend URL.
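To make the outline above a bit more concrete, here is a minimal sketch of the validation such an endpoint might do before any templating happens. It is deliberately simplified: it accepts a template name checked against an allowlist rather than a template URL, and the parameter names (`template`, `collection`, `bbox`), the allowlist contents and the `parse_request` helper are all hypothetical.

```python
from urllib.parse import parse_qs

# Hypothetical allowlist: template names this endpoint is willing to render.
ALLOWED_TEMPLATES = {"raster_cog", "zarr_cube"}


def parse_request(query_string: str) -> dict:
    """Validate the query string and return typed template values."""
    params = parse_qs(query_string)

    template = params.get("template", [""])[0]
    if template not in ALLOWED_TEMPLATES:
        raise ValueError(f"unknown template: {template!r}")

    collection = params.get("collection", [""])[0]
    if not collection:
        raise ValueError("collection is required")

    values = {"template": template, "collection": collection}

    if "bbox" in params:
        parts = params["bbox"][0].split(",")
        if len(parts) != 4:
            raise ValueError("bbox needs four comma-separated numbers")
        # float() raises ValueError on anything that is not a number.
        west, south, east, north = (float(p) for p in parts)
        in_range = (
            -180 <= west <= 180
            and -180 <= east <= 180
            and -90 <= south <= north <= 90
        )
        if not in_range:
            raise ValueError("bbox is out of range")
        values["bbox"] = [west, south, east, north]

    return values
```

With this shape, the frontend only has to serialise its current state into a query string such as `template=raster_cog&collection=no2-monthly&bbox=-10,35,5,45`, and new use cases mean adding a template and a few more validated parameters.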

This is likely a slightly naive outline of the implementation, especially on the backend templating bits and handling security, but I imagine this is roughly how it would work. So, the backend implementation can be fairly agnostic of what either these templates or values to be filled in are, and the requirements there can evolve based on user needs, and would then just require someone to create additional template files, and pass in additional query parameters from the frontend to enable new use-cases.

Will go through existing plugins and extensions and see if there are things we can re-use. A lot of the work here will be on the frontend, figuring out constructing these query parameters based on user actions / selections.

@j08lue who would be good to chat with / coordinate this with on the VEDA UI side?

yuvipanda commented 5 months ago

If you can have 3 separate notebooks that are meaningfully useful to users for exploring 3 different datasets, we can then look at them to understand what kind of templating would be helpful.

j08lue commented 5 months ago

> who would be good to chat with / coordinate this with on the VEDA UI side?

Please just ping the whole team and they can decide who should be part of more closely defining this feature. Could start with @faustoperez in terms of UI/UX design, but since this will require a tight integration with the UI state etc, @sandrahoang686 and @hanbyul-here also need to be involved.

NASA-IMPACT / veda-jupyterhub

Enable "Open in Notebook" for datasets on VEDA #3