NASA-IMPACT / veda-jupyterhub

VEDA JupyterHub technical planning and documentation

Should we promote notebooksharing.space in VEDA/GHG Hubs? #33

Open j08lue opened 1 month ago

j08lue commented 1 month ago

Description

We have encountered a couple of cases where users of, or contributors to, VEDA/GHG Center work wanted to share a notebook with collaborators and were looking for a good way to do so.

Examples

The current workflow would be to download the notebook to share or copy its JSON into a GitHub Gist or to https://notebooksharing.space - if users even know about either of these options.

It would be great if it were a lot easier for users to share notebooks from the VEDA/GHG Hub and to discover how to do that from the interface, for example through a JupyterLab extension for notebooksharing.space, if one exists.

We also discussed the need for a place to gather shared user notebooks and present them with some kind of VEDA/GHG Center attribution, if the work was supported by either of these. But that is perhaps material for a separate ticket.

Acceptance Criteria:

yuvipanda commented 1 month ago

Just to note, I assume you mean https://notebooksharing.space/ :)

batpad commented 1 month ago

@j08lue thanks for this ticket! I made a minor correction - notebook.space -> notebooksharing.space .

Linking existing ticket around sharing and discoverability: https://github.com/NASA-IMPACT/veda-jupyterhub/issues/4

I had meant to ticket this earlier, but got lost in some of the technical details, so thank you for surfacing this.

Broadly, yes! I think it is valuable to build a better integration with notebooksharing.space - once we have this working and can validate use-cases, we can consider IF we need to run our own instance.

My suggestions for steps here would be:

I do think adding a "Publish to notebooksharing" menu option or so in the JupyterLab interface would be a great start to make this more discoverable and easier for users to do, and I'd like to ticket that out.

There are also a few additional features on notebooksharing.space that we should work on:

I would advocate continuing to use notebooksharing.space for now and evaluating use-cases and workflows. It is, of course, open source software, so we can decide to run our own instance if we need to for some reason, but I'd carefully evaluate before going down that path. It's also possible that we can do some sort of intermediary solution where we add a "UI skin" with some branding, etc. but still pull in notebooks from notebooksharing.space in an iframe. I think we'll have a better sense of what that path forward should be once we see how people are using this a bit more.

@yuvipanda - if you can point me in the right direction for the JupyterLab extension, I can ticket that out separately and try and schedule that work.

Thanks again for the ticket @j08lue ! 🚀

batpad commented 1 month ago

Some existing, relevant work:

I think my recommendation for work here would be:

We also discussed the need for a place to gather shared user notebooks and present them with some kind of VEDA/GHG Center attribution, if the work was supported by either of these. But that is perhaps material for a separate ticket.

To be able to do ^, I would recommend we first implement the ability to easily embed notebooks on nbss as iframes: https://github.com/yuvipanda/notebooksharing.space/issues/61 - I think it should then be reasonably straightforward to build a viewer with some attribution and additional features + allow users to log in, "manage" and categorize their notebooks, etc. I think if we can manage to get the JupyterLab extension and the iframe embeddability done soon, we can plan toward this.
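As a rough sketch of what such an embed could look like once iframe embedding is supported: the helper below builds the markup for a viewer page. The function name `embed_snippet`, the default height, and the choice of `sandbox="allow-scripts"` are illustrative assumptions, not something nbss provides today; only the `/view/<id>` URL shape comes from existing shared links.

```python
from html import escape
from urllib.parse import quote

# URL shape taken from existing shared links; everything else is hypothetical.
NBSS_VIEW_BASE = "https://notebooksharing.space/view/"


def embed_snippet(notebook_id: str, height: int = 600) -> str:
    """Build iframe markup for a shared notebook (hypothetical helper)."""
    # Percent-encode the id, then HTML-escape the full URL for the attribute.
    src = escape(NBSS_VIEW_BASE + quote(notebook_id, safe=""), quote=True)
    # allow-scripts lets widgets run; omitting allow-same-origin keeps the
    # frame in an opaque origin. Whether that is sufficient is an open question.
    return (
        f'<iframe src="{src}" width="100%" height="{height}" '
        'sandbox="allow-scripts" loading="lazy"></iframe>'
    )


if __name__ == "__main__":
    print(embed_snippet("abc123"))
```

A branded viewer could wrap this markup with attribution text and category metadata stored on its own side.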

wildintellect commented 2 days ago

A few notes from usage this quarter:

Example: https://notebooksharing.space/view/c71d77331b51cb2a321ad582bb58af2be1e258524bf4ee8fe09e16708bf4c7e3#displayOptions=

kylebarron commented 2 days ago

The data encoded in the notebook is not compressed in any way

To clarify, all data stored by Lonboard is compressed as Parquet, but then those compressed bytes are encoded as base64 inside the JSON, which makes it ~33% larger.
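The ~33% figure follows from base64 mapping every 3 input bytes to 4 output characters. A small sketch to confirm it, using random bytes as a stand-in for already-compressed Parquet data (the helper name `base64_overhead` is made up for this example):

```python
import base64
import os


def base64_overhead(raw: bytes) -> float:
    """Return the size ratio of the base64 text to the raw bytes it encodes."""
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw)


if __name__ == "__main__":
    # Random bytes stand in for compressed Parquet: little redundancy left.
    payload = os.urandom(3_000_000)
    ratio = base64_overhead(payload)
    print(f"base64 text is {ratio - 1:.1%} larger than the raw bytes")
```

The ratio does not depend on the content of the bytes, so compressing before encoding still helps; the base64 step just adds a fixed penalty on top.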

kylebarron commented 2 days ago
  • To ensure it's rendered correctly, users currently have to manually run nbconvert before uploading:
    python -m nbconvert GEDI_InteractiveNotebook.ipynb --to ipynb --stdout --execute > GEDI_InteractiveNotebook_executed.ipynb

That's a bug in JupyterLab and tracked in this issue: https://github.com/jupyterlab/jupyterlab/issues/16264
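Until that is fixed, a quick pre-upload check might help. The sketch below assumes the rendering problem shows up as missing widget state in the saved file (an assumption on my part; the linked issue has the actual details) and looks for state under the standard ipywidgets metadata key. The function name `has_widget_state` is hypothetical.

```python
import json

# Metadata key that ipywidgets uses for embedded widget state in a notebook.
WIDGET_STATE_MIME = "application/vnd.jupyter.widget-state+json"


def has_widget_state(notebook: dict) -> bool:
    """Return True if the notebook JSON carries non-empty saved widget state."""
    widgets = notebook.get("metadata", {}).get("widgets", {})
    state = widgets.get(WIDGET_STATE_MIME, {}).get("state", {})
    return bool(state)


def file_has_widget_state(path: str) -> bool:
    """Load a .ipynb file from disk and run the same check."""
    with open(path, encoding="utf-8") as f:
        return has_widget_state(json.load(f))


if __name__ == "__main__":
    saved = {"metadata": {"widgets": {WIDGET_STATE_MIME: {"state": {"m1": {}}}}}}
    print(has_widget_state(saved))             # notebook with saved state
    print(has_widget_state({"metadata": {}}))  # notebook without it
```

If the check returns False for a notebook that uses widgets, running the nbconvert step before uploading is probably still needed.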

yuvipanda commented 2 days ago

Users are very excited about this method of sharing a read-only view of a rendered notebook where interactive widgets work (unlike GitHub).

This makes me so happy, @wildintellect :) I don't have too many stats, but right now there are about 3809 notebooks in total on notebooksharing.space. I don't really have any other usage stats unfortunately.

For auth, the thing to do here would be to simply tie this into jupyterhub itself, and use that as the auth provider. That would mean nothing needs to change regardless of upstream auth work, and it would mean that the nbss client as well as the jupyterlab extension would 'just work' because they'll be able to piggyback on the jupyterhub auth. So we'd have a 'one instance of nbss per jupyterhub', although perhaps we instead want a bigger instance for everyone?
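To make the "piggyback" idea concrete, here is a minimal sketch of how a client running inside a single-user server might ask the Hub who the current user is. It relies on the `JUPYTERHUB_API_URL` and `JUPYTERHUB_API_TOKEN` environment variables that JupyterHub sets for user servers and on the Hub REST API's `/user` endpoint. The helper names are made up, and how nbss would consume the result is purely hypothetical.

```python
import json
import os
import urllib.request


def build_whoami_request(api_url: str, token: str) -> urllib.request.Request:
    """Build a request for the Hub REST API endpoint describing the token's owner."""
    url = api_url.rstrip("/") + "/user"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def whoami(api_url: str, token: str) -> str:
    """Return the Hub username for the token (performs a network call)."""
    with urllib.request.urlopen(build_whoami_request(api_url, token)) as resp:
        return json.load(resp)["name"]


if __name__ == "__main__":
    api_url = os.environ.get("JUPYTERHUB_API_URL")
    token = os.environ.get("JUPYTERHUB_API_TOKEN")
    if api_url and token:
        # Only meaningful when run inside a JupyterHub user server.
        print(whoami(api_url, token))
```

An nbss instance tied to the Hub could, in principle, store the returned name alongside each upload, which would give it a stable identifier to list a user's notebooks by.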

Regardless, I've intentionally not added auth to this right now because of the fact that it runs arbitrary user JS (which is why widgets work). I've sandboxed it a bit with an iframe, but I don't trust my understanding of web security enough to know if that's 'safe'. This is the primary reason GitHub doesn't have interactive widgets I think. @batpad probably has a better understanding of the security situation here.

Regardless, the current setup is fairly stable and runs without issues and I'm happy to keep running it :D I think the next useful thing for us to work on here would be the jupyterlab extension, as well as the jupyterlab bug that requires the commandline execution for lonboard!

wildintellect commented 2 days ago

@yuvipanda but with jupyterhub auth, how would a user go to the site and be able to pull up a list of their posted notebooks over time? So I'm thinking more about a unique userid that can be associated with materials.