jupyter-book / mystmd

Command line tools for working with MyST Markdown.
https://mystmd.org/guide
MIT License

Interactive kernels improvements with Binder and JupyterHub #1235

Open choldgraf opened 6 months ago

choldgraf commented 6 months ago

This is a tracking issue for a number of improvements we'd like to make around the "interact" or "launch" button functionality in MyST. Some parts of it are part of the MVP for Jupyter Book, while others are future improvements to extend functionality. See the child issues for the up to date list.

agoose77 commented 6 months ago

I was just looking at this today after speaking with James @ 2i2c. I think this is probably a theme feature, but I'll update as I figure out more.

choldgraf commented 6 months ago

I considered opening this in the theme as well. The reason I opened it here is because I thought there might need to be some central metadata that the engine exposes, and then themes could use that (eg a way to resolve content paths to binder or jupyterhub URLs, that kinda thing). I agree all the UI and UX would be in a theme
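
To make the "resolve content paths to Binder URLs" idea concrete, here is a minimal sketch of what such a resolver could look like. The function name and its parameters are hypothetical and not part of mystmd; the only thing it relies on is the public Binder URL layout for GitHub repositories (`/v2/gh/<org>/<repo>/<ref>`) and the `labpath` query parameter for opening a file in JupyterLab.

```python
from urllib.parse import quote, urlencode


def binder_url(org: str, repo: str, ref: str, content_path: str,
               binder_base: str = "https://mybinder.org") -> str:
    """Map a content path in a GitHub repo to a Binder launch URL.

    Hypothetical helper for illustration; not an existing mystmd API.
    """
    # `labpath` asks Binder to open this file in JupyterLab after launch.
    query = urlencode({"labpath": content_path})
    # Refs such as "feature/x" must have their slash percent-encoded.
    location = f"{quote(org)}/{quote(repo)}/{quote(ref, safe='')}"
    return f"{binder_base.rstrip('/')}/v2/gh/{location}?{query}"


print(binder_url("org", "repo", "main", "notebooks/intro.ipynb"))
```

A theme could call something like this with the page's source path to build the launch button link, while the engine stays responsible for knowing the repository and ref.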

choldgraf commented 5 months ago

UX workflow brainstorm

We had a short brainstorm on this with @jnywong and @gman0909 and I wanted to share notes:

For that reason, a few suggestions for implementing this now:

stevejpurves commented 5 months ago

Most of the plumbing for this certainly already exists within the current themes. Notebooks currently have a launch button that brings up the Jupyter server and does the routing for that notebook. It would be pretty easy to make that work on any page without having to rethink the theme.

a fully interactive version of their book in a Jupyter session

@choldgraf what do you mean by "session" here? I think you mean a single-user Jupyter server instance, rather than multiple notebooks sharing "kernel sessions"?

Put the "launch button(s)" on every page, regardless of whether the content is computational.

I would qualify that with: only if project: jupyter: true | { ... } is included.

agoose77 commented 5 months ago

@choldgraf I touched upon some of these themes very recently with @jmunroe and @rowanc1.

As I see it, there are two "interact with this book" scenarios:

  1. Stay on the MyST site, execute cells using Thebe
  2. Leave the MyST site, enter into a dedicated compute environment

Right now, Jupyter Book supports both, and MyST will easily achieve parity.

I think there is more that we can do here...:

Swappable Compute

Something that JB can't do right now is permit a random user to use someone else's compute infrastructure. I think that is something we should look into. This would mean extending (1) and (2) to add support for specifying a compute resource (e.g. enter a URL of a BinderHub). In that way, we can delegate the compute infrastructure to someone else (and allow users to recover if e.g. their dedicated Hub is down, but MyBinder is up).
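
As a rough sketch of the fallback part of this idea: given an ordered list of compute providers (the reader's own Hub first, MyBinder last), pick the first one that responds. Everything here is hypothetical, including the function name and the `is_available` callback, which stands in for whatever health check a real implementation would use.

```python
from typing import Callable, Iterable, Optional


def pick_compute(candidates: Iterable[str],
                 is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first compute provider URL that reports as available.

    `is_available` is an assumed, caller-supplied health check.
    """
    for base_url in candidates:
        if is_available(base_url):
            return base_url
    # No provider is reachable; the caller decides how to tell the reader.
    return None


# Example: the dedicated Hub is down, so MyBinder is chosen instead.
providers = ["https://hub.example.org", "https://mybinder.org"]
print(pick_compute(providers, lambda url: url == "https://mybinder.org"))
```

Keeping the health check as a parameter means the selection logic does not need to know whether a provider is a BinderHub or a JupyterHub.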

Persistent Thebe Sessions

Presently, Thebe operates at the page-level; loading compute on another page requires starting a new Binder instance. It would be nice if navigating between pages doesn't kill compute immediately. I suspect this is a reasonable amount of work to change (cc @stevejpurves)?

I really think that "Persistent Thebe Sessions" are what the user really wants in order to improve the experience with MyST sites. But given the complexity in delivering that, it might be easy enough to start with an improved deploy-on-Hub experience. I think one way to do this would be to spawn a local MyST server on the Hub that then has "local" compute for Thebe. This would provide a "preview" view vs the "edit" view that JupyterLab + JupyterLab-MyST facilitates.

rowanc1 commented 5 months ago

A quick note on thebe: the static HTML export will currently kill the connection on page nav. It is not a SPA, so a page refresh kills the connection, although it is fast to reconnect because the connection information is stored in local storage. If hosted in vercel/curvenote/locally, the compute stays alive by default as you navigate around, and things get reconnected when you go back to a notebook that you have executed.

stevejpurves commented 5 months ago

Thanks @agoose77 for the summary, there are some misconceptions in there though. Here are some comments

  1. Leave the MyST site, enter into a dedicated compute environment

repeating myself from a previous comment (sorry) but this is already implemented in one place https://github.com/executablebooks/myst-theme/blob/792f9f5c189f56478c321fe6d0a1ea36d60cc93a/packages/jupyter/src/controls/NotebookToolbar.tsx#L101

ok - in the following I am not speaking from the perspective of static sites.

Presently, Thebe operates at the page-level; loading compute on another page requires starting a new Binder instance.

Thebe Sessions are already persistent by default (this can be disabled by turning session saving off), and you can navigate between pages without requesting a new server from binder.

Compute on each page will use the same single-user jupyter server that was first launched. If a new connection is required at any point because of an unexpected refresh, then thebe will reconnect to the existing server using details from local storage.
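
The reconnect behaviour described above can be sketched roughly as follows. This is not thebe's actual code (thebe is JavaScript and uses the browser's local storage); the storage key, the callbacks, and the liveness check are all assumptions made for illustration, with a plain dict standing in for local storage.

```python
import json
from typing import Callable, MutableMapping

STORAGE_KEY = "saved-jupyter-server"  # hypothetical key name


def get_server(storage: MutableMapping[str, str],
               request_new_server: Callable[[], dict],
               is_alive: Callable[[dict], bool]) -> dict:
    """Reuse saved server details when possible, else request a new server."""
    saved = storage.get(STORAGE_KEY)
    if saved is not None:
        details = json.loads(saved)
        # Assumed check: only reuse details for a server that still responds.
        if is_alive(details):
            return details
    # Nothing usable was saved, so ask the provider (e.g. Binder) for a server
    # and remember its details for later pages and refreshes.
    details = request_new_server()
    storage[STORAGE_KEY] = json.dumps(details)
    return details


storage = {}
make_server = lambda: {"url": "https://hub.example.org/user/abc", "token": "t"}
first = get_server(storage, make_server, lambda d: True)
second = get_server(storage, make_server, lambda d: True)
print(first == second)
```

The second call returns the saved details without requesting another server, which is the property that lets every page share the single-user server that was first launched.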

You can see this in action by starting a session and, from a notebook, pressing power, then Launch. Keep that jupyterlab instance open on the sessions tab. While you navigate elsewhere in the site and run notebooks or execute figures, you will see new sessions appear on the same server. JupyterLab has a kernel-session-notebook relationship and myst-theme honors this.

If you are not seeing this behaviour on a fully/properly deployed myst-theme then that's probably a bug as it should support it. (just tested that here: https://stevejpurves-simple.curve.space/ it works 🤘)

If you are not seeing this behaviour in a static site built from myst-theme, then that is probably due to the limitations that @rowanc1 pointed out (a point to add to https://github.com/executablebooks/mystmd/pull/1262 maybe).

However, the new/refreshed page should still reconnect to the same server using saved settings, and if you can get on the jupyterlab interface for that server, you'll see that's the case with new sessions appearing for new notebooks started/run.

Something that JB can't do right now is permit a random user to use someone else's compute infrastructure.

We've talked about this before at Curvenote and I love the idea of a user/reader/visitor being able to add their compute details on the page too!

stevejpurves commented 5 months ago

@agoose77 I got focussed on the "new binder instance" parts of what you were saying and missed the importance of this sentence:

It would be nice if navigating between pages doesn't kill compute immediately.

so two more things:

(1) I don't think the compute is killed, more that the UI does not have a way to re-establish the correct state. I have a proposal for state management upgrade in my head that would see this improved a lot and opens up a lot of issues for improved state management both in SPA and maybe even with the downgraded static exports.

(2) check out this behaviour:

  1. visit https://stevejpurves-simple.curve.space/
  2. click power on the first figure, then run
  3. once that is done, navigate to "The Notebook"
  4. click the power button
  5. Navigate back to the index page and we've lost the UI state, it looks like nothing has computed and there is no connection :(

ok next

  1. refresh the page
  2. visit "The Notebook", click on power and run it....
  3. navigate to "The Matplotlib Notebook", click on power and run it...
  4. navigate between those two notebooks and 😍 UI state is preserved 🤘

So this is because all the other pages, except the index page, are on the same route, but when we visit the index page that's akin to a refresh. Revamping the state management is 1+ weeks for me/someone-with-knowledge-of-current-myst-theme-compute-internals, maybe a few weeks end to end with review, testing, etc.

but we might be able to fix the route issue in (2) relatively quickly (~1 day), I've done it elsewhere when using thebe before iirc. I'll open an issue.

choldgraf commented 5 months ago

Hey all - could we keep discussions here focused on the "leaving the book and going to an interactive session in a JupyterHub / Binder"? I feel that this is a very different use-case from Thebe, and is what I was mostly thinking about when I opened this issue. In Jupyter Book, the majority of the "launch buttons" are ones that take you to the Binder or JupyterHub interface, not thebe.

stevejpurves commented 5 months ago

@choldgraf reading up on nbgitpuller, it seems like that is oriented at getting an entire git repo onto the hub? So I am a bit confused by:

Instead most readers want a "fully interactive version of their book in a Jupyter session" where they can continue to navigate pages without going back to the static book.

As the repo would (potentially) have all notebooks used to create the book. Unless jb1 is doing something subtly different with nbgitpuller that means only a single file is sent over (and potentially this file is sent from the static site, rather than from github)? Wondering if you know off the top of your head, and what would be a good public book (with public hub) to look at to understand this better?

kafitzgerald commented 5 months ago

Apologies if this is duplicative and/or the wrong location, but I think this is what we're seeing in working through a demo with some of the Project Pythia content.

For instance, with the older JupyterBook content (example) we launch to a JupyterHub with the notebook of interest open, but with the newer MyST-MD version (corresponding example) the JupyterHub opens at the top level dir of the repository. From there you need to navigate to the notebook you launched from. It'd be great if the launch button had behavior similar to the older version of JupyterBook.
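
For reference, the difference between the two behaviours can be expressed in an nbgitpuller-style link: the whole repository is pulled either way, and the `urlpath` parameter decides what opens afterwards. The sketch below is an illustration under assumptions, not what Jupyter Book or mystmd do internally. The function name, hub URL, and repository are hypothetical; the `/hub/user-redirect/git-pull` endpoint and the `repo`, `branch`, and `urlpath` parameters follow the documented nbgitpuller link format.

```python
from typing import Optional
from urllib.parse import urlencode


def nbgitpuller_link(hub_url: str, repo_url: str, branch: str,
                     notebook_path: Optional[str] = None) -> str:
    """Build a JupyterHub link that pulls a repo and opens JupyterLab.

    Hypothetical helper for illustration only.
    """
    # nbgitpuller clones into a directory named after the repository.
    repo_dir = repo_url.rstrip("/").split("/")[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[: -len(".git")]
    # Without a notebook path, JupyterLab opens at the top of the repo.
    target = repo_dir if notebook_path is None else f"{repo_dir}/{notebook_path}"
    query = urlencode({
        "repo": repo_url,
        "branch": branch,
        "urlpath": f"lab/tree/{target}",
    })
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


print(nbgitpuller_link("https://hub.example.org",
                       "https://github.com/org/book", "main",
                       "notebooks/intro.ipynb"))
```

Passing the notebook's path gives the "open the notebook of interest" behaviour, while omitting it lands the reader at the top-level directory of the repository.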

choldgraf commented 6 days ago

This conversation went in a few different directions, and I think it also moved beyond the basic launch functionality I was describing at the top. I read through the comments and created three child issues that I think are non-overlapping and can be addressed separately.

I'll move the relevant content from the issue body into each, and we can take discussion over there to finish them. I'm going to remove this one from the MVP board, since it's now more like a high-level initiative to improve a few aspects of launches (I'd be fine closing it too if people want). I'm going to add the "basic JupyterHub/Binder launch button functionality" to the MVP because that'll be a regression if we don't address it.

I also want to point out a discrepancy in vocabulary here that we should resolve. I've described what I think is going on here: