jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.57k stars 390 forks source link

Allow building the image without needing to launch it #1647

Closed consideRatio closed 1 year ago

consideRatio commented 1 year ago

(PR written and made by @GeorgianaElena, it was just initialized by @consideRatio to get a branch to reference)

This PR adds a feature that offers the option of an API only mode, and in this mode, it allows opting-out of launching directly after build.

It adds:

flowchart LR
  subgraph API only mode enabled
     direction TB
       id1{{false}}
       id2{{false}}
       id3{{false}}
       id4{{true}}
       id5{{true}}
       id6{{true}}
  end

  subgraph `build_only` query param
     direction TB
     id1 ---|and|id11{{missing}}
     id2 ---|and|id12{{false}}
     id3 ---|and|id13{{true}}
     id4 ---|and|id14{{missing}}
     id5 ---|and|id15{{false}}
     id6 ---|and|id16{{true}}
  end

   subgraph Outcome
     direction TB
     id11 ==> id21{{OK, image will be launched after build}}
     id12 ==> id22{{OK, image will be launched after build}}
     id13 ==> id23{{ERROR, building but not launching is not permitted when API only mode is not enabled}}
     id14 ==> id24{{OK, image will be launched after build}}
     id15 ==> id25{{OK, image will be launched after build}}
     id16 ==> id26{{OK, image will NOT be launched after build}}
  end

It also updates the auth tests for consistency to expect a more descriptive pytest request parameter value in order for the app fixture to load the extra auth configuration. This is for consistency with the build only extra config that was added in this PR.

Todo

Background

This is meant as a building block for the https://github.com/2i2c-org/binderhub-service project. In brief, its positioning itself to be relevant specifically for situations when a binderhub UI is relevant together with a JupyterHub behind authentication and that also provides user home folder storage.

GeorgianaElena commented 1 year ago

I think I understand @yuvipanda. Thanks for the explanation. Can you double check the following points to make sure I got it right, please?

Double check list

  1. When require_no_launch = False, then a mybinder scenario is considered, meaning that the query can be:

    • missing, in which case everything remains unchanged
    • launch=true, in which case everything remains unchanged
    • launch=false, in which case error is returned, saying 'building but not launching is not permitted'
  2. When require_no_launch = True, then a service-like scenario is considered, meaning that the query can be:

    • missing, in which case return an error
    • launch=true, in which case we launch?? (don't sure about this one. Do we still want an and situation herre also?)
    • launch=false, in which case we don't launch

Prior discussions

Currently, the query parameter just overrides the traitlet - so the image gets built and not launched if either the traitlet or the query param is true.

I believe @consideRatio made a point about why this (overriding through the query the traitlet) might be useful, but don't remember what it was 😬 @consideRatio wdyt?

Naming

Thinking more about the naming, I realize that having a traitlet with a negation in its name, and then the query without, it's kind of confusion. Also, checking if no require_no_launch is also confusing I believe.

What do you think about a:

yuvipanda commented 1 year ago

launch=true, in which case we launch?? (don't sure about this one. Do we still want an and situation herre also?)

I think in this case, we return an error - launch=false is required. This allows the binderhub to exist without requiring a JupyterHub it has launch rights on to be configured.

I like your proposed build based naming scheme, we can go with that. Thank you for thinking that through!

GeorgianaElena commented 1 year ago

@yuvipanda, I've updated the PR if you want to take another look

GeorgianaElena commented 1 year ago

Thanks for the feedback @consideRatio ✨ it helped me simplify and correct the logic a bit more while also keeping the old log messages. I've also managed to get an understanding of the current tests and added some for the new functionality.

This is why I marked this as ready for review.

Note that it still lacks documentation but I plan to add that once there's agreement on the implementation.

P.S. thanks @colliand for showing off https://github.com/mermaid-js/mermaid in another PR. This was used to create the diagram in the top comment https://github.com/jupyterhub/binderhub/pull/1647#issue-1638915163 :D

consideRatio commented 1 year ago

Thank you for working this @GeorgianaElena!!! I'll look at this tomorrow/monday morning!

GeorgianaElena commented 1 year ago

After testing this out with @consideRatio yesterday, we concluded that even though we decided that it is out of scope of this PR to support a new UI for the build only option, it is confusing to still have the classical build + launch available.

When clicking the launch button, that will imediatlly fail with:

Screenshot 2023-04-28 at 11 08 02

With the last commit, https://github.com/jupyterhub/binderhub/pull/1647/commits/b82758ee00ace584ab53607806ba01127297fa44, when require_build_only is set, the classical UI is deactivated and instead a 204 No content page is rendered using the error template. It looks like below:

Screenshot 2023-04-28 at 10 41 01

Does something like this makes sense?

GeorgianaElena commented 1 year ago

Failures seem to be happening on the main branch as well https://github.com/jupyterhub/binderhub/actions/runs/4829259002/jobs/8604077929 so not sure which if any are related to the PR

consideRatio commented 1 year ago

@GeorgianaElena I've been debugging this for quite some time now, I'm drawing blank.

Apparently the hub pods launching are stuck pending without k8s Events associated with them. That indicates to me they aren't getting scheduled, not even failing to get scheduled to a node I think - but why?

GeorgianaElena commented 1 year ago

Just added a commit to not register any handlers for / and /v2 when require_build_only is True instead to keep it simple and secure as per discussion with @consideRatio

GeorgianaElena commented 1 year ago

@consideRatio, I believe I've addressed everything in your latest feedback. Thank you!

consideRatio commented 1 year ago

@minrk and @manics I've requested your review for this PR which is done by @GeorgianaElena (in my fork because I wanted a placeholder branch), and reviewed by me and @yuvipanda so far.

We want to request your review to ensure that this is considered reasonable from jupyterhub team-members outside 2i2c. The key use of this feature is to enable binderhub to run in a mode where it is decoupled from jupyterhub, as deployed via https://github.com/2i2c-org/binderhub-service.

consideRatio commented 1 year ago

Thank you for reviewing @minrk!!! :heart: :tada:

I realize that makes the config required for build-only more complex, [...], but there are some advantages to this, because it’s explicit that 3 separate behaviors are changing.

I've come to appreciate these more explicit config choices greatly, I think overall they are easier to implement, document, understand, maintain, and finally build other functional upon.

minrk commented 1 year ago

I think the main distinction for me is if we view the build-only-api-only as an all-together "mode" that's likely to stay as the only alternative to the 'default' UI + always-launch, where one flag does make sense, vs the mutual exclusive states being a potentially temporary limitation.

yuvipanda commented 1 year ago

Here's a demo of this in action, in a profile_list page!

https://github.com/jupyterhub/binderhub/assets/30430/ede5e8a5-b705-4b81-8279-1ad913b0a128

GeorgianaElena commented 1 year ago

@minrk, I've added a few commits based on your feedback above (thank you),. I've implemented a combination of your suggestions above! What do you think about:

  • give the config a name that indicates how much is changing, e.g. build_only_api_only or disable_launch_and_ui
  • accept the only allowed value when unspecified in both cases, rather than requiring opt-in to the only accepted value (I’m less sure about this, especially in more detailed proposal below)
GeorgianaElena commented 1 year ago

Thank you very much @minrk for the review and suggestions. I've added a commit that addresses them.

Some tests seem to be failing now though, but the same are failing for the main branch as well, so I don't think they are related to this PR?

minrk commented 1 year ago

Yes, failures appear to be helm related, not related to this PR (one image pull failure and one memory limit issue). Going to merge this one!