2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

[Request deployment] New Hub: Jack Eddy Symposium #3166

Closed colliand closed 9 months ago

colliand commented 10 months ago

The GitHub handle of the community representative

@colliand

Hub important dates

Hub Authentication Type

GitHub (e.g., @mygithubhandle)

First Hub Administrators

[GitHub Auth only] How would you like to manage your users?

Manually, by adding specific GitHub handles in the JupyterHub Admin panel

[GitHub Teams Auth only] Profile restriction based on team membership

No response

Hub logo image URL

https://cpaess.ucar.edu/sites/default/files/styles/extra_large/public/2023-08/EddySymposium-900x400.jpg?itok=8qG7Dqi3

Hub logo website URL

https://cpaess.ucar.edu/meetings/4th-eddy-cross-disciplinary-symposium

Hub user image GitHub repository

Pangeo (possibly also Heliocloud as another option)

Hub user image tag and name

pending; likely latest pangeo image

Extra features you would like to enable

(Optional) Preferred cloud provider

AWS

(Optional) Preferred cloud region

us-west2

(Optional) Billing and Cloud account

None

Other relevant information to the features above

This hub will be used for the 4th Jack Eddy Symposium. The event will take place October 29 -- November 3, 2023 in Golden, Colorado. This event will focus on open science, sun-climate interactions, space weather, and exoplanets. Talks will take place in the morning with hackathon-style activities during the afternoons. @colliand will be in attendance at the event. Funding for this hub and the Symposium flows from NASA's Living with a Star Program and UCAR/CPAESS.

Tasks to deploy the hub

damianavila commented 9 months ago

I assigned @consideRatio to deploy this hub. I would recommend deploying it close to the end of the iteration (first days of October).

Previous Jack Eddy hub:

@colliand, please provide any further information we might need to deploy this in the next few days. For example:

  1. shared or dedicated cluster?
  2. is there any reason to move to AWS now, given that the previous deployment lives on GCP (see links referenced above)?

@consideRatio, feel free to ask @colliand (and the other community representatives) any additional questions you need answered to carry out this deployment.

colliand commented 9 months ago

I think GCP is fine. The Symposium is an opportunity for 2i2c to experiment with new stuff. There will be communities working on space weather, climate, and exoplanets. Allowing these groups to customize environments may be helpful but is not necessary. I'm available for questions @consideRatio!

colliand commented 9 months ago

Is there progress or are there any blockers on this deployment request?

damianavila commented 9 months ago

No particular blockers, although other high-priority operational issues took precedence. @consideRatio, can you please make it a priority to deploy this early next week? Thanks!

damianavila commented 9 months ago

Reassigned to @GeorgianaElena to deploy during this cycle (as planned above).

consideRatio commented 9 months ago

Having reflected a bit on this already, here are some details and planning I had done before it was handed over to @GeorgianaElena. Feel free to deviate, but I hope it can help reduce the work a bit.

Use the shared 2i2c-aws-us cluster, deploy a daskhub, and use the domain name eddy.aws.2i2c.cloud. The full name Jack Eddy isn't used on the homepage https://cpaess.ucar.edu/meetings/4th-eddy-cross-disciplinary-symposium, so it's probably fine to just go with eddy.

  custom:
    2i2c:
      add_staff_user_ids_to_admin_users: true
      add_staff_user_ids_of_type: "github"
    homepage:
      templateVars:
        org:
          name: Jack Eddy Symposium
          url: https://cpaess.ucar.edu/meetings/4th-eddy-cross-disciplinary-symposium
          logo_url: https://cpaess.ucar.edu/sites/default/files/styles/extra_large/public/2023-08/EddySymposium-900x400.jpg?itok=8qG7Dqi3
        designed_by:
          name: 2i2c
          url: https://2i2c.org
        operated_by:
          name: 2i2c
          url: https://2i2c.org
        funded_by:
          name: ""
          url: ""
          custom_html: <a href="https://science.nasa.gov/heliophysics/programs/living-with-a-star/">NASA's Living with a Star program</a> and <a href="https://cpaess.ucar.edu/">UCAR/CPAESS</a>
  hub:
    config:
      JupyterHub:
        authenticator_class: cilogon
      CILogonOAuthenticator:
        oauth_callback_url: https://eddy.aws.2i2c.cloud/hub/oauth_callback
        allowed_idps:
          http://github.com/login/oauth/authorize:
            username_derivation:
              username_claim: preferred_username
      OAuthenticator:
        # WARNING: Don't use allow_existing_users with config to allow an
        #          externally managed group of users, such as
        #          GitHubOAuthenticator.allowed_organizations, as it breaks a
        #          common expectation for an admin user.
        #
        #          The broken expectation is that removing a user from the
        #          externally managed group implies that the user won't have
        #          access any more. In practice the user will still have
        #          access if it had logged in once before, as it then exists
        #          in JupyterHub's database of users.
        #
        allow_existing_users: True
      Authenticator:
        # WARNING: Removing a user from admin_users or allowed_users doesn't
        #          revoke admin status or access.
        #
        #          OAuthenticator.allow_existing_users allows any user in the
        #          JupyterHub database of users to log in. This includes
        #          any previously logged in user or user previously listed in
        #          allowed_users or admin_users, as such users are added to
        #          JupyterHub's database on startup.
        #
        #          To revoke admin status or access for a user when
        #          allow_existing_users is enabled, first remove the user from
        #          admin_users or allowed_users, then deploy the change, and
        #          finally revoke the admin status or delete the user via the
        #          /hub/admin panel.
        #
        admin_users:
          - dan800 # Dan Marsh
          - rmcgranaghan # Ryan McGranaghan
  singleuser:
    profileList:
      # image options:
      #   - pangeo/pangeo-notebook (https://github.com/pangeo-data/pangeo-docker-images)
      #   - (?) panhelio/helio-notebook (https://git.smce.nasa.gov/heliocloud/heliocloud-docker-images, non-archived non-public location at https://gitlab.smce.nasa.gov/heliocloud/runtimes), these images may not be public, more information is required about this if they are to be used
      #   - Also unlisted_choice for free form image choice
      # server options:
      #     resource allocation choices from an r5.4xlarge node, excluding the
      #     r5.xlarge node to help reduce startup times. This hub seems to be
      #     set up for an event from Oct 29 to Nov 3, 2023, and with workloads
      #     in this space there's a good chance that they are going to use
      #     notable amounts of RAM. To avoid putting only two users requesting
      #     ~16 GB on a ~32 GB node, we should instead exclude the r5.xlarge
      #     option and put such requests on an r5.4xlarge node as well, where
      #     at least eight users fit per node.
      #
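The commented plan above stops before listing any actual profiles. As a rough illustration only, it could translate into a `profileList` along the following lines. This is a sketch written against the AWS plan in this comment, not the configuration that was deployed. The keys are KubeSpawner `profile_list` fields as passed through `singleuser.profileList`; the image tag, option names, display names, validation regex, and resource numbers are all placeholder assumptions.

```yaml
singleuser:
  profileList:
    - display_name: Pangeo notebook            # placeholder name
      description: Illustrative entry; names and numbers are placeholders
      default: true
      profile_options:
        image:
          display_name: Image
          # Free-form image choice, as mentioned in the plan
          unlisted_choice:
            enabled: true
            display_name: Custom image
            validation_regex: "^.+:.+$"         # assumed form <image-name>:<tag>
            validation_message: "Must be of the form <image-name>:<tag>"
            kubespawner_override:
              image: "{value}"
          choices:
            pangeo:
              display_name: Pangeo notebook
              default: true
              kubespawner_override:
                # Tag is a placeholder; a dated tag would normally be pinned
                image: pangeo/pangeo-notebook:latest
        requests:
          display_name: Resource allocation
          choices:
            mem_16:
              display_name: ~16 GB RAM, ~2 CPU
              default: true
              kubespawner_override:
                # Placeholder numbers: an r5.4xlarge has 16 vCPU and 128 GiB,
                # so eight users get at most ~16 GiB and ~2 CPU each; the
                # guarantees sit below that to leave room for system pods.
                mem_guarantee: 14G
                mem_limit: 16G
                cpu_guarantee: 1.5
                cpu_limit: 2
                node_selector:
                  node.kubernetes.io/instance-type: r5.4xlarge
```

Pinning the ~16 GB choice to the larger instance type through `node_selector` is what lets eight such users share one node instead of two per r5.xlarge, at the cost of a bigger node being started for the first user.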

GeorgianaElena commented 9 months ago

Thank you very much @consideRatio for this plan! It's really useful <3

I am confused about the cloud provider to go with though.

Is it AWS or is it GCP? The previous deployment (https://github.com/2i2c-org/infrastructure/issues/1329) seems to have been deployed on GCP because that is where the data was. @colliand do we have information about where the data lives? Otherwise, I was thinking of defaulting to GCP, like the previous one.

colliand commented 9 months ago

GCP is fine. Thanks @GeorgianaElena. Proximate access to data like Pangeo will be useful.

GeorgianaElena commented 9 months ago

Update

There is a hub now running at https://jackeddy.2i2c.cloud that people can try out.

It has the following features enabled:

colliand commented 9 months ago

Thanks @GeorgianaElena and everyone else who contributed to this on-time deployment!

GeorgianaElena commented 9 months ago

I will close this issue now that the hub is ready. Let's continue the discussion on Freshdesk for any improvement requests.

Please feel free to reopen if you disagree.