2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

[Request deployment] New Hub -- Indian Ocean Hackweek 2023 #2821

Closed colliand closed 1 year ago

colliand commented 1 year ago

The GitHub handle of the community representative

eeholmes (EDIT: fixed typo)

Hub important dates

Hub Authentication Type

GitHub (e.g., @mygithubhandle)

First Hub Administrators

@eeholmes

[GitHub Auth only] How would you like to manage your users?

Allowing members of a specific GitHub organization (EDIT: specific team, see below)

[GitHub Teams Auth only] Profile restriction based on team membership

EDIT: https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023

Hub logo image URL

EDIT: https://user-images.githubusercontent.com/2545978/253672085-ec5ca6fb-147b-4fcd-87f1-431108b62558.png

Looks like:

Hub logo website URL

https://incois.gov.in/ITCOocean/index.jsp

Hub user image GitHub repository

pending

Hub user image tag and name

pending

Extra features you would like to enable

(Optional) Preferred cloud provider

AWS

(Optional) Preferred cloud region

us-west-2 (shared 2i2c cluster: 2i2c-aws-us)

(Optional) Billing and Cloud account

None

Other relevant information to the features above

Research hub on a shared cluster (likely hosted in AWS us-west-2). Additional details are being gathered through async g-doc running notes with @eeholmes. Thanks in advance to @jmunroe for assistance with this while @colliand is away on vacation.

This hub will run from early August through mid-October.

When filling in operated by, funded by, etc fields (https://github.com/2i2c-org/infrastructure/issues/2821#issuecomment-1636388679):

"Openscapes" should be "ITCOocean" with link to https://incois.gov.in/ITCOocean/index.jsp

Funded by: Should be "ITCOocean Hackweek Team" with no url (EDIT: Should be ESIP instead)

Tasks to deploy the hub

github-actions[bot] commented 1 year ago

Hey @eeholme and @colliand! 👋 I noticed there is still pending information about the new hub deployment. Can you please help us fill it in?

The information pieces still missing are:
- hub logo image url
- hub logo website url
- hub user image github repository
- hub user image tag and name
- extra features you would like to enable

Details about each of them can be found in the top comment. But if you have questions about any of them, please ping the 2i2c/engineering team and they will help you.

After the form in the top comment is filled in, an engineer will be assigned and will start deploying the new hub 🚀. Thank you!

eeholmes commented 1 year ago

A few corrections

The GitHub handle of the community representative

eeholmes was missing "s"

Hub important dates

  • Target Start Date: 2023-08-01 (likely not possible until a bit later given current work load)
  • Required Start Date: 2023-08-08
  • Important dates for usage: 2023-09-10 through 2023-09-22. This is when the Hackathon will take place.

Hub Authentication Type

GitHub (e.g., @MyGitHubHandle)

First Hub Administrators

@eeholmes

Not all GitHub organization members. I would like hub access restricted to this team on the organization https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023

[GitHub Auth only] How would you like to manage your users?

Allowing members of a specific GitHub organization (correction: only members of the itcoocean-hackweek-2023 team)

[GitHub Teams Auth only] Profile restriction based on team membership

Yes, this team https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023

Hub logo image URL

incois-logo

Hub logo website URL

https://incois.gov.in/ITCOocean/index.jsp

Hub user image GitHub repository

pending

Hub user image tag and name

pending

Extra features you would like to enable

  • [ ] Dedicated Kubernetes cluster
  • [ ] Scalable Dask Cluster

(Optional) Preferred cloud provider

None

(Optional) Preferred cloud region

us-west-2

(Optional) Billing and Cloud account

None

Other relevant information to the features above

Research hub on a shared cluster (likely hosted in AWS us-west-2). Additional details are being gathered through async g-doc running notes with @eeholmes. Thanks in advance to @jmunroe for assistance with this while @colliand is away on vacation.

This hub will run from early August through mid-October.

Tasks to deploy the hub

  • [ ] 1. Deploy information filled in above
  • [ ] 2. Engineer who will deploy the hub is assigned
  • [ ] 3. If using GitHub Orgs/Teams Auth, Engineer is given Owner rights to the org to set this up.
  • [ ] 4. Initial Hub deployment PR
  • [ ] 5. Administrators able to log on -> Hub now in steady-state
eeholmes commented 1 year ago

Re landing page for hub. If it will look similar to the Openscapes hub,

image

"Openscapes" should be "ITCOocean" with link to https://incois.gov.in/ITCOocean/index.jsp

Funded by: Should be "ITCOocean Hackweek Team" with no url

colliand commented 1 year ago

Thanks @eeholmes!

eeholmes commented 1 year ago

@colliand I am working on the hub image questions. I need to update our Python image and build our R image.

eeholmes commented 1 year ago

Hub user images

There are four: two for Python and two for R, on DockerHub.

Note: I think the Geospatial R with SDM image has a problem, as my kernel is not spinning up when I select that one, but I will fix that later.

Here is the current config file for our daskhub that my team is using during development of the course content. This is a bit of a hack but is mostly doing what we want, which is allowing users to pick different images. Note the machine sizes are not how we want it so ignore that part. Something more like what Openscapes has or whatever you think is best. Currently we keep maxing out the memory when working with big data sets.

Regarding the extra storage. We have a 1 TB shared drive where we have the data and notebooks that the students will be using. It should be read only for the students but read/write for the admins.

Here are the content developers who will need to write to the shared drive: https://github.com/orgs/Hackweek-ITCOocean/teams/hackweek-2023-team/members

jupyterhub:
  hub:
    config:
      GitHubOAuthenticator:
        client_id: xxxxx
        client_secret: xxxx
        oauth_callback_url: xxxx
        allowed_organizations:
          - nmfs-opensci:DaskHub
        scope:
          - read:org
      JupyterHub:
        authenticator_class: github
      KubeSpawner:
        working_dir: /home/jovyan
    allowNamedServers: true
    networkPolicy:
      enabled: false
    readinessProbe:
      enabled: false
  proxy:
    https:
      enabled: true
      hosts:
        - dhub.opensci.live
      letsencrypt:
        contactEmail: eli.holmes@noaa.gov        
  singleuser:
    defaultUrl: /lab
    # Defines the default image
    storage:
      capacity: 100Gi
      extraVolumes:
        - name: jupyterhub-shared
          persistentVolumeClaim:
            claimName: daskhub-pvc
      extraVolumeMounts:
        - name: jupyterhub-shared
          mountPath: /home/jovyan/shared
    image:
      name: openscapes/python
      tag: f577786
    profileList:
      - display_name: "4 CPU / 32 GB RAM"
        description: &profile_list_description "Start a container with at least a chosen share of capacity on a node of this type"
        default: true
        slug: small
        profile_options:
          image: &profile_options_image
            display_name: Image
            choices:
              python:
                display_name: Geospatial Python
                slug: python
                kubespawner_override:
                  image: openscapes/python:f577786
              iopython:
                display_name: Geospatial Python with tensorflow
                default: true
                slug: iopython  # distinct slug for each choice
                kubespawner_override:
                  image: eeholmes/iopython:20230615
              rocker:
                display_name: Geospatial R
                slug: rocker
                kubespawner_override:
                  image: openscapes/rocker:a7596b5
              iorocker:  # was a second "rocker" key; a repeated YAML key collapses into one entry
                display_name: Geospatial R with SDM
                slug: iorocker
                kubespawner_override:
                  image: eeholmes/iorocker:20230714
        kubespawner_override:
          cpu_limit: null
          mem_limit: null
          node_selector:
            node.kubernetes.io/instance-type: Standard_D8s_v3 
      - display_name: "Tiny: up to 2 CPU / 7 GB RAM"
        description: *profile_list_description
        slug: tiny
        profile_options:
          image: *profile_options_image
        kubespawner_override:
          cpu_limit: null
          mem_limit: null
          storage_capacity: null
          node_selector:
            node.kubernetes.io/instance-type: Standard_DS2_v2
dask-gateway:
  gateway:
    extraConfig:
      idle: |-
        # timeout after 30 minutes of inactivity
        c.KubeClusterConfig.idle_timeout = 1800 
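
One thing worth checking in a config like this: each entry under `choices` needs its own key, because a repeated key in a YAML mapping is usually resolved by keeping only the last value, which silently drops an image option. A minimal sketch of such a check in Python, assuming the choice keys have already been collected into a list in file order (the helper name `find_duplicates` and the sample list are hypothetical, not part of JupyterHub or KubeSpawner):

```python
from collections import Counter


def find_duplicates(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)
    seen = []
    for name in names:
        if counts[name] > 1 and name not in seen:
            seen.append(name)
    return seen


# Hypothetical list of choice keys, in the order they appear in a config file.
choice_keys = ["python", "iopython", "rocker", "rocker"]
print(find_duplicates(choice_keys))
```

An empty result means every key is unique; anything else names the keys that would overwrite each other.
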
eeholmes commented 1 year ago

Note: we got funding from ESIPfed.org for the hub during the hackweek, so the 'funded by' can say ESIP.

eeholmes commented 1 year ago

Note please use AWS as the cloud provider. I have AWS cloud credits from another source.

jmunroe commented 1 year ago

Hi Eli. I'll wait for @colliand to chime in on this, but if you are using AWS cloud credits, I am not exactly sure how we'll be able to support a "shared" cluster. Shared (from 2i2c's perspective) usually means that 2i2c pays the cloud costs directly and then invoices each community in proportion to its use of cloud infrastructure. On a shared cluster, it is challenging to use your AWS credits (only) for the INCOIS hub.

Going with a dedicated cluster is one option. And I wonder if there is a way to assign "AWS credits" to 2i2c so that we can use them to pay for just your portion of the cloud costs. In either case, we'll figure something out once you return the week of July 31st. It may also depend on the terms and conditions of the AWS credits that you have.

colliand commented 1 year ago

Correct. With current set up, 2i2c does not have a way to apply credits from AWS granted to Eli to cover shared cluster costs.

eeholmes commented 1 year ago

Correct. With current set up, 2i2c does not have a way to apply credits from AWS granted to Eli to cover shared cluster costs.

Got it. Oh well. I will include cloud computing costs in the proposal I am writing for keeping the hub running after the hackweek.

eeholmes commented 1 year ago

Updated: images. Note: I will likely be updating the images somewhat frequently in the run-up to the hackweek; however, these images have 90% of what we will need.

damianavila commented 1 year ago

FYI, @consideRatio is going to handle the deployment of this new hub request in the next few days.

consideRatio commented 1 year ago

Hi @eeholmes I've tried to capture a summary below on what I'll now work on deploying.

Misc details

Summary updated in https://github.com/2i2c-org/infrastructure/issues/2821#issue-1805368217

Image choices

Machine types

Something more like what Openscapes has or whatever you think is best. Currently we keep maxing out the memory when working with big data sets.

Storage

Regarding the extra storage. We have a 1 TB shared drive where we have the data and notebooks that the students will be using. It should be read only for the students but read/write for the admins.

The users that should have read/write access are defined by the github team in https://github.com/orgs/Hackweek-ITCOocean/teams/hackweek-2023-team/members.

@eeholmes I lack permissions to see members in that team, and I can't use the team name to directly grant the members admin rights because it's currently not a feature of jupyterhub/oauthenticator that we rely on.

As a strategy to accomplish this anyhow, I suggest that you manually add the users you want to grant admin rights to via https://itcocean.2i2c.cloud/hub/admin. To make an existing user an admin, press the "edit user" button next to the user. If a user in the team doesn't exist yet, you can add them via https://itcocean.2i2c.cloud/hub/admin#/add-users. When doing so, use only lower case letters and don't include @ in the usernames.
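
If the usernames come from a list written with mixed case or leading @ signs, a small helper can tidy them up before pasting them into the admin page. This is only a convenience sketch; the function name is hypothetical and the sample handles are illustrative:

```python
def normalize_username(handle):
    """Strip whitespace and a leading '@', then lower-case the handle."""
    return handle.strip().lstrip("@").lower()


# Handles as they might be written in a comment or a team list (illustrative).
handles = ["@MyGitHubHandle", " @eeholmes "]
print([normalize_username(h) for h in handles])
```
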

consideRatio commented 1 year ago

Hi @eeholmes!

I'm an engineer at 2i2c and have now set up a JupyterHub available at https://itcocean.2i2c.cloud. There are a few points I'd like your help with and feedback on.

  1. Initial setup of permissions for login

    Could you, as part of attempting to log in at https://itcocean.2i2c.cloud, press the Grant or Request button for the Hackweek-ITCOocean GitHub organization, similar to the Grant button next to jupyterhub seen in this image?

    image

    Granting this only needs to be done once by one of you. What we request here is permission to inspect the members of various teams in the GitHub organization Hackweek-ITCOocean (read:org). Specifically, JupyterHub will check whether the user is a member of the GitHub organization's team itcoocean-hackweek-2023 to decide whether the GitHub user should be authorized for access. Note that if you can't or don't want to press Grant, the login isn't expected to succeed unless your GitHub user's membership of the team has explicitly been made public ahead of time.

  2. Verification of domain name choice

    Could you confirm that the domain name itcocean.2i2c.cloud is acceptable among the choices of <anything>.2i2c.cloud? Note that we can also make use of a domain not managed by 2i2c as an alternative to this.

  3. Verification of details

    Could you check if this view on the login page looks correct?

    image

  4. Confirmation on machine type and default request

    Users get a guaranteed capacity of 8 GB of memory and 1 CPU by default, on machines with 128 GB of memory and 16 CPUs, but are not limited to this. To learn about your memory use, you can look at the footer status bar in JupyterLab. If users on average end up using more than 8 GB of memory, the machine may run out of memory, and the user using the most memory is then kicked out.

    I configured 8 GB as the default memory request because it is a bit higher than the 4 and 7 in the config you provided, which ended up running out of memory when working with larger datasets.

    This is how it looks for the user starting a server:

    itcocean

  5. Confirmation if shared storage setup is ok

    2i2c provides a setup by default for shared storage that admins can write to and non-admins can only read. Is it okay as it is?

    In this setup, admin users see two folders called shared and shared-readwrite. They are the same folder: shared is always read-only, while admins, who also see shared-readwrite, can use that one to write.

  6. An update of image tags?

    I've configured the images you mentioned, but should I update the image tags once right away?
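
The team-based login check described in item 1 can be sketched roughly as follows. This only illustrates the decision logic and is not the jupyterhub/oauthenticator implementation; the function name, the `org:team` string format, and the case-insensitive comparison are assumptions made for the example:

```python
# The team this hub is restricted to, written as "org:team" (assumed format).
ALLOWED_TEAMS = {"Hackweek-ITCOocean:itcoocean-hackweek-2023"}


def is_authorized(visible_teams, allowed_teams=ALLOWED_TEAMS):
    """Allow login if any team the hub can see for the user is allowed.

    visible_teams holds the "org:team" memberships the hub can see for this
    user. Without the read:org grant, only publicly visible memberships
    would end up in this list.
    """
    wanted = {team.lower() for team in allowed_teams}
    return any(team.lower() in wanted for team in visible_teams)


print(is_authorized(["Hackweek-ITCOocean:itcoocean-hackweek-2023"]))
print(is_authorized([]))
```

The second call models a team member whose membership is private and for whom Grant was never pressed: the hub sees no memberships, so the login is refused.
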

eeholmes commented 1 year ago

Hi!

2. Verification of domain name choice

Can you change it to itcoocean, with two "o"s? The institute name is ITCOocean.

4. Confirmation on machine type and default request

Can you default to something a bit smaller? The admins are maxing out the memory as we are setting up, but the participants in our courses and hackweek will not normally be doing what we are doing. Here is the default for the openscapes.2i2c.cloud hub. This should be good. Note I work with Openscapes and much of our tutorial testing has been on their hub. I am not sure if the fine-grained drop-down re memory is needed? I leave that up to your judgement.

image

5. Confirmation if shared storage setup is ok

Yes, the shared drive setup is good: read-only for participants and read/write for admins.

Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.

6. An update of image tags?

The image tags are correct for now.

Can you change the default image to "eeholmes/iorocker:20230714 - Name: Geospatial R with SDM"

consideRatio commented 1 year ago

Can you change it to itcoocean, with two "o"s? The institute name is ITCOocean.

Ah nice catch, sorry for the mistake - this is now fixed!

Note I work with Openscapes and much of our tutorial testing has been on their hub. I am not sure if the fine-grained drop-down re memory is needed? I leave that up to your judgement.

Ah, then let's keep the drop-down as a failsafe in case more than 1GB of memory on average ends up being needed but, as you suggest, let users default to a 1GB capacity on a small server node, as is the default for openscapes.

Note that you could have been using more than 1GB of memory when working in openscapes without issues, but if all users in the workshop use 1.2GB of memory there will be issues. With that in mind, it can be good to have an idea of what memory consumption to expect. If users on average will consume <1GB, what we have is perfectly fine even when 60 users are allocated to ~2 nodes/machines.
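
To make the 1GB versus 1.2GB point concrete, here is a rough back-of-the-envelope check in Python. The 32 GB node size is an assumption for illustration only (the node size is not stated above), and memory used by the system itself is ignored:

```python
def total_memory_gb(users_per_node, avg_gb_per_user):
    """Total memory the users placed on one node would consume."""
    return users_per_node * avg_gb_per_user


def fits(users_per_node, avg_gb_per_user, node_memory_gb):
    """True if the node has enough memory for all of its users."""
    return total_memory_gb(users_per_node, avg_gb_per_user) <= node_memory_gb


NODE_MEMORY_GB = 32        # assumed node size, not stated in this thread
USERS_PER_NODE = 60 // 2   # 60 users spread over ~2 nodes

print(fits(USERS_PER_NODE, 1.0, NODE_MEMORY_GB))
print(fits(USERS_PER_NODE, 1.2, NODE_MEMORY_GB))
```

With these assumed numbers, 30 users at 1 GB each stay under the node's memory, while 30 users at 1.2 GB each would need about 36 GB and no longer fit.
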

consideRatio commented 1 year ago

Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.

Yes - we can add a "shared-public" folder that everyone has read/write access to, for example. I haven't done that yet, but I'll get it done by Monday at the latest.

Can you change the default image to "eeholmes/iorocker:20230714 - Name: Geospatial R with SDM"

Done!


Result at...

https://itcoocean.2i2c.cloud

image

consideRatio commented 1 year ago

Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.

@eeholmes there is now a shared-public folder!

Visit https://itcoocean.2i2c.cloud/services/configurator/ to decide if you wish to start using a specific user interface (Classical Jupyter, JupyterLab, RStudio).

With this, I think we may be done with the initial setup. Does this look good to you?

eeholmes commented 1 year ago

I got in aok! A couple minor things.

1) Can the default be "lab" instead of "tree", so this shows when logging in by default

image

2) Can we have a large machine as an option like one has for openscapes.2i2c.cloud?

image
consideRatio commented 1 year ago

Can the default be "lab" instead of "tree", so this shows when logging in by default

Yes! As a JupyterHub admin, you can also toggle this for all users via https://itcoocean.2i2c.cloud/services/configurator/.

I've now set JupyterLab as the user interface for servers started from now on.

Can we have a large machine as an option like one has for openscapes.2i2c.cloud?

Yes! Did you mean that you'd like users to be presented with the options labelled Small and Medium, as in your image below, or also the option labelled Large?

image

eeholmes commented 1 year ago

Yes people who log in should see Small and Medium. I don't think we will need Large for the course and I am guessing that would use a lot of cloud $$ if they selected that.

consideRatio commented 1 year ago

Yes people who log in should see Small and Medium.

:+1:

[...] I am guessing that would use a lot of cloud $$ if they selected that.

Yes, the cost scales roughly linearly with the number of CPUs of each started machine. Users spreading out over multiple machines when they could fit on one would also incur more cost. Also, for each machine started, a user may need to wait for the machine to start up, so it can be good not to provide too many options.
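
A rough sketch of that cost reasoning, with made-up numbers: the price per CPU-hour below is a hypothetical placeholder (in cents), not a 2i2c or AWS price, and the strictly linear pricing model is a simplifying assumption.

```python
def hourly_cost_cents(nodes, cpus_per_node, cents_per_cpu_hour):
    """Hourly cost when every started machine is billed for all its CPUs."""
    return nodes * cpus_per_node * cents_per_cpu_hour


CENTS_PER_CPU_HOUR = 5  # hypothetical price, for illustration only

one_small = hourly_cost_cents(1, 4, CENTS_PER_CPU_HOUR)
one_large = hourly_cost_cents(1, 16, CENTS_PER_CPU_HOUR)
two_large = hourly_cost_cents(2, 16, CENTS_PER_CPU_HOUR)

print(one_small, one_large, two_large)
```

Under this model a machine with four times the CPUs costs four times as much per hour, and users who end up on two machines when one would do double the bill.
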

consideRatio commented 1 year ago

@eeholmes the named options Small / Medium are now presented and JupyterLab should start up automatically by default!

eeholmes commented 1 year ago

Quick question. Is uploading to 2i2c JupyterHubs throttled? I am trying to upload an 18G data file to our shared folder and it is stuck at 1.5G/sec upload speed, but I can upload to other sites (Dropbox) at 25G/sec. Thanks! Note I am on oceanhackweek.2i2c.cloud at the moment, but I assume if there is throttling, it is a general feature.

consideRatio commented 1 year ago

Is uploading to 2i2c JupyterHubs throttled? I am trying to upload an 18G data file to our shared folder and it is stuck at 1.5G/sec upload speed, but I can upload to other sites (Dropbox) at 25G/sec.

It's not actively throttled by any explicit 2i2c configuration, but I think the machine type chosen will influence this, as smaller machine types are bounded in their network capacity.

openscapes is running on AWS, and the machine types:

By googling I found this page describing 10 gigabits / sec for r5.xlarge and r5.4xlarge, but higher for r5.16xlarge: https://aws.amazon.com/ec2/instance-types/r5/#Product_Details.

eeholmes commented 1 year ago

Ah ok. I am on the oceanhackweek.2i2c.cloud JupyterHub and I think they have a pretty small VM. I think smaller than r5.xlarge.

Which brings up another question, is there a way to show the Large option but only for the admins? That way the admins can get on that if needed, but the course participants won't be able to select that option. Thx!

eeholmes commented 1 year ago

Also my upload speed is 1.5M/sec not 1.5G/sec. So pretty slow. There is some type of throttling going on, perhaps on my side, that is preventing higher upload speeds.
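
For a sense of scale, here is a quick estimate of what those rates mean for the file in question. It assumes "18G" means 18 gigabytes, that both rates are in megabytes per second (so the Dropbox figure is read as 25 MB/s), and that 1 GB = 1000 MB:

```python
def upload_seconds(size_gb, rate_mb_per_s):
    """Seconds needed to upload size_gb at rate_mb_per_s (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_per_s


slow = upload_seconds(18, 1.5)   # rate seen on the hub
fast = upload_seconds(18, 25)    # rate seen elsewhere, assumed to be MB/s

print(f"{slow / 3600:.1f} h at 1.5 MB/s, {fast / 60:.0f} min at 25 MB/s")
```

Under those assumptions the upload takes a bit over three hours at the slower rate, compared with about twelve minutes at the faster one.
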

eeholmes commented 1 year ago

This suggests maybe changing something in the ssh config? Scroll to bottom. https://serverfault.com/questions/1032792/uploading-folder-from-local-directory-to-vm-instance-jupyterlab-folder-too-slow

consideRatio commented 1 year ago

Which brings up another question, is there a way to show the Large option but only for the admins? That way the admins can get on that if needed, but the course participants won't be able to select that option.

Yes! I can set that up quickly if you provide me with the name of a GitHub team whose members should be allowed to see the Large 64 CPU 512GB machine type option.
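
Conceptually, restricting one machine option to a team amounts to filtering the option list per user. The sketch below only illustrates that idea; the data layout, the function name, and the team name `example-org:example-admin-team` are hypothetical and do not reflect how 2i2c or KubeSpawner actually configures it:

```python
PROFILES = [
    {"name": "Small", "allowed_teams": None},
    {"name": "Medium", "allowed_teams": None},
    # Hypothetical team name; the real one would be supplied by the community.
    {"name": "Large", "allowed_teams": {"example-org:example-admin-team"}},
]


def visible_profiles(user_teams, profiles=PROFILES):
    """Return the names of the profiles a user with these teams may select.

    A profile with allowed_teams=None is shown to everyone.
    """
    teams = set(user_teams)
    return [
        profile["name"]
        for profile in profiles
        if profile["allowed_teams"] is None or teams & profile["allowed_teams"]
    ]


print(visible_profiles([]))
print(visible_profiles(["example-org:example-admin-team"]))
```
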


Also my upload speed is 1.5M/sec not 1.5G/sec. So pretty slow. There is some type of throttling going on, perhaps on my side, that is preventing higher upload speeds.

Could you open a dedicated support ticket about networking speeds if you have issues with them via https://docs.2i2c.org/support/ ?

eeholmes commented 1 year ago

Hi, Can you add a few more admins to the hub? These should have access to shared-readonly so that they can add to the shared drive for the participants:

GitHub usernames: aditimodi, swarnali1, smitha-br, nim-it

consideRatio commented 1 year ago

I'm on vacation writing from my mobile phone @eeholmes, could you contact support about this to ensure someone in 2i2c can help you with this?

damianavila commented 1 year ago

I am going to declare this new hub deployment completed, with the expectation that any remaining issues will be managed and resolved via the support process. Thanks!