Closed colliand closed 1 year ago
Hey @eeholme and @colliand! I noticed there is still pending information about the new hub deployment. Can you please help us fill it in?
The pieces of information still missing are:
- hub logo image url
- hub logo website url
- hub user image github repository
- hub user image tag and name
- extra features you would like to enable
Details about each of them can be found in the top comment. But if you have questions about any of them, please ping the 2i2c/engineering team and they will help you.
After the form in the top comment is filled in, an engineer will be assigned and will start deploying the new hub. Thank you!
A few corrections
The GitHub handle of the community representative
eeholmes was missing "s"
Hub important dates
- Target Start Date: 2023-08-01 (likely not possible until a bit later given current workload)
- Required Start Date: 2023-08-08
- Important dates for usage: 2023-09-10 through 2023-09-22. This is when the Hackathon will take place.
Hub Authentication Type
GitHub (e.g., @MyGitHubHandle)
First Hub Administrators
@eeholmes
Not all GitHub organization members. I would like hub access restricted to this team on the organization https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023
[GitHub Auth only] How would you like to manage your users?
Allowing members of a specific GitHub organization
Only members of the itcoocean-hackweek-2023 team
[GitHub Teams Auth only] Profile restriction based on team membership
Yes, this team https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023
Hub logo image URL
Hub logo website URL
https://incois.gov.in/ITCOocean/index.jsp
Hub user image GitHub repository
pending
Hub user image tag and name
pending
Extra features you would like to enable
- [ ] Dedicated Kubernetes cluster
- [ ] Scalable Dask Cluster
(Optional) Preferred cloud provider
None
(Optional) Preferred cloud region
us-west-2
(Optional) Billing and Cloud account
None
Other relevant information to the features above
Research hub on a shared cluster (likely hosted in AWS us-west-2). Additional details are being gathered through async g-doc running notes with @eeholmes. Thanks in advance to @jmunroe for assistance with this while @colliand is away on vacation.
This hub will run from early August through mid-October.
Tasks to deploy the hub
- [ ] 1. Deploy information filled in above
- [ ] 2. Engineer who will deploy the hub is assigned
- [ ] 3. If using GitHub Orgs/Teams Auth, Engineer is given Owner rights to the org to set this up.
- [ ] 4. Initial Hub deployment PR
- [ ] 5. Administrators able to log on -> Hub now in steady-state
Re landing page for hub. If it will look similar to the Openscapes hub,
"Openscapes" should be "ITCOocean" with link to https://incois.gov.in/ITCOocean/index.jsp
Funded by: Should be "ITCOocean Hackweek Team" with no url
Thanks @eeholmes!
@colliand I am working on the hub image questions. I need to update our Python image and build our R image.
Hub user images
There are four on DockerHub: two for Python and two for R.
Note: I think the Geospatial R with SDM image has a problem, as my kernel is not spinning up when I select that one, but I will fix that later.
Here is the current config file for our daskhub that my team is using during development of the course content. This is a bit of a hack but is mostly doing what we want, which is allowing users to pick different images. Note the machine sizes are not how we want it so ignore that part. Something more like what Openscapes has or whatever you think is best. Currently we keep maxing out the memory when working with big data sets.
Regarding the extra storage. We have a 1 TB shared drive where we have the data and notebooks that the students will be using. It should be read only for the students but read/write for the admins.
Here are the content developers who will need to write to the shared drive: https://github.com/orgs/Hackweek-ITCOocean/teams/hackweek-2023-team/members
jupyterhub:
  hub:
    config:
      GitHubOAuthenticator:
        client_id: xxxxx
        client_secret: xxxx
        oauth_callback_url: xxxx
        allowed_organizations:
          - nmfs-opensci:DaskHub
        scope:
          - read:org
      JupyterHub:
        authenticator_class: github
      KubeSpawner:
        working_dir: /home/jovyan
    allowNamedServers: true
    networkPolicy:
      enabled: false
    readinessProbe:
      enabled: false
  proxy:
    https:
      enabled: true
      hosts:
        - dhub.opensci.live
      letsencrypt:
        contactEmail: eli.holmes@noaa.gov
  singleuser:
    defaultUrl: /lab
    # Defines the default image
    storage:
      capacity: 100Gi
      extraVolumes:
        - name: jupyterhub-shared
          persistentVolumeClaim:
            claimName: daskhub-pvc
      extraVolumeMounts:
        - name: jupyterhub-shared
          mountPath: /home/jovyan/shared
    image:
      name: openscapes/python
      tag: f577786
    profileList:
      - display_name: "4 CPU / 32 GB RAM"
        description: &profile_list_description "Start a container with at least a chosen share of capacity on a node of this type"
        default: true
        slug: small
        profile_options:
          image: &profile_options_image
            display_name: Image
            choices:
              python:
                display_name: Geospatial Python
                slug: python
                kubespawner_override:
                  image: openscapes/python:f577786
              iopython:
                display_name: Geospatial Python with tensorflow
                default: true
                slug: python
                kubespawner_override:
                  image: eeholmes/iopython:20230615
              rocker:
                display_name: Geospatial R
                slug: rocker
                kubespawner_override:
                  image: openscapes/rocker:a7596b5
              rocker:
                display_name: Geospatial R with SDM
                slug: rocker
                kubespawner_override:
                  image: eeholmes/iorocker:20230714
        kubespawner_override:
          cpu_limit: null
          mem_limit: null
          node_selector:
            node.kubernetes.io/instance-type: Standard_D8s_v3
      - display_name: "Tiny: up to 2 CPU / 7 GB RAM"
        description: *profile_list_description
        slug: tiny
        profile_options:
          image: *profile_options_image
        kubespawner_override:
          cpu_limit: null
          mem_limit: null
          storage_capacity: null
          node_selector:
            node.kubernetes.io/instance-type: Standard_DS2_v2
dask-gateway:
  gateway:
    extraConfig:
      idle: |-
        # timeout after 30 minutes of inactivity
        c.KubeClusterConfig.idle_timeout = 1800
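In the image choices of the config above, the key rocker appears twice and the slugs python and rocker are each reused. Many YAML parsers keep only one entry for a repeated mapping key, though the thread does not establish whether that is what causes the kernel problem mentioned earlier. A minimal sketch of a check for this, with the (key, slug) pairs copied by hand from the config and `find_duplicates` a hypothetical helper name:

```python
from collections import Counter

# (key, slug) pairs transcribed by hand from the image choices in the config.
choices = [
    ("python", "python"),
    ("iopython", "python"),
    ("rocker", "rocker"),
    ("rocker", "rocker"),
]


def find_duplicates(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value in counts if counts[value] > 1]


duplicate_keys = find_duplicates(key for key, _ in choices)
duplicate_slugs = find_duplicates(slug for _, slug in choices)

print("duplicate keys:", duplicate_keys)
print("duplicate slugs:", duplicate_slugs)
```

Giving each choice its own key and slug would make the check come back empty.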
Note: we got funding from ESIPfed.org for the hub during the hackweek, so the 'funded by' can say ESIP.
Note please use AWS as the cloud provider. I have AWS cloud credits from another source.
Hi Eli. I'll wait for @colliand to chime in on this, but if you are using AWS cloud credits, I am not exactly sure how we'll be able to support a "shared" cluster. Shared (from the 2i2c perspective) usually means that 2i2c pays the cloud costs directly and then invoices each community in proportion to its use of cloud infrastructure. On a shared cluster, that makes it challenging to use your AWS credits (only) for the INCOIS hub.
Going with a dedicated cluster is one option. And I wonder if there is a way to assign "AWS credits" to 2i2c so that we can use them to pay for just your portion of the cloud costs. In either case, we'll figure something out once you return the week of July 31st. It may also depend on the terms and conditions of the AWS credits that you have.
Correct. With current set up, 2i2c does not have a way to apply credits from AWS granted to Eli to cover shared cluster costs.
Got it. Oh well. I will include cloud computing costs in the proposal I am writing for keeping the hub running after the hackweek.
Updated: images. Note, I will likely be updating the images somewhat frequently in the run-up to the hackweek; however, these images have 90% of what we will need.
FYI, @consideRatio is going to handle the deployment of this new hub request in the next few days.
Hi @eeholmes I've tried to capture a summary below on what I'll now work on deploying.
Misc details
Summary updated in https://github.com/2i2c-org/infrastructure/issues/2821#issue-1805368217
Image choices
eeholmes/iopython:20230714
- Name: Geospatial Python with tensorflow (default)
openscapes/python:f577786
- Name: Geospatial Python
openscapes/rocker:a7596b5
- Name: Geospatial R
eeholmes/iorocker:20230714
- Name: Geospatial R with SDM
Machine types
Something more like what Openscapes has or whatever you think is best. Currently we keep maxing out the memory when working with big data sets.
Storage
Regarding the extra storage. We have a 1 TB shared drive where we have the data and notebooks that the students will be using. It should be read only for the students but read/write for the admins.
The users that should have read/write access are defined by the github team in https://github.com/orgs/Hackweek-ITCOocean/teams/hackweek-2023-team/members.
@eeholmes I lack permissions to see members in that team, and I can't use the team name to directly grant the members admin rights because it's currently not a feature of jupyterhub/oauthenticator that we rely on.
As a strategy to accomplish this anyhow, I suggest that you manually add the users you want to grant admin membership to via https://itcocean.2i2c.cloud/hub/admin. To make an existing user an admin, press the "edit user" button next to the user. If a user in the team you want to grant admin permission to doesn't yet exist, you can add them via https://itcocean.2i2c.cloud/hub/admin#/add-users. When doing so, use only lower case letters and don't include @ in the usernames.
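The username rule above (lower case only, no @) can be applied to a list of handles before pasting them into the add-users form. This is only a sketch; `normalize_github_username` is a hypothetical helper, not part of JupyterHub.

```python
def normalize_github_username(raw: str) -> str:
    """Strip surrounding whitespace and a leading '@', then lower-case."""
    return raw.strip().lstrip("@").lower()


# Example handles taken from mentions in this thread, in mixed forms.
handles = ["@EEHolmes", "consideRatio", " @jmunroe "]
normalized = [normalize_github_username(handle) for handle in handles]
print("\n".join(normalized))
```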
Hi @eeholmes!
I'm an engineer at 2i2c and have now setup a JupyterHub available at https://itcocean.2i2c.cloud. There are a few points I'd like your help with and feedback on.
Initial setup of permissions for login
Could you, as part of attempting to log in at https://itcocean.2i2c.cloud, press the Grant or Request button for the Hackweek-ITCOocean github organization, similar to the grant button next to jupyterhub seen in this image?
Granting this only needs to be done once by one of you. What we request here is permission to inspect the members of various teams in the github organization Hackweek-ITCOocean (read:org). Specifically, JupyterHub will check if the user is a member of the GitHub organization's team itcoocean-hackweek-2023 to decide if the github user should be authorized access. Note that if you can't or don't want to press Grant, the login isn't expected to succeed unless your github user's membership of the github team has explicitly been made public ahead of time.
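The access decision described above can be pictured with a toy model. This is not the oauthenticator implementation, which queries GitHub using the read:org scope; the `is_authorized` function and the way memberships are passed in are assumptions for illustration. The "org:team" string form mirrors the `allowed_organizations` entry in the config posted earlier.

```python
def is_authorized(memberships: set[str], allowed: set[str]) -> bool:
    """Return True if any 'org:team' membership matches an allowed entry."""
    return not memberships.isdisjoint(allowed)


# The team named in this thread as the one that grants hub access.
ALLOWED = {"Hackweek-ITCOocean:itcoocean-hackweek-2023"}

participant = {"Hackweek-ITCOocean:itcoocean-hackweek-2023"}
outsider = {"Hackweek-ITCOocean:hackweek-2023-team"}

print(is_authorized(participant, ALLOWED))
print(is_authorized(outsider, ALLOWED))
```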
Verification of domain name choice
Could you confirm that the domain name itcocean.2i2c.cloud is acceptable among the choices of <anything>.2i2c.cloud? Note that we can also make use of a domain not managed by 2i2c as an alternative to this.
Verification of details
Could you check if this view on the login page looks correct?
Confirmation on machine type and default request
By default, users get a guaranteed capacity of 8 GB of memory and 1 CPU on machines with 128 GB of memory and 16 CPUs, but are not limited to this. To learn about your memory use, you can look at the footer status bar in jupyterlab. If you expect users on average to end up using more than 8 GB of memory, the machine may run out of memory, and the user using the most memory is then kicked out.
I configured 8 GB as the default memory request because it is a bit higher than the 4 and 7 GB in the config you provided, which ended up running out of memory when working with larger datasets.
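As a rough sketch of what the 8 GB guarantee implies for packing users onto a 128 GB machine: the calculation below ignores memory reserved for the operating system and Kubernetes, which is an assumption, so the real number would be somewhat lower. `users_per_node` is a hypothetical helper.

```python
NODE_MEMORY_GB = 128     # machine size stated above
DEFAULT_REQUEST_GB = 8   # guaranteed memory per user stated above


def users_per_node(node_memory_gb: float, request_gb: float) -> int:
    """Upper bound on users whose guaranteed memory fits on one node."""
    return int(node_memory_gb // request_gb)


capacity = users_per_node(NODE_MEMORY_GB, DEFAULT_REQUEST_GB)
print(f"At most {capacity} users per node by guaranteed memory")
```

Because the guarantee is not a limit, users can exceed 8 GB each, which is how a node could still run out of memory with fewer users than this bound.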
This is how it looks for the user starting a server:
Confirmation if shared storage setup is ok
2i2c provides a setup by default for shared storage that admins can write to and non-admins can only read. Is it okay as it is?
It is a setup where admin users see two folders called shared and shared-readwrite. It's the same folder: shared is always "read only", while admins also see shared-readwrite and can use that to write.
An update of image tags?
I've configured the images you mentioned, but should I update the image tags once right away?
Hi!
2. Verification of domain name choice
Can you change to itcoocean with 2 "o"'s. The institute name is ITCOocean?
3. Confirmation on machine type and default request
Can you default to something a bit smaller? The admins are maxing out the memory as we are setting up, but the participants in our courses and hackweek will not normally be doing what we are doing. Here is the default for the openscapes.2i2c.cloud hub. This should be good. Note I work with Openscapes and much of our tutorial testing has been on their hub. I am not sure if the fine-grained drop-down re memory is needed? I leave that up to your judgement.
4. Confirmation if shared storage setup is ok
Yes shared drive set-up is good. read-only for participants and read/write for admins.
Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.
5. An update of image tags?
The image tags are correct for now.
Can you change the default image to "eeholmes/iorocker:20230714 - Name: Geospatial R with SDM"
Can you change to itcoocean with 2 "o"'s. The institute name is ITCOocean?
Ah nice catch, sorry for the mistake - this is now fixed!
Note I work with Openscapes and much of our tutorial testing has been on their hub. I am not sure if the fine-grained drop-down re memory is needed? I leave that up to your judgement.
Ah, then let's leave the drop-down as a failsafe in case more than 1GB of memory on average ends up being needed, but as you suggest, let users default to a 1GB capacity on a small server node, as is the default for openscapes.
Note that you could have been using more than 1GB of memory when working in openscapes without issues, but if all users in the workshop use 1.2GB of memory there will be issues. With that in mind, it can be good to have an idea on what memory consumption to expect. If users on average will consume <1GB, what we have is perfectly fine even when 60 users are allocated to ~2 nodes/machines.
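The point about 1GB versus 1.2GB per user can be checked with simple arithmetic. The sketch below assumes two nodes with 32 GB of memory each; the node size is an assumption, not something stated at this point in the thread, and system overhead is ignored. `fits_on_nodes` is a hypothetical helper.

```python
USERS = 60
NODES = 2
NODE_MEMORY_GB = 32  # assumed node size, not stated in the thread here


def fits_on_nodes(users: int, gb_per_user: float,
                  nodes: int, node_memory_gb: float) -> bool:
    """Return True if total memory demand fits in total node memory."""
    return users * gb_per_user <= nodes * node_memory_gb


print(fits_on_nodes(USERS, 1.0, NODES, NODE_MEMORY_GB))
print(fits_on_nodes(USERS, 1.2, NODES, NODE_MEMORY_GB))
```

Under these assumptions the total at 1.2GB per user exceeds what two nodes provide, while 1GB per user still fits.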
Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.
Yes, we can add a "shared-public" folder that everyone has read/write access to, for example. I've not done that right now, but I'll get it done by Monday at the latest.
Can you change the default image to "eeholmes/iorocker:20230714 - Name: Geospatial R with SDM"
Done!
Question, is it possible for participants to have a second smaller read-write drive that is separate from the admin one? We would like to make it easy for teams to collaborate on shared data and notebooks without having to push that to GitHub.
@eeholmes there is now a shared-public folder!
Visit https://itcoocean.2i2c.cloud/services/configurator/ to decide if you wish to start using a specific user interface (Classical Jupyter, JupyterLab, RStudio).
With this, I think we may be done with the initial setup. Does this look good to you?
I got in aok! A couple minor things.
1) Can the default be "lab" instead of "tree", so this shows when logging in by default
2) Can we have a large machine as an option like one has for openscapes.2i2c.cloud?
Can the default be "lab" instead of "tree", so this shows when logging in by default
Yes! This is something you, as a JupyterHub admin, can also toggle for all users via https://itcoocean.2i2c.cloud/services/configurator/.
I've now set JupyterLab as the user interface for servers starting up from now on.
Can we have a large machine as an option like one has for openscapes.2i2c.cloud?
Yes! Did you mean that you'd like users to be presented with the options labelled Small and Medium, like in your image, or with the option labelled Large as well?
Yes people who log in should see Small and Medium. I don't think we will need Large for the course and I am guessing that would use a lot of cloud $$ if they selected that.
Yes people who log in should see Small and Medium.
:+1:
[...] I am guessing that would use a lot of cloud $$ if they selected that.
Yes, the cost scales quite linearly with the number of CPUs of each started machine. Users spreading out over multiple machines when they could fit on one would also incur more cost. Also, for each machine started, a user may need to wait for machine startup, so it can be good not to provide too many options.
@eeholmes the named options Small / Medium are now presented and jupyterlab should startup automatically by default!
Quick question. Is uploading to 2i2c JupyterHubs throttled? I am trying to upload a 18G data file to our shared folder and it is stuck at 1.5G/sec upload speed but I can upload to other sites (Dropbox) at 25G/sec. Thanks! Note I am on oceanhackweek.2i2c.cloud at the moment but I assume if there is throttling, it is a general feature.
Is uploading to 2i2c JupyterHubs throttled? I am trying to upload a 18G data file to our shared folder and it is stuck at 1.5G/sec upload speed but I can upload to other sites (Dropbox) at 25G/sec.
It's not actively throttled by explicit 2i2c configuration, but I think the machine type chosen will influence this, as smaller machine types are bounded in their network capacity.
openscapes is running on AWS, with these machine types:
- r5.xlarge by AWS
- r5.4xlarge by AWS
- r5.16xlarge by AWS
By googling I found this page describing 10 gigabits / sec for r5.xlarge and r5.4xlarge, but higher for r5.16xlarge: https://aws.amazon.com/ec2/instance-types/r5/#Product_Details.
Ah ok. I am on the oceanhackweek.2i2c.cloud JupyterHub and I think they have a pretty small VM. I think smaller than r5.xlarge.
Which brings up another question, is there a way to show the Large option but only for the admins? That way the admins can get on that if needed, but the course participants won't be able to select that option. Thx!
Also my upload speed is 1.5M/sec not 1.5G/sec. So pretty slow. There is some type of throttling going on, perhaps on my side, that is preventing higher upload speeds.
This suggests maybe changing something in the ssh config? Scroll to the bottom. https://serverfault.com/questions/1032792/uploading-folder-from-local-directory-to-vm-instance-jupyterlab-folder-too-slow
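To put the corrected upload speed in perspective, here is a back-of-the-envelope estimate. It assumes "18G" means 18 GB and "1.5M/sec" means 1.5 MB per second in decimal units, and it treats the 10 gigabits / sec figure quoted earlier as a theoretical ceiling rather than an achievable rate. `transfer_time_seconds` is a hypothetical helper.

```python
GB = 10**9  # bytes, decimal units assumed
MB = 10**6


def transfer_time_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


file_size = 18 * GB

observed = transfer_time_seconds(file_size, 1.5 * MB)
hours, remainder = divmod(observed, 3600)
print(f"At 1.5 MB/s: {int(hours)} h {int(remainder // 60)} min")

# 10 gigabits per second expressed in bytes per second.
line_rate = 10 * 10**9 / 8
print(f"At 10 Gbit/s: {transfer_time_seconds(file_size, line_rate):.1f} s")
```

The gap between the two figures suggests the bottleneck is somewhere other than the instance's network capacity, which is consistent with the guess above that the limit may be on the uploading side.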
Which brings up another question, is there a way to show the Large option but only for the admins? That way the admins can get on that if needed, but the course participants won't be able to select that option.
Yes! I can set that up quickly if you give me the name of a github team whose members should be allowed to see the Large 64 CPU 512GB machine type option.
Also my upload speed is 1.5M/sec not 1.5G/sec. So pretty slow. There is some type of throttling going on, perhaps on my side, that is preventing higher upload speeds.
Could you open a dedicated support ticket about networking speeds if you have issues with them via https://docs.2i2c.org/support/ ?
Hi, Can you add a few more admins to the hub? These should have access to shared-readonly so that they can add to the shared drive for the participants:
GitHub usernames: aditimodi, swarnali1, smitha-br, nim-it
I'm on vacation writing from my mobile phone @eeholmes, could you contact support about this to ensure someone in 2i2c can help you with this?
I am going to declare this new hub deployment completed, with the expectation that any remaining issues will be managed and resolved via the support process. Thanks!
The GitHub handle of the community representative
eeholmes (EDIT: fixed typo)
Hub important dates
Hub Authentication Type
GitHub (e.g., @mygithubhandle)
First Hub Administrators
@eeholmes
[GitHub Auth only] How would you like to manage your users?
Allowing members of a specific GitHub organization (EDIT: specific team, see below)
[GitHub Teams Auth only] Profile restriction based on team membership
EDIT: https://github.com/orgs/Hackweek-ITCOocean/teams/itcoocean-hackweek-2023
Hub logo image URL
EDIT: https://user-images.githubusercontent.com/2545978/253672085-ec5ca6fb-147b-4fcd-87f1-431108b62558.png
Looks like:
Hub logo website URL
https://incois.gov.in/ITCOocean/index.jsp
Hub user image GitHub repository
pending
Hub user image tag and name
pending
Extra features you would like to enable
(Optional) Preferred cloud provider
AWS
(Optional) Preferred cloud region
us-west-2 (shared 2i2c cluster: 2i2c-aws-us)
(Optional) Billing and Cloud account
None
Other relevant information to the features above
Research hub on a shared cluster (likely hosted in AWS us-west-2). Additional details are being gathered through async g-doc running notes with @eeholmes. Thanks in advance to @jmunroe for assistance with this while @colliand is away on vacation.
This hub will run from early August through mid-October.
When filling in operated by, funded by, etc fields (https://github.com/2i2c-org/infrastructure/issues/2821#issuecomment-1636388679):
Tasks to deploy the hub