2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

[Request deployment] New Hub: demo-drakkar #2049

Closed auraoupa closed 1 year ago

auraoupa commented 1 year ago

Important dates

Hub Authentication Type

GitHub (e.g., @mygithubhandle)

First Hub Administrators

Aurélie Albert, aurelie.albert@univ-grenoble-alpes.fr, @auraoupa
Takaya Uchida, takaya.uchida@univ-grenoble-alpes.fr, @roxyboy

[GitHub Auth only] How would you like to manage your users?

Manually, by adding specific GitHub handles in the JupyterHub Admin panel

[GitHub Teams Auth only] Profile restriction based on team membership

No response

Hub logo image URL

https://drakkar2023.sciencesconf.org/data/header/DrakkarOcean.png

Hub logo website URL

https://drakkar2023.sciencesconf.org/

Hub user image GitHub repository

https://github.com/auraoupa/hub-user-image-drakkar-demo

Hub user image tag and name

quay.io/auraoupa/drakkar-demo:62fa0b23aca5

Extra features you'd like to enable

(Optional) Preferred cloud provider

GCP (preferred)

(Optional) Billing and Cloud account

None

Other relevant information to the features above

No response

Tasks to deploy the hub

jmunroe commented 1 year ago

See https://github.com/2i2c-org/leads/issues/105 for related lead information.

consideRatio commented 1 year ago

Hi @auraoupa and @jmunroe!

I'm looking into deploying this hub and would like to verify the following points first. Does this look right to you?

/ Erik

auraoupa commented 1 year ago

Hi @consideRatio, thanks for taking care of setting up our demonstration hub! My answers:

  • I understand this as a request to add a dedicated JupyterHub to run in the pre-existing GCP based meom-ige cluster.

Yes, it will only be used for a 3-day demonstration at the Drakkar meeting, by around 40 people at a time.

  • We need a domain name for this hub, so would drakkar.meom-ige.2i2c.cloud be okay or should we choose something else?

I was suggesting drakkar-demo.2i2c.cloud, but yours is fine too.

  • The JupyterHub should be a standard jupyterhub without dask-gateway integration, so even though meom-ige.2i2c.cloud is what we call a "daskhub" where users can start and manage their own dask clusters with a dask_gateway client, this would be what we call a "basehub" where users can't use dask_gateway.

I am not sure about this one. I want the users to be able to launch a cluster in their notebook and do parallel computation, but they do not necessarily need to scale it or to choose the size of the server when logging in.

consideRatio commented 1 year ago

Thank you @auraoupa!

Does the code that "launches a cluster" involve "import dask_gateway"?

The concept of a cluster is vague: you can have a local cluster that uses many processes on the same server, or you can use dask_gateway to start external servers to communicate with.
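To make the distinction concrete, here is a minimal sketch of the "local cluster" case using only dask.distributed. It is an illustration, not the code used at the event: the helper names square and sum_of_squares, the worker count, and the processes switch are assumptions introduced for this example.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    """Toy task executed on the workers."""
    return x * x


def sum_of_squares(numbers, n_workers=2, processes=True):
    """Sum the squares of `numbers` on a cluster local to this server.

    Hypothetical helper: LocalCluster only starts workers on the current
    machine, so no external servers are requested (unlike dask_gateway).
    """
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=processes,     # True: separate worker processes; False: threads
        dashboard_address=None,  # disable the diagnostics dashboard for this sketch
    ) as cluster:
        with Client(cluster) as client:
            futures = client.map(square, list(numbers))
            return sum(client.gather(futures))
```

In a notebook, calling sum_of_squares(range(10)) would run the work on worker processes of the user's own server. Nothing in it imports dask_gateway, so it should work the same on a "basehub" and on a "daskhub".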

auraoupa commented 1 year ago

Indeed, no dask_gateway, only dask.distributed.

consideRatio commented 1 year ago

@auraoupa the hub is now available at https://drakkar-demo.meom-ige.2i2c.cloud/

This hub is not exactly like https://meom-ige.2i2c.cloud/. I want to highlight some differences:

  1. Anything stored in the "shared" folder at drakkar-demo is separate from what you put in the shared folder at the other hub.
  2. Users are not set up with credentials to access a "scratch bucket".
  3. Users cannot start dedicated dask workers with dask_gateway, that is, via Python code involving import dask_gateway.

Does the hub at https://drakkar-demo.meom-ige.2i2c.cloud/ meet your needs for the drakkar-demo event?

auraoupa commented 1 year ago

Nice, the hub looks fine! I already have to change the Docker image, as I forgot some libraries in the previous one, but I will do it via the configuration in the control panel. Can I already use it to test my demo, or is there still work on your side before it is operational? Thanks @consideRatio, that was fast!

consideRatio commented 1 year ago

Excellent @auraoupa! You can absolutely use the configurator at https://drakkar-demo.meom-ige.2i2c.cloud/services/configurator/ to choose a new image to use, and you can use the hub as you wish already.

auraoupa commented 1 year ago

Hi @consideRatio, sorry to reopen this thread: I have a small issue when opening a dataset from the Pangeo catalog, which I do not have on the meom-ige cloud deployment (even with the same Docker image).

I try to open the data with:

from intake import open_catalog

# Open the Pangeo ocean catalog and lazily load the sea surface height dataset
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean.yaml")
ds = cat["sea_surface_height"].to_dask()

and after a long wait I get:

ValueError: Bad Request: b/pangeo-cmems-duacs
User project specified in the request is invalid.

Is there anything different between the two hubs that could explain this behaviour?

consideRatio commented 1 year ago

@auraoupa phew, I managed to resolve this on my third attempt! Thank you so much for providing not only a well-supported diagnosis but also a way for me to reproduce the issue and check whether I had resolved it!

auraoupa commented 1 year ago

Thank you so much for fixing this @consideRatio, I don't have to modify my demo for Monday then! One last request: is it possible to limit the server spawning options to only Large (16 CPU, 64 GB RAM) so that users don't choose the other ones?

consideRatio commented 1 year ago

Yes, I will get it done!
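Restricting the spawner choices to a single entry could be sketched as below. This is only an illustration under stated assumptions: it models the options as a KubeSpawner-style profile_list of dictionaries, the helper restrict_profiles is hypothetical, and the "Small" and "Medium" entries are invented for the example (only the Large option with 16 CPU and 64 GB RAM comes from this thread). The actual change in the 2i2c configuration may look different.

```python
# Hypothetical profile list in the shape KubeSpawner's profile_list uses.
# Only the "Large" entry reflects values mentioned in this thread.
PROFILES = [
    {"display_name": "Small", "kubespawner_override": {"cpu_limit": 4, "mem_limit": "16G"}},
    {"display_name": "Medium", "kubespawner_override": {"cpu_limit": 8, "mem_limit": "32G"}},
    {"display_name": "Large", "kubespawner_override": {"cpu_limit": 16, "mem_limit": "64G"}},
]


def restrict_profiles(profile_list, allowed_names):
    """Return only the profiles whose display_name is in allowed_names."""
    allowed = set(allowed_names)
    kept = [profile for profile in profile_list if profile["display_name"] in allowed]
    if not kept:
        # Refuse to produce an empty list, which would leave users with no option.
        raise ValueError(f"no profile matches {sorted(allowed)}")
    return kept


only_large = restrict_profiles(PROFILES, ["Large"])
```

With a single profile left, users have nothing else to pick when starting a server.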

auraoupa commented 1 year ago

Perfect! Thank you for all your help deploying this hub, I guess all is ready for it to run this Monday at 9am (CET) :smiley: Since it would be very early for you, is there someone based in Europe that I can contact in case something goes wrong?

consideRatio commented 1 year ago

@auraoupa yes, contact us via https://2i2c.freshdesk.com/support/home (or email support@2i2c.org).

I want to make sure that we do what we can to accommodate the users. While cloud providers can provide many machines, we must ask for the ability to start many servers, and they must approve it.

How many users will start and run servers at the same time? If they all are to be granted a 16 CPU machine, I suspect I must also request an increase in the allocated quota granted by the cloud provider. Using the same quota, you can fit twice as many users with 8 CPU nodes btw.
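The quota reasoning above can be checked with a few lines of arithmetic. This is a rough sketch that assumes one machine per user and a purely hypothetical CPU quota of 640; the real quota for this project is not stated in the thread.

```python
def cpus_needed(users, cpus_per_user):
    """Total CPUs required if every user gets a dedicated machine."""
    return users * cpus_per_user


def max_concurrent_users(cpu_quota, cpus_per_user):
    """Number of dedicated machines (one per user) that fit in a CPU quota."""
    return cpu_quota // cpus_per_user


HYPOTHETICAL_CPU_QUOTA = 640  # assumption for illustration only

for cpus in (16, 8):
    print(
        f"{cpus} CPU machines: 40 users need {cpus_needed(40, cpus)} CPUs, "
        f"quota fits {max_concurrent_users(HYPOTHETICAL_CPU_QUOTA, cpus)} users"
    )
```

Halving the machine size from 16 to 8 CPUs halves the CPUs needed for the same number of users, or equivalently doubles the number of users that fit in the same quota.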

auraoupa commented 1 year ago

OK, it will be 40 people maximum, so we can make it with 8 CPU nodes for sure. I can always ask them to pair up if that is too many at the same time.

consideRatio commented 1 year ago

@auraoupa I've checked the quotas, and it seems it should be fine even if you end up with 100 people!

I just learned that Google's cloud, which we use here, is far more generous with quotas than Amazon's cloud. So this wasn't really an issue.

colliand commented 1 year ago

Hi @auraoupa and @roxyboy! Should 2i2c decommission this hub since the event is now over? Please let us know.

roxyboy commented 1 year ago

Yes, I think this should be decommissioned.

damianavila commented 1 year ago

Decommissioned via https://github.com/2i2c-org/infrastructure/pull/2571.