2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Callysto core nodepool rightsizing #2005

Closed ianabc closed 1 year ago

ianabc commented 1 year ago

Context

The Callysto hub infrastructure is currently running on a pair of n1-highmem-4 (GCE) nodes, which are relatively expensive, and I was asked to check that these are the right size for the job. Poking around in our Grafana stats, it looks like they might be a bit underutilized. Would it make sense to resize these to n1-highmem-2 nodes? I think the default cluster config uses n1-highmem-2, with the caveat:

For single-tenant clusters, a single n1-highmem-2 node seems enough - if network policy and config connector are not on. For others, please experiment to see what fits.

but we override that in our current terraform config. Are there any reasons for us not to resize to n1-highmem-2 nodes?

If it is possible for us to do this, do we just propose the change to the terraform config? I guess applying the change would tear down our current cluster and redeploy it; that would be OK, but would the user data (Filestore volume) be preserved and re-attached?

Proposal

Explore the possibility of resizing the core node pool of the Callysto cluster. If it is feasible, redeploy.

yuvipanda commented 1 year ago

@ianabc we can try it out! However, I am curious if that is enough for the prometheus nodes.

ianabc commented 1 year ago

Thanks, I think we might give it a go and see what happens. My understanding is we would make the update in terraform and then open a PR, but I'm a little fuzzy on what will happen when we merge the PR. I think it has to destroy the existing core nodepool and create a new one, but can you confirm that it would leave the user storage intact? That is, when it adds the new core nodepool and spins things up, would existing users see their existing files? Or should we arrange to have them backed up somewhere and then reimported?

sgibson91 commented 1 year ago

@ianabc Users' home directories are stored in a Google Filestore instance that is mounted to the user servers via NFS. I don't foresee why terraform would delete the filestore just to replace one nodepool, so I think we are ok - but you never truly know until you run terraform plan!
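One way to make the terraform plan check less error-prone is to export the plan as JSON (terraform plan -out=plan.tfplan, then terraform show -json plan.tfplan > plan.json) and scan it for deletions of the storage resource. The following is only a minimal sketch: the helper name destructive_changes and the resource addresses in the sample are hypothetical, and it assumes the home directories are managed as a google_filestore_instance resource in the terraform config.

```python
import json

def destructive_changes(plan, protected_types=("google_filestore_instance",)):
    """Return addresses of protected resources that the plan would delete.

    `plan` is the dict parsed from `terraform show -json <planfile>`.
    A replacement is reported too, because its action list contains "delete".
    """
    flagged = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if rc.get("type") in protected_types and "delete" in actions:
            flagged.append(rc["address"])
    return flagged

# Hypothetical plan fragment: the core nodepool is replaced,
# while the Filestore instance is left untouched.
sample_plan = {
    "resource_changes": [
        {
            "address": "google_container_node_pool.core",  # assumed name
            "type": "google_container_node_pool",
            "change": {"actions": ["delete", "create"]},
        },
        {
            "address": "google_filestore_instance.homedirs",  # assumed name
            "type": "google_filestore_instance",
            "change": {"actions": ["no-op"]},
        },
    ]
}

print(destructive_changes(sample_plan))  # prints []

# With a real plan file:
# with open("plan.json") as f:
#     print(destructive_changes(json.load(f)))
```

An empty list suggests the plan leaves the Filestore instance alone; any address in the list is a reason to stop and read the full plan output before applying.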

consideRatio commented 1 year ago

Closing via: