2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

[support] Resize Filestore for latam.catalystproject.2i2c.cloud #4257

Closed by jmunroe 2 months ago

jmunroe commented 3 months ago

To support a short-term requirement for a community on latam.catalystproject.2i2c.cloud, I increased the Filestore size from 1TB to 6TB. I have since worked with the community to reduce their storage usage on that Filestore instance to a more manageable 942GB (by offloading data to persistent object storage).

I may be able to reduce it further as I work with individual users in this community. This cluster supports a total of ~10 hubs, so I think a Filestore allocation of 2TB is a reasonable tradeoff between minimizing cost and ensuring communities do not run out of space.
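
For reference, a rough sketch of how full each candidate allocation would be at the current 942GB of usage. It assumes 1TB = 1024GB, and `utilization_pct` is a hypothetical helper written for this comment, not part of any Filestore tooling.

```shell
#!/bin/sh
# Assumption: 1TB = 1024GB. The 942GB usage figure comes from this issue.

# Print used/capacity as a whole-number percentage, rounded to nearest.
utilization_pct() {
  used_gb=$1
  cap_gb=$2
  echo $(( (used_gb * 100 + cap_gb / 2) / cap_gb ))
}

for tb in 1 2 6; do
  cap=$(( tb * 1024 ))
  echo "${tb}TB: $(utilization_pct 942 "$cap")% used"
done
# 1TB: 92% used
# 2TB: 46% used
# 6TB: 15% used
```

At 2TB the instance would sit a little under half full, which leaves headroom without paying for the mostly empty 6TB allocation.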

Request: I realize now that reducing a Filestore allocation size is, unfortunately, not a trivial operation. Would @2i2c-org/engineering in its next iteration please:

  1. Create a new Filestore instance of 2TB
  2. Copy/sync the contents of the current Filestore instance to this new instance
  3. Switch all the hubs on latam.catalystproject.2i2c.cloud to this new Filestore
  4. Remove the old Filestore

(Or whatever procedure @2i2c-org/engineering recommends.)

### Tasks
- [ ] https://github.com/2i2c-org/infrastructure/issues/4334
- [ ] https://github.com/2i2c-org/infrastructure/issues/4335
- [ ] https://github.com/2i2c-org/infrastructure/issues/4336
- [ ] https://github.com/2i2c-org/infrastructure/issues/4337
- [ ] https://github.com/2i2c-org/infrastructure/issues/4368
- [ ] https://github.com/2i2c-org/infrastructure/issues/4338
- [ ] https://github.com/2i2c-org/infrastructure/issues/4369

haroldcampbell commented 3 months ago

This needs to be further refined by @yuvipanda

yuvipanda commented 3 months ago

The core issue is that we use the cheapest BASIC_HDD tier, which only allows increasing size, not decreasing size.
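
As a minimal sketch of that constraint: the check below encodes only the rule stated above (capacity can grow but not shrink). `can_resize_in_place` is a hypothetical helper, and the capacities assume 1TB = 1024GB.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 when the requested capacity can be
# applied to the existing instance, i.e. when it is not a decrease.
can_resize_in_place() {
  current_gb=$1
  requested_gb=$2
  [ "$requested_gb" -ge "$current_gb" ]
}

# Current allocation is 6TB, requested allocation is 2TB.
if can_resize_in_place 6144 2048; then
  echo "resize in place"
else
  echo "create a new instance and migrate the data"
fi
```

For this issue the check fails, which is why the migration steps are needed.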

So the steps here would be:

  1. Create a new Filestore instance
  2. Create a VM
  3. Attach the old and new Filestore instances to the VM
  4. Manually copy files over (probably via rclone: `sudo rclone sync --multi-thread-streams=12 --progress --links <src> <dst>`)
  5. Wait for all users on all the hubs in the cluster to not be around (or kick them all out)
  6. Do another round of rclone to finish the copying
  7. Fiddle with `terraform state rm` and `terraform import` to 'import' the new Filestore we created into terraform state
  8. Use the IP address of the new Filestore in all the hub config files, and redeploy them
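
The two rclone passes (steps 4 and 6) could be scripted roughly like this. It is a sketch only: the mount points `/mnt/filestore-old` and `/mnt/filestore-new` and the `DRY_RUN` switch are assumptions made for illustration; the rclone flags are the ones from step 4.

```shell
#!/bin/sh
set -eu

# Hypothetical mount points for the old and new Filestore shares on the VM.
SRC="${SRC:-/mnt/filestore-old}"
DST="${DST:-/mnt/filestore-new}"
# DRY_RUN=1 (the default) only prints commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print the given command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# One rclone pass from the old share to the new one.
sync_pass() {
  run sudo rclone sync --multi-thread-streams=12 --progress --links "$SRC" "$DST"
}

# First pass (step 4): bulk copy while users may still be active.
sync_pass
# Step 5 happens here: wait until no users are on any hub in the cluster.
# Second pass (step 6): picks up whatever changed since the first pass.
sync_pass
```

Because `rclone sync` makes the destination match the source, the second pass should only need to transfer files that changed after the first one, which keeps the window where users are locked out short.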

yuvipanda commented 3 months ago

Instead of fiddling around with terraform state (step 7 from above), we should do https://github.com/2i2c-org/infrastructure/issues/4334 instead.

sgibson91 commented 2 months ago

All the tasks are now complete! 🎉