det-lab / jupyterhub-deploy-kubernetes-jetstream

CDMS JupyterHub deployment on XSEDE Jetstream

Redeploying to Jetstream 2 [was: Stuck on "your server is starting up"] #62

Closed pibion closed 2 years ago

pibion commented 2 years ago

Spawning a Jupyter instance gets stuck on "your server is starting up" for several minutes and then fails.

rahmanole commented 2 years ago

[Screenshot 2022-03-03 at 17 53 53]

pibion commented 2 years ago

I did check and there are no other instances running (we get a different error when there aren't enough resources, so this is probably a moot point).

zonca commented 2 years ago

The cluster has been up for more than a year and it's really nice that kubernetes has a security feature 🤔 where the internal certificates expire after a year... So I will need some time to figure out how to properly renew the certificates. I will update here once I'm done.
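
To illustrate the expiry described above, here is a minimal sketch that reports how many days remain before a certificate expires. It assumes the certificate's `notAfter` string has already been read separately (for example with `openssl x509 -enddate`); the function name `days_until_expiry` and the date used are hypothetical and do not come from this cluster.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Return the number of days from `now` until the notAfter time.

    `not_after` uses the OpenSSL text format, e.g. "Jan  1 00:00:00 2023 GMT".
    A negative result means the certificate has already expired.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # ssl.cert_time_to_seconds converts the notAfter string to epoch seconds (UTC).
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


if __name__ == "__main__":
    # Hypothetical notAfter value, for illustration only.
    remaining = days_until_expiry("Jan  1 00:00:00 2023 GMT")
    status = "expired" if remaining < 0 else "valid"
    print(f"{status}: {remaining:.1f} days")
```

Running a check like this periodically would flag an approaching expiry before spawning starts to fail.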

zonca commented 2 years ago

Certificate renewal failed. A couple of months ago I also overwrote the kubespray cluster management files on my machine by mistake; that won't happen again, as I will now use separate directory trees. So I need to redeploy. Then I can restore files from backup.

Before redeploying, I asked the Jetstream team if Jetstream 2 is available. If it is, I would prefer to redeploy directly there.

zonca commented 2 years ago

@pibion Jetstream 2 is in early testing and they can let us in. Jetstream 1 will be operational until summer, but they will soon encourage transition.

pibion commented 2 years ago

Yes, let's move to Jetstream 2! If you could re-create the data volume on Jetstream 2 that would be ideal.

It's okay if that fails, though; we can re-populate if necessary.

zonca commented 2 years ago

ok, I requested the transfer to Jetstream 2. We will keep the allocation on Jetstream 1 for a few weeks, so I'll try to copy data directly from the old volume to the new one.

pibion commented 2 years ago

@zonca any success with the authentication on Jetstream 2?

zonca commented 2 years ago

Yes! I will try the deployment in the next few days.

pibion commented 2 years ago

@zonca that's great news! I had a student contact me today asking about our jetstream instance being down - he uses it often for code development. I was unusually happy to get a complaint since it means people are using the system!

zonca commented 2 years ago

ok, it's working on Jetstream 2.

I deployed it temporarily at https://supercdms.zonca.dev/ and will move it to the old URL in the next few days.

I haven't imported any backup yet.

zonca commented 2 years ago

We also have a newer version of JupyterHub (JupyterHub 1.5.0 20220317202953).

zonca commented 2 years ago

ok, I started the transfer of all the data from the data volume, it will take a few hours.

zonca commented 2 years ago

I think it is better if users first log in to the system; then whoever wants their old data restored can write to me, and I will copy the data from the backup into their new volumes.

pibion commented 2 years ago

Excellent, I'll have them ping you here if they'd like their backup restored!

zkromerUCD commented 2 years ago

@zonca Hi, Zonca - I would like my home directory restored - my username on jetstream is zkromer.

zonca commented 2 years ago

sure @zkromerUCD, can you please log in to the system first so the new volume is created?

zkromerUCD commented 2 years ago

ok, I can do that

zonca commented 2 years ago

The transfer of all the data from the data volume is complete: I transferred 318 GB. Note that the IP to access the data volume changed; see the repository.

pibion commented 2 years ago

@zonca great, I can see the expected files in /cvmfs. There is some weirdness with /cvmfs/data/CDMS/Soudan/DMC_V1-5_PhotoneutronSb/Raw/Raw, not sure if that's an issue with the setup or with how we created the directories.

zonca commented 2 years ago

not sure, I just copied everything with scp -r from one volume to the other
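
One possible explanation for a path like `Raw/Raw`, not confirmed in this thread, is that a recursive copy into a destination directory that already exists places the source directory inside it, producing a directory nested in a directory of the same name. The sketch below, with the hypothetical helper `find_repeated_dirs`, lists such nested directories under a given root so they can be inspected by hand; it does not move or delete anything.

```python
import os


def find_repeated_dirs(root):
    """Return sorted paths of directories whose name equals their parent's name.

    For example, .../Raw/Raw is reported, while .../Raw/Data is not.
    """
    hits = []
    for parent, dirnames, _filenames in os.walk(root):
        name = os.path.basename(parent)
        # A child directory with the same name as its parent is suspicious.
        if name in dirnames:
            hits.append(os.path.join(parent, name))
    return sorted(hits)


if __name__ == "__main__":
    # The root path is an assumption; adjust it to wherever the volume is mounted.
    for path in find_repeated_dirs("/cvmfs/data"):
        print(path)
```

Whether a reported directory is a copy artifact or was created that way on purpose still has to be judged from its contents.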

zonca commented 2 years ago

ok, I consider this completed, @pibion if there is any other issue please open a dedicated issue.

For restoring user data, let's use https://github.com/det-lab/jupyterhub-deploy-kubernetes-jetstream/issues/64 instead.

zonca commented 2 years ago

@pibion @zkromerUCD we are back at the original URL: https://supercdms.jetstream-cloud.org/