det-lab / jupyterhub-deploy-kubernetes-jetstream

CDMS JupyterHub deployment on XSEDE Jetstream

only provide the most recent kernel? #42

Open pibion opened 4 years ago

pibion commented 4 years ago

I think it might be preferable to list only the most recent CDMS kernel - the list of available kernels is getting pretty long!

What do you think @bloer, @zonca?

bloer commented 4 years ago

@pibion is there a way to access earlier kernels if they're not in the list? Part of the goal of CVMFS was to make sure that old releases are still around (a) for reproducibility and (b) so that your analysis won't break every time a new release comes out.

Since this is early days, it probably wouldn't hurt to trim the older kernels from the list. But eventually all the kernels should stick around forever, or very nearly so. Hopefully that will correspond to much slower release cycles, although I don't see that happening particularly soon.

zonca commented 4 years ago

Since the functionality to add kernels is provided by the CDMS software stack, it would be better to implement the pruning there as well: something like a prune_old_kernels command available in the path, which for now, let's say, keeps only the last 5 kernels and deletes the rest. In the future @bloer could change what this function does internally (for example, delete all kernels from before, say, June 2021), and nothing would change from the users' perspective.
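A minimal sketch of what such a prune_old_kernels helper could look like, written in Python using only the standard library. Everything here is an assumption for illustration rather than the actual CDMS implementation: the kernels directory is taken to be the default per-user Jupyter location on Linux, CDMS kernels are assumed to be identifiable by a hypothetical `cdms` name prefix, and "most recent" is approximated by directory modification time rather than by release version.

```python
import shutil
from pathlib import Path

# Assumption: per-user kernelspecs live in the default Linux location.
DEFAULT_KERNELS_DIR = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def select_kernels_to_prune(kernel_dirs, keep=5):
    """Return the directories to delete, keeping the `keep` newest by mtime."""
    newest_first = sorted(
        kernel_dirs, key=lambda d: d.stat().st_mtime, reverse=True
    )
    return newest_first[keep:]


def prune_old_kernels(kernels_dir=DEFAULT_KERNELS_DIR, keep=5,
                      prefix="cdms", dry_run=True):
    """Delete all but the `keep` newest kernels whose name starts with `prefix`.

    `prefix` is a hypothetical naming convention. With dry_run=True nothing
    is deleted; the names that would be removed are only reported.
    """
    kernels_dir = Path(kernels_dir)
    if not kernels_dir.is_dir():
        return []
    # A kernelspec is a directory containing a kernel.json file.
    candidates = [
        d for d in kernels_dir.iterdir()
        if d.is_dir()
        and d.name.startswith(prefix)
        and (d / "kernel.json").is_file()
    ]
    doomed = select_kernels_to_prune(candidates, keep=keep)
    if not dry_run:
        for d in doomed:
            shutil.rmtree(d)
    return sorted(d.name for d in doomed)


if __name__ == "__main__":
    for name in prune_old_kernels(dry_run=True):
        print(f"would remove kernel: {name}")
```

Keeping the selection rule in its own function means the policy (last 5, or everything before a cutoff date) could be swapped later without changing how users call the script.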

zonca commented 4 years ago

@bloer what do you think about adding a prune_old_kernels script to the CDMS software stack?

zonca commented 4 years ago

@bloer what do you think about my idea of adding a prune_old_kernels script to the CDMS software stack?

pibion commented 4 years ago

@zonca it turns out this would be useful for other deployment sites as well. The thing we're struggling with is that we'd like users to be able to access old versions for reproducibility purposes. So then the question becomes: what versions do we really need to keep?

zonca commented 4 years ago

@pibion since we have control of the prune_old_kernels script and can modify it later (maybe it could be an option of setup_cdms.sh?), we don't need to decide now which versions to keep.

pibion commented 4 years ago

@zonca I really like the idea of keeping the five most recent major releases. @bloer how does that sound to you?

bloer commented 4 years ago

We're not deleting anything from CVMFS, so users can always reinstall an older kernel if needed. setup_cdms.sh is already getting kind of clunky, so I don't really want to bolt on more functionality that's only loosely related. But we could stick it in a 'tools' folder or something right under /cvmfs/cdms.opensciencegrid.org/ so it's outside the release trees.

Can you remind me again how the kernel list is generated at XSEDE? At most of our sites, users have to install new kernels manually, but if I recall it's being done automatically at XSEDE, right? So the plan would be to also run the prune script automatically? Would a user wanting to stick with an older kernel have to manually reinstall it every session in that case?

zonca commented 4 years ago

We have an install_cdms_kernels script that you wrote, and it is baked into the image: https://github.com/zonca/docker-jupyter-cdms-light/blob/master/install_cdms_kernels

It has to be executed manually (I tried to automate it but failed, see #27), so I would make prune_cdms_kernels a script that is also run manually.

So maybe we should move install_cdms_kernels to CVMFS in a tools/ folder?

bloer commented 4 years ago

I'm working on this, but hit an annoying snag. The "right" way to do this is with the jupyter kernelspec command, but jupyter might not be available if you don't set up a specific release first.
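One possible way around the snag, sketched here under stated assumptions, is to inspect the kernelspec directories directly with the Python standard library when the jupyter command is not on the path. This is only a rough stand-in for jupyter kernelspec list: the search directories below are the usual Linux defaults plus the JUPYTER_DATA_DIR override, and a real installation may use others (for example entries from JUPYTER_PATH or an environment prefix).

```python
import json
import os
import shutil
from pathlib import Path


def default_kernel_dirs():
    """Usual Linux kernelspec locations (an assumption, not exhaustive)."""
    data_dir = os.environ.get("JUPYTER_DATA_DIR")
    user_dir = (Path(data_dir) if data_dir
                else Path.home() / ".local" / "share" / "jupyter")
    return [
        user_dir / "kernels",
        Path("/usr/local/share/jupyter/kernels"),
        Path("/usr/share/jupyter/kernels"),
    ]


def find_kernelspecs(search_dirs=None):
    """Map kernel name -> display name by reading each kernel.json."""
    if search_dirs is None:
        search_dirs = default_kernel_dirs()
    specs = {}
    for base in search_dirs:
        base = Path(base)
        if not base.is_dir():
            continue
        for spec_file in sorted(base.glob("*/kernel.json")):
            name = spec_file.parent.name
            if name in specs:
                continue  # directories listed earlier take precedence
            try:
                spec = json.loads(spec_file.read_text())
            except (OSError, ValueError):
                continue  # unreadable or malformed kernel.json: skip it
            if isinstance(spec, dict):
                specs[name] = spec.get("display_name", name)
    return specs


if __name__ == "__main__":
    if shutil.which("jupyter") is None:
        print("jupyter is not on PATH; falling back to a directory scan")
    for name, display in sorted(find_kernelspecs().items()):
        print(f"{name}\t{display}")
```

Because this only reads and lists directories, it could live outside any release tree and would not depend on a particular release being set up first.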

zonca commented 4 years ago

@bloer what about having a minimal miniconda env just to handle this operation?

zonca commented 3 years ago

@bloer would you have any update on this?

zonca commented 3 years ago

@bloer I am reviewing the old issues, any suggestions about this? I was thinking about a minimal miniconda env just dedicated to handling kernelspecs