zonca / jupyterhub-deploy-kubernetes-jetstream

Configuration files for my tutorials on deploying JupyterHub on top of Kubernetes on XSEDE Jetstream (Openstack)
https://zonca.dev/categories/#jetstream

Proposed Shift to Minimalist Linux Distro for K8S JupyterHub VMs #68

Closed julienchastang closed 2 months ago

julienchastang commented 11 months ago

During a recent review of the nodes in our K8S JupyterHub clusters, I identified some areas where we could potentially optimize performance. Our current Ubuntu VMs appear to carry excess services and software. Notably, services like CUPS (used for printing) and unused packages such as openjdk seem to be using more disk space and processing power than we actually need.

Considering this, I'd like to put forth a suggestion for us to discuss: How about we explore the possibility of creating a “Featured” VM based on a more minimalist Linux distribution, perhaps something like Debian Netinstall? I'm not entirely versed in all Linux distros, so I'm open to suggestions. The aim here would be to more precisely tailor our setup to the specific requirements of a JupyterHub K8s node, thereby reducing any unnecessary load.

cc @ana-v-espinoza @m1schmidt @jlf599

zonca commented 11 months ago

I think it is a good idea.

here is the list of OS supported by Kubespray:

Flatcar Container Linux by Kinvolk

Debian Bullseye, Buster, Jessie, Stretch

Ubuntu 16.04, 18.04, 20.04, 22.04

CentOS/RHEL 7, 8, 9

Fedora 35, 36

Fedora CoreOS

openSUSE Leap 15.x/Tumbleweed

Oracle Linux 7, 8, 9

Alma Linux 8, 9

Rocky Linux 8, 9

Kylin Linux Advanced Server V10

Amazon Linux 2

https://kubernetes.io/docs/setup/production-environment/tools/kubespray/

julienchastang commented 11 months ago

Flatcar Container Linux's emphasis on security, immutability, and container-based cloud infrastructure is intriguing. I'm curious about the package installation process on such an immutable system and whether it would suit our needs. However, exploring Flatcar could be a worthwhile experiment for us to consider.

julienchastang commented 10 months ago

To give a bit more context, I've been working on this, which removes:

ubuntu-desktop-minimal metapackage
 GNOME
 OpenJDK
  - openjdk-11-jdk-headless
  - openjdk-11-jre-headless
 Emacs
  - emacs-gtk
  - emacs-common
 Firefox
 Git
 snapd

Saving > 1GB of disk space on each JupyterHub node VM

and turns off

 CUPS (for printing!?)
 ModemManager 
 Apport 
 avahi-daemon 
 openvpn 
 kerneloops 
 whoopsie 
 snapd 
 software.automount 

In short, there is really a lot of bloat on these JS2 Ubuntu VMs. Something tailored to JupyterHub nodes could be something to consider and I imagine could benefit a fair number of JS2 users, not just Unidata.
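For readers who want to try something similar, here is a minimal dry-run sketch in Python of the cleanup described above. It is not the actual recipe referenced in this comment. It only builds and prints the `apt-get purge` and `systemctl disable --now` commands without running them. The package list uses only the names spelled out above (GNOME, Firefox, and Git are left out because their exact package names are not given), and the systemd unit names are assumptions derived from the service names in the list, so check them on your own VM first.

```python
import shlex

# Package names copied from the list above. GNOME, Firefox and Git are omitted
# because the thread does not give their exact package names.
PACKAGES_TO_PURGE = [
    "ubuntu-desktop-minimal",
    "openjdk-11-jdk-headless",
    "openjdk-11-jre-headless",
    "emacs-gtk",
    "emacs-common",
    "snapd",
]

# Unit names are ASSUMED from the service names in the list above;
# verify with `systemctl list-unit-files` before using them.
UNITS_TO_DISABLE = [
    "cups",
    "ModemManager",
    "apport",
    "avahi-daemon",
    "openvpn",
    "kerneloops",
    "whoopsie",
    "snapd",
    "software.automount",
]


def purge_command(packages):
    """Build a single apt-get command that purges all given packages."""
    return ["apt-get", "purge", "--yes", *packages]


def disable_commands(units):
    """Build one systemctl command per unit that stops and disables it."""
    return [["systemctl", "disable", "--now", unit] for unit in units]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    print(shlex.join(purge_command(PACKAGES_TO_PURGE)))
    for command in disable_commands(UNITS_TO_DISABLE):
        print(shlex.join(command))
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be reviewed and then pasted into a root shell or translated into Ansible tasks.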

jlf599 commented 10 months ago

A lot of the things here are compromises to make the installation of desktop bits (which a lot of users want) suck a bit less during launch.

We now use an image pipeline for weekly builds so making a minimal install may be possible, but we'd also have to look at dependencies and making sure things don't get broken.

Feel free to open an issue here with this information: https://gitlab.com/jetstream-cloud/image-build-pipeline/-/issues

and the folks doing dev work (which sometimes includes me) might address some of it.

zonca commented 10 months ago

@julienchastang I think it would be useful to open an issue in that repository and see if they would be interested in adding a new image Ubuntu-22-minimal. Julien, with your experience in removing extra packages from Ubuntu-22, you could go through the ansible recipes in that repository, make a copy of Ubuntu-22, remove all steps where extra packages are installed, and also copy some tasks from your ansible recipes to remove other pre-installed packages. Of course only after having approval from them that this is a useful image to maintain long-term.

julienchastang commented 10 months ago

I still aim to open that JS2 issue, but have not gotten around to it yet. @jlf599 As an aside, one of the things I do to remove bloat from the JS2 Ubuntu VMs is umount /software, but then logging in to such a VM yields:

Lmod Warning:  MODULEPATH is undefined.

Lmod has detected the following error:  The following module(s) are unknown:
"xalt"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "xalt"

Also make sure that all modulefiles written in TCL start with the string #%Module

It appears there are dependencies on /software packages on these JS2 Ubuntu VMs. I don't remember that from the past, but I may be misremembering.

jlf599 commented 10 months ago

The lmod package gets installed as part of the build process -- path info is set in /etc/lmod/modulespath

Probably should remove lmod if you don't need it.
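If you want to keep lmod but see which configured module directories disappeared after unmounting /software, a quick check along these lines may help. This is only a sketch: it assumes /etc/lmod/modulespath lists one directory per line with optional `#` comments, which is not confirmed in this thread, and the function name `missing_module_dirs` is made up for illustration.

```python
from pathlib import Path


def missing_module_dirs(modulespath_file):
    """Return the directories listed in the file that do not exist on disk.

    ASSUMPTION: the file holds one directory per line, and anything after
    a '#' is a comment.
    """
    missing = []
    for raw_line in Path(modulespath_file).read_text().splitlines():
        entry = raw_line.split("#", 1)[0].strip()
        if entry and not Path(entry).is_dir():
            missing.append(entry)
    return missing


if __name__ == "__main__":
    config = Path("/etc/lmod/modulespath")  # path mentioned in the comment above
    if config.exists():
        for directory in missing_module_dirs(config):
            print(f"missing module directory: {directory}")
    else:
        print(f"{config} not found; lmod may not be installed")
```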

The xalt error is new-ish because we're starting to track software launches from /software -- it defaults to on. That is a behavior change.

Also, to keep in mind, the "bloat" is what most users are looking for. You have a very different use case. :)

Not to say others might not benefit from a stripped down version, but a goodly number of our users are interactive users which is why that has been our primary focus. With the build pipeline, making a cleaner version gets easier.

If you're interested in a Debian version, we'd talked internally about a Debian services/stripped down version. But no worries if not.

zonca commented 9 months ago

thanks @jlf599, I think a Debian image would work very well for our use case. If you have any beta-quality image you would like to share, we would be happy to test it!

jlf599 commented 9 months ago

There's no version of this available yet. It's just something we had discussed. I'll let you know when this moves forward.

julienchastang commented 9 months ago

OK, I finally got around to opening an issue here: https://gitlab.com/jetstream-cloud/image-build-pipeline/-/issues/33. Sorry for the long delay.

zonca commented 6 months ago

let's follow up on https://gitlab.com/jetstream-cloud/image-build-pipeline/-/issues/33, however I'll keep the issue open for tracking purposes.

Latest development is Flatcar Linux: https://www.zonca.dev/posts/2024-04-30-flatcar-image-jetstream

julienchastang commented 2 months ago

I think we can close this issue now that we have Featured-Minimal-Ubuntu24 on JS2.