jupyterhub / systemdspawner

Spawn JupyterHub single-user notebook servers with systemd
BSD 3-Clause "New" or "Revised" License
92 stars 45 forks source link

Transition towards passing through config to Systemd spawned services without providing dedicated config? #128

Open consideRatio opened 1 year ago

consideRatio commented 1 year ago

Learning more about systemd, I understand that we may want to expose a lot of configuration that could just go straight to how we configure the started systemd transient service. If its just passthrough configuration, should this project then provide a dedicated python traitlets based config option for it that we need to maintain? We do so for a few things now, but should we keep adding config to pass through systemd config?

Consider for example, in #125 we have a proposal to add SystemdSpawner.cpu_weight which maps directly to configuring CPUWeight with associated docs, and we provide similar docs as Systemd does.

In the jupyterhub distribution https://github.com/jupyterhub/zero-to-jupyterhub-k8s, we have had a similar situations that I found to be unsustainable. We provided the same config options for the Helm chart the project represents to configure the underlying GitLabOAuthenticator class for example. As soon as there was something new to configure, and a user wanted it, we had to expose it as a Helm chart config option. We decided that we simply shouldn't try to provide a 1:1 mapping of chart config, but instead provide a way to passthrough config, so that no matter what new functionality was introduced in some software dependency, it could be configured without the Helm chart knowing about it.

SystemdSpawner is in a similar situation I think, where Systemd has a lot of config options, and SystemdSpawner creates python based config to map to it.

I suggest we overview the existing config, and start ensuring users become confident on how to configure the spawned Systemd services with passthrough config directly, instead of relying on SystemdSpawner to be aware of all options provided by Systemd as it changes.

ticoneva commented 1 year ago

In general, I agree that adding more systemd options to the spawner is not a sustainable solution. I am not sure why the base jupyterhub/spawner has cpu_limit and mem_limit set as unimplemented traitlet options---perhaps it's because they both involve changing two properties? If that's the reason, then cpu_weight is in the same situation.

I personally find cpu_limit of very llimited use in comparison to cpu_weight---because affected processes are unaware of the limit, multithread performance is generally very poor. It just feels weird to me that the former is a traitlet option when the latter is not.

behrmann commented 1 year ago

Yes, I too think that mirroring all options is unsustainable. It's also quite limiting. To be honest I've never used vanilla systemd-spawner in production, but always my private forks, which you can find in two PRs, the most recent one being #100. The general reasoning behind that is that I don't want jupyterhub to be running as root, which it needs to be (or at least need to be for all intents and purposes) to use systemd-run, which is makes the spawner just a dispatcher for a single type of service that then can use whatever systemd offers.

That being said in the context of using systemd-run, it would probably make most sense to deprecate all options and just have a config dict of systemd unit properties and values and use that.

manics commented 1 year ago

I agree with moving to pass-through configuration where possible. I think we could also make the case that pass-though configuration should override the traitlets configuration so that configuration doesn't have to be split across two interfaces.

@ticoneva Many of the existing traitlets in the parent class are based on what's generically useful across multiple spawners. CPU/memory limits are supporting in many environments such as containers, HPC, local systems without systemd, and are easy to understand.

yuvipanda commented 1 year ago

I'm generally in favor. https://github.com/jupyterhub/kubespawner/issues/248 is the relavant issue in kubespawner.

As the person who made this choice in both KubeSpawner and SystemdSpawner initially, I want to apologize for making this particular choice. My intent was that folks should be able to use KubeSpawner or SystemdSpawner without having to fully understand kubernetes or systemd. However, that is a function of JupyterHub distributions (like z2jh or TLJH), which did not exist at that time. Hindsight 20/20, etc.