default value for $HOME when adding the application user

dmarth commented 5 years ago

The base image (centos/python-36-centos7) uses /opt/app-root/etc/passwd instead of the system's /etc/passwd file. When this file is updated to contain the default application user (by sourcing /opt/app-root/etc/generate_container_user), the $HOME environment variable is used the specify the home directory: https://github.com/sclorg/s2i-python-container/blob/1ff8621371c9c6c39cadea5e18cf0b6a2275a949/3.6/root/opt/app-root/etc/generate_container_user#L12

Some scripts (for example cull-idle-servers) indirectly source this file: https://github.com/jupyter-on-openshift/jupyterhub-quickstart/blob/e5c38ddea76c8f5689c776bdc898e037a5b2e66a/.s2i/bin/assemble#L56 https://github.com/jupyter-on-openshift/jupyterhub-quickstart/blob/e5c38ddea76c8f5689c776bdc898e037a5b2e66a/scripts/cull-idle-servers#L3

However, JupyterHub services are started in a rather minimal environment: https://github.com/jupyterhub/jupyterhub/blob/e4d4e059bd6ecc749a6276a80eada8f0de8ad206/jupyterhub/services/service.py#L317

This results in $HOME not being set and the user having no home directory after the service is executed. For applications relying on the passwd file to get the user's home directory afterwards, this can lead to problems.

As an example, ssh defaults to the root directory to create the .ssh directory in case the home directory is not set: https://github.com/openssh/openssh-portable/blob/4d28fa78abce2890e136281950633fae2066cc29/ssh.c#L1427 Non-root users have no write permissions there and the execution fails.

With this merge request I make sure that $HOME is set in the /opt/app-root/etc/scl_enable script before sourcing /opt/app-root/etc/generate_container_user. Please let me know in case there is a better place to add this.

GrahamDumpleton commented 5 years ago

What is the service you are trying to add?

The stripping of most environment variables by JupyterHub is a pain for other reasons. Usually one deals with it by setting environment in the service definition to pass through variables from the JupyterHub parent process environment. For example:

    c.JupyterHub.services.extend([
        {
            'name': 'delete-projects',
            'command': delete_projects_args,
            'environment': dict(
                PYTHONUNBUFFERED='1',
                APPLICATION_NAME=application_name,
                KUBERNETES_SERVICE_HOST=os.environ['KUBERNETES_SERVICE_HOST'],
                KUBERNETES_SERVICE_PORT=os.environ['KUBERNETES_SERVICE_PORT']
            ),
        }
    ])

Even so, with your change you would still be needing to execute scl_enable in any script you are starting up to fix up the view of the passwd file and HOME, like in delete-projects.sh, so you could also set HOME there.

#!/bin/bash

source /opt/app-root/etc/scl_enable

exec python /opt/app-root/src/scripts/delete-projects.py

Anyway, am not really inclined to add setting of HOME in scl_enable. If anything would prefer to completely scrap how the s2i-python-container base image handles that and do it in a completely different way which doesn't use libnsswrapper, which fails in various ways.

So in pursuit of fixing this in an entirely different way, can you explain a bit more about how you are using the JupyterHub service feature so I understand what you are using it for, and I can think about whether the different changes I have in mind would work with it.

dmarth commented 5 years ago

Actually I am not adding a custom service at all, I am only using an unmodified version of the cull-idle-servers service.

When a container is started, I clone/pull a git repository via ssh in the spawner and copy it to a volume shared between the hub and the notebook. However, at this point the git command fails because the cull-idle-servers service was triggered right after the hub was started and sourced the user generation without home directory. So the problem is not a broken service but a service breaking functionality in the spawner.

Passing the environment variable to the service from the hub is a workaround I did not think about, thanks for pointing this out!

GrahamDumpleton commented 5 years ago

I am a bit confused how cull-idle-servers come into this if you are using an unmodified version of it, as it can't be adapted to clone/pull a git repository so not sure how it is related. Where exactly is the ssh triggered/run and the git clone done?

FWIW, if the alternate fix for passwd file is done that I am thinking of, it will mean that it will be impossible to use S2I to build the jupyterhub-quickstart image. Thus build-configs/jupyterhub.json would flick to a docker type build. Am not sure this would cause a big issue as these days should be using the pre-built image from quay.io anyway. That an S2I build was being used was only to allow building of it from repo on clusters such as OpenShift Online where a docker type build is prohibited. That was mainly for me, but I now use other clusters with no such restriction and also rely on quay.io to build images for real use anyway.

dmarth commented 5 years ago

The git commands are not executed in the cull-idle-servers service but in the start method of the spawner (I extend kubespawner). In general the service should not have any influence on the spawner, but the change of the passwd file triggered by the cull-idle-servers service has impact on the whole container.

The timeline of events is as follows:

The container is started, $HOME is set to /opt/app-root/src: https://github.com/sclorg/s2i-base-container/blob/7a7a80ef5eee3ff5112c94094b23d87142bc08b8/core/Dockerfile#L27
At some point during the startup process the generate_container_user script is sourced correctly, I am not sure where exactly this happens, but several Bash-related environment variables are set to /opt/app-root/etc/scl_enable. The passwd file now contains the user with the correct home directory.
JupyterHub is started, everything is still fine.
cull-idle-servers is started with a minimal environment (without $HOME) and sources the generate_container_user script. This effectively removes the home directory of the user in the passwd file.
A user attempts to spawn a container. The spawner tries to execute some git/ssh commands during the container setup, but this fails because of the missing home directory.

GrahamDumpleton commented 5 years ago

Okay. I understand now.

So there are a few options as to how to fix it.

The first is to add HOME to environment variables passed through to cull-idle-servers so that generate_container_user does wipe the existing home directory in /tmp/passwd.

The second is to not even invoke generate_container_user from cull-idle-servers and instead pass in environment to it HOME, LD_PRELOAD and the nsswrapper environment variables.

A third is to ditch use of nsswrapper and use writable /etc/passwd and entry point which fixes it up as soon as the container starts. There are extra protections which need to be added to deal with root escalation if container image used outside of OpenShift for this case.

This third one is sort of the best, but as already mentioned means can't build as S2I anymore, which as said I don't think is a big issue.

In doing the third option I would be refactoring a lot of the startup sequence to introduce hooking points where you could run scripts on startup of the container. It sounds like these hook points would be a better place to do your setup than having to override kubespawner. Although, personally, it sounds like what you are doing would be better off done using an init container in the same deployment, rather than trying to customise the JupyterHub spawner. Have you looked at using an init container?

dmarth commented 5 years ago

I will go for the first option for now:

c.JupyterHub.services = [
    {
        # ...
        'environment': {
            "HOME": os.environ["HOME"]
        }
    }
]

I did not yet look into init containers, but I will definitely check this out.

Thanks a lot for your support! Of course feel free to just close this pull request, there are for sure better ways to work around this.

jupyter-on-openshift / jupyterhub-quickstart

default value for $HOME when adding the application user #21