Closed ckunki closed 5 months ago
On host:
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Feb 5 13:53 /var/run/docker.sock
Mounting the `docker.sock`:
$ docker run -d \
--volume /var/run/docker.sock:/var/run/docker.sock \
exasol/ai-lab:9.9.9
Inside container:
# ls -l /var/run/docker.sock
srw-rw---- 1 root 118 0 Feb 5 13:53 /var/run/docker.sock
The socket is owned by `root` with group ID 118.
Error message:
ERROR luigi-interface.PrepareDockerNetworkForTestEnvironment:task_logger_wrapper.py:25 PrepareDockerNetworkForTestEnvironment_9cbf939b25(job_id=2024_03_01_14_16_35_2_SpawnTestEnvironmentWithDockerDB, no_cache=False, environment_name=DemoDb, network_name=db_network_DemoDb, attempt=0): Error during removing container db_network_DemoDb: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
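The `Permission denied` follows from the usual POSIX permission check: with mode `srw-rw----`, only the owner and members of the owning group may read and write the socket. A minimal sketch of that check, assuming a hypothetical uid/gid of 1000 for user `jupyter` (the real IDs are not given in this issue); the function name is illustrative only:

```python
import stat
from typing import Iterable


def has_rw_access(mode: int, owner_uid: int, owner_gid: int,
                  uid: int, gids: Iterable[int]) -> bool:
    """Simplified POSIX check: may a process with (uid, gids) read and write the file?"""
    if uid == 0:
        return True  # root bypasses the mode bits (capabilities are ignored here)
    if uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR  # owner class applies
    elif owner_gid in set(gids):
        needed = stat.S_IRGRP | stat.S_IWGRP  # group class applies
    else:
        needed = stat.S_IROTH | stat.S_IWOTH  # "other" class applies
    return (mode & needed) == needed


# Socket as seen inside the container: mode 0660, owner root (0), group 118.
# uid/gid 1000 for jupyter is an assumption for illustration.
print(has_rw_access(0o660, 0, 118, uid=1000, gids=[1000]))       # False
print(has_rw_access(0o660, 0, 118, uid=1000, gids=[1000, 118]))  # True
```

As long as the process is neither `root` nor a member of group 118, it falls into the "other" class, which has no permissions on the socket.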
The `docker.sock` is mounted only when running the Docker container.
At this late point in time, Ansible has long since finished and the entry point of the Docker container has already been started.
Consequently, we can no longer change the owner or permissions of `docker.sock` via Ansible.
Instead, we will try to run the Docker container as `root` and use user `jupyter` only for running the Jupyter server.
Tasks:
- Look up the IDs of user `jupyter` and group `jupyter`: `pwd.getpwnam(self.name).pw_uid`
- In `subprocess.Popen()`:
  - switch to user `jupyter`: `os.setresuid(uid, uid, uid)`
  - set `HOME` to make jupyter create its files there
- Check whether `/var/run/docker.sock` exists and, if so, change its owner to allow user `jupyter` to access it: `os.chown(path, self.id, unchanged_gid)`
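The socket-related task could be sketched as follows. This is only an illustration under assumptions: the function name `grant_socket_access` is hypothetical, and passing `-1` as the group ID is how `os.chown` leaves the group unchanged (the `unchanged_gid` mentioned in the task list).

```python
import os
from pathlib import Path

UNCHANGED = -1  # os.chown leaves an ID untouched when it is -1


def grant_socket_access(path: Path, uid: int) -> bool:
    """Make `uid` the owner of `path` if it exists; return whether a chown was done."""
    if not path.exists():
        return False  # socket was not mounted into the container
    os.chown(path, uid, UNCHANGED)  # change the owner only, keep the group
    return True
```

In the entry point this would be called as `root` before starting the Jupyter server, for example `grant_socket_access(Path("/var/run/docker.sock"), jupyter_uid)`, where `jupyter_uid` stands for the ID obtained via `pwd.getpwnam()`.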
I verified that `os.chown("/var/run/docker.sock")` inside the Docker container does not affect the owner of the original file in the host's file system that is mounted into the Docker container with option `-v <host>:<in-container>`.
Results of `env` in the Docker container for user `root`:
Rating | Variable | Value |
---|---|---|
✅ (1) | DEBIAN_FRONTEND | noninteractive |
✅ (1) | _ | /usr/bin/env |
✅ (1) | HOSTNAME | 1a048a416c53 |
✅ (1) | PWD | / |
✅ (1) | LS_COLORS | ... |
✅ (1) | TERM | xterm |
✅ (1) | SHLVL | 1 |
❓ | PATH | /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin |
✅ (2) | VIRTUAL_ENV | /home/jupyter/jupyterenv |
✅ (2) | NOTEBOOK_FOLDER_FINAL | /home/jupyter/notebooks |
✅ (2) | NOTEBOOK_FOLDER_INITIAL | /home/jupyter/notebook-defaults |
✅ (3) | HOME | /root |
- `create_image.py`
- `entrypoint.py`, before `subprocess.Popen()`
Running entrypoint yields
os.setresgid(gid, gid, gid) PermissionError: [Errno 1] Operation not permitted
I assume that after calling `os.setresuid()` the process has given up its root privileges and is therefore no longer permitted to call `os.setresgid()`.
Changing the order to the following helped:
os.setgroups([self.docker_group.id])
os.setresgid(gid, gid, gid)
os.setresuid(uid, uid, uid)
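Put together, the working order can be sketched as a `preexec_fn` for `subprocess.Popen()`. This is a minimal sketch, not the actual `entrypoint.py`: the names `make_demote` and `run_as` are hypothetical, and setting `HOME` from the passwd entry is an assumption based on the task list above.

```python
import os
import pwd
import subprocess
from typing import Callable, List, Optional


def make_demote(uid: int, gid: int,
                extra_groups: Optional[List[int]] = None) -> Callable[[], None]:
    """Return a function that drops privileges: groups first, user last."""
    groups = list(extra_groups or [])

    def demote() -> None:
        # Group changes require root privileges, so they must happen first.
        os.setgroups(groups)
        os.setresgid(gid, gid, gid)
        # Give up the root UID last; afterwards the two calls above would fail.
        os.setresuid(uid, uid, uid)

    return demote


def run_as(user_name: str, command: List[str],
           extra_groups: Optional[List[int]] = None) -> subprocess.Popen:
    """Start `command` as `user_name`, with HOME pointing to that user's home."""
    user = pwd.getpwnam(user_name)
    env = dict(os.environ, HOME=user.pw_dir)
    demote = make_demote(user.pw_uid, user.pw_gid, extra_groups)
    return subprocess.Popen(command, env=env, preexec_fn=demote)
```

Since Python 3.9, `subprocess.Popen()` also accepts `user`, `group` and `extra_groups` arguments, which may be an alternative to a hand-written `preexec_fn`.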
It turned out that at least in the CI build the code running inside the Docker container did modify the (group) owner of the original file on the host system.
Updated plan for fixing the notebook tests:
- `entrypoint.py` to `stat` `docker.sock` for `rw` permissions
- `groupmod -g <gid> docker`
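A rough sketch of how these two steps might fit together, assuming the intent is to align the container's `docker` group with the group ID of the mounted socket. The function name `groupmod_command` is hypothetical; the sketch only builds the command and leaves running it (as `root`) to the caller.

```python
import os
import stat
from typing import List, Optional


def groupmod_command(socket_path: str,
                     group_name: str = "docker") -> Optional[List[str]]:
    """Return a groupmod command matching `group_name` to the socket's gid, or None."""
    try:
        info = os.stat(socket_path)
    except FileNotFoundError:
        return None  # socket is not mounted, nothing to do
    group_rw = stat.S_IRGRP | stat.S_IWGRP
    if (info.st_mode & group_rw) != group_rw:
        return None  # group membership would not grant read/write access
    return ["groupmod", "-g", str(info.st_gid), group_name]
```

The result could be passed to `subprocess.run(cmd, check=True)`. Note that `groupmod -g` refuses a group ID that is already used by another group unless `-o` is given, so that case needs separate handling.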
The last approach finally seemed to be successful.
Additional integration tests for gid existing / non-existing group are tracked by ticket
In ticket #66 the user for running jupyter was changed from `root` to `jupyter` with reduced permissions. Running the notebook tests later on revealed failures: https://github.com/exasol/ai-lab/actions/runs/8112297755/job/22174172457
Additionally,