indico / indico-containers

Containerization of Indico

Permissions issues on start #57

Open lawrenceleejr opened 3 weeks ago

lawrenceleejr commented 3 weeks ago

Thanks for this project -- I'm hoping it'll make setting up and maintaining a local indico instance more streamlined.

I'm on the current HEAD at 14ee107261fb11b7695e5f4eb119aab59c4391ef on a fresh install of AlmaLinux 9, using a standard Docker install. When I follow the instructions in the README and get to docker compose up, everything installs successfully, but when the services start they run into permission errors, e.g.:

indico-celery-1       | Traceback (most recent call last):
indico-celery-1       |   File "/usr/local/lib/python3.12/logging/config.py", line 608, in configure
indico-celery-1       |     handler = self.configure_handler(handlers[name])
indico-celery-1       |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
indico-celery-1       |   File "/usr/local/lib/python3.12/logging/config.py", line 876, in configure_handler
indico-celery-1       |     result = factory(**kwargs)
indico-celery-1       |              ^^^^^^^^^^^^^^^^^
indico-celery-1       |   File "/usr/local/lib/python3.12/logging/__init__.py", line 1231, in __init__
indico-celery-1       |     StreamHandler.__init__(self, self._open())
indico-celery-1       |                                  ^^^^^^^^^^^^
indico-celery-1       |   File "/usr/local/lib/python3.12/logging/__init__.py", line 1263, in _open
indico-celery-1       |     return open_func(self.baseFilename, self.mode,
indico-celery-1       |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
indico-celery-1       | PermissionError: [Errno 13] Permission denied: '/opt/indico/log/celery.log'

with similar errors from indico-celery-beat-1 and indico-web-1. indico-celery-beat-1 and indico-celery-1 then both exit. Digging around, it looks like /opt/indico/log/ in these containers is owned by root, so it's not surprising the user indico can't write to it. How can I deal with this, and why does this seem to work for others if these are permissions within the containers?
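The traceback shows Python's logging file handler failing to open /opt/indico/log/celery.log. A minimal sketch of a pre-flight check that could be run inside the container as the indico user, before the services start, to see whether the log directory is writable at all. The function name `check_log_dir` is hypothetical and not part of Indico or this repository.

```python
import os


def check_log_dir(path: str) -> str:
    """Return 'ok', 'missing', or 'not-writable' for the given directory.

    Hypothetical helper for illustration; not part of Indico.
    """
    if not os.path.isdir(path):
        return "missing"
    # Creating a file needs both write and search (execute) permission
    # on the directory for the current user.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not-writable"
    return "ok"


if __name__ == "__main__":
    # Path taken from the PermissionError in the traceback.
    print(check_log_dir("/opt/indico/log"))
```

If this reports not-writable for the indico user, the logging setup will fail the same way regardless of which service starts first.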

The full log is here: output.log

Thanks! -Larry

tomasr8 commented 3 weeks ago

Hi!

/opt/indico should be owned by the indico user:

https://github.com/indico/indico-containers/blob/14ee107261fb11b7695e5f4eb119aab59c4391ef/indico-prod/worker/Dockerfile#L29

Could you try clearing out all containers and volumes and trying again with docker compose up?

lawrenceleejr commented 2 weeks ago

Thanks @tomasr8! I just started from scratch and saw the same behavior. I took some notes that I hope are helpful.

docker rm -vf $(docker ps -aq)
docker rmi -f $(docker images -aq)

And confirm that nothing is left over with docker ps -a.

[leejr@physastro indico-prod]$ docker exec prod-indico-web-1 ls -al /opt
total 0
drwxr-xr-x. 1 root   root   20 Oct 11 13:39 .
drwxr-xr-x. 1 root   root   17 Oct 16 15:56 ..
drwxr-xr-x. 1 indico indico 31 Oct 16 15:56 indico
[leejr@physastro indico-prod]$ docker exec prod-indico-web-1 ls -al /opt/indico
total 28
drwxr-xr-x. 1 indico indico   31 Oct 16 15:56 .
drwxr-xr-x. 1 root   root     20 Oct 11 13:39 ..
-rw-r--r--. 1 indico indico  220 Mar 29  2024 .bash_logout
-rw-r--r--. 1 indico indico 3526 Mar 29  2024 .bashrc
drwxr-xr-x. 1 indico indico   17 Oct 11 13:41 .cache
-rw-r--r--. 1 indico indico  807 Mar 29  2024 .profile
drwxr-xr-x. 1 indico indico   41 Oct 11 13:42 .venv
drwxr-x---. 2 indico indico    6 Oct 11 13:41 archive
drwxr-x---. 2 indico indico    6 Oct 11 13:41 cache
drwxr-xr-x. 2 root   root      6 Aug 29 14:36 custom
-rwxr-xr-x. 1 indico indico   62 Oct 11 13:37 docker_entrypoint.sh
drwxr-x---. 1 indico indico   45 Oct 16 15:56 etc
-rw-r--r--. 1 indico indico  281 Oct 11 13:43 indico.wsgi
drwxrwxr-x. 2 root   root     59 Aug 29 14:45 log
-rwxr-xr-x. 1 indico indico  288 Oct 11 13:37 run_celery.sh
-rwxr-xr-x. 1 indico indico  581 Oct 11 13:37 run_indico.sh
lrwxrwxrwx. 1 indico indico   64 Oct 11 13:43 static -> /opt/indico/.venv/lib/python3.12/site-packages/indico/web/static
drwxrwxrwx. 2 root   root     40 Oct 16 15:56 tmp
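In the listing above, custom, log, and tmp are owned by root while everything else belongs to indico. A small sketch for finding such entries from inside the container, assuming Python is available there (the traceback suggests Python 3.12 is). The function name `foreign_entries` is hypothetical; it compares numeric uids rather than user names so it does not depend on the container's passwd database.

```python
import os


def foreign_entries(root: str, expected_uid: int) -> list[tuple[str, int]]:
    """Return (name, uid) for direct children of root not owned by expected_uid.

    Hypothetical helper for illustration; not part of Indico.
    """
    found = []
    with os.scandir(root) as it:
        for entry in it:
            # Do not follow symlinks: report the owner of the link itself.
            uid = entry.stat(follow_symlinks=False).st_uid
            if uid != expected_uid:
                found.append((entry.name, uid))
    return sorted(found)


if __name__ == "__main__":
    root = "/opt/indico"
    if os.path.isdir(root):
        # Run as the service user, so geteuid() is that user's uid.
        for name, uid in foreign_entries(root, os.geteuid()):
            print(f"{name}: owned by uid {uid}")
```

Run as the indico user, this should print nothing on a healthy container and list log (among others) on the failing one.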

Docker host info:

[leejr@physastro indico-prod]$ more /etc/os-release
NAME="AlmaLinux"
VERSION="9.4 (Seafoam Ocelot)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.4 (Seafoam Ocelot)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.4"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
SUPPORT_END=2032-06-01

Anything obvious I'm doing wrong here? Thanks!

tomasr8 commented 2 weeks ago

That seems correct to me, though when I try it I get the correct permissions. Could you try building just the worker image and see if the permissions are correct?

docker system prune -a 
cd indico-prod/worker
docker build -t indico .
docker run --rm -it indico ls -al /opt/indico

lawrenceleejr commented 2 weeks ago

Fascinating. That seemed to work fine...

[leejr@physastro worker]$ docker run --rm -it indico ls -al /opt/indico
total 28
drwxr-xr-x. 1 indico indico   76 Oct 17 18:33 .
drwxr-xr-x. 1 root   root     20 Oct 17 18:30 ..
-rw-r--r--. 1 indico indico  220 Mar 29  2024 .bash_logout
-rw-r--r--. 1 indico indico 3526 Mar 29  2024 .bashrc
drwxr-xr-x. 1 indico indico   17 Oct 17 18:32 .cache
-rw-r--r--. 1 indico indico  807 Mar 29  2024 .profile
drwxr-xr-x. 1 indico indico   41 Oct 17 18:33 .venv
drwxr-x---. 2 indico indico    6 Oct 17 18:32 archive
drwxr-x---. 2 indico indico    6 Oct 17 18:32 cache
-rwxr-xr-x. 1 indico indico   62 Oct 16 15:54 docker_entrypoint.sh
drwxr-x---. 2 indico indico    6 Oct 17 18:32 etc
-rw-r--r--. 1 indico indico  281 Oct 17 18:33 indico.wsgi
drwxr-x---. 2 indico indico    6 Oct 17 18:32 log
-rwxr-xr-x. 1 indico indico  288 Oct 16 15:54 run_celery.sh
-rwxr-xr-x. 1 indico indico  581 Oct 16 15:54 run_indico.sh
lrwxrwxrwx. 1 indico indico   64 Oct 17 18:33 static -> /opt/indico/.venv/lib/python3.12/site-packages/indico/web/static
drwxrwxrwx. 2 indico indico    6 Oct 17 18:32 tmp

What's different about building the image like this vs the whole stack?
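One way to approach that question mechanically is to diff the ownership columns of the two `ls -al` outputs. The sketch below does this for excerpts of the two listings posted in this thread; the function names `owners` and `diff_owners` are hypothetical. It only shows which entries differ, not why. That the differing entries might come from mounts rather than from the image is an assumption that would still need checking.

```python
from typing import Dict, Optional, Tuple

# Excerpts of the listings posted in this thread.
COMPOSE = """\
drwxr-x---. 2 indico indico    6 Oct 11 13:41 archive
drwxr-xr-x. 2 root   root      6 Aug 29 14:36 custom
drwxr-x---. 1 indico indico   45 Oct 16 15:56 etc
drwxrwxr-x. 2 root   root     59 Aug 29 14:45 log
drwxrwxrwx. 2 root   root     40 Oct 16 15:56 tmp
"""
STANDALONE = """\
drwxr-x---. 2 indico indico    6 Oct 17 18:32 archive
drwxr-x---. 2 indico indico    6 Oct 17 18:32 etc
drwxr-x---. 2 indico indico    6 Oct 17 18:32 log
drwxrwxrwx. 2 indico indico    6 Oct 17 18:32 tmp
"""


def owners(listing: str) -> Dict[str, str]:
    """Map entry name -> 'owner:group' from `ls -al` output."""
    result = {}
    for line in listing.splitlines():
        parts = line.split(None, 8)
        if len(parts) < 9:
            continue  # 'total N' or blank line
        name = parts[8].split(" -> ")[0]  # drop a symlink target, if any
        if name not in (".", ".."):
            result[name] = f"{parts[2]}:{parts[3]}"
    return result


def diff_owners(first: str, second: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return entries whose ownership differs; None means the entry is absent."""
    a, b = owners(first), owners(second)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    for name, (left, right) in diff_owners(COMPOSE, STANDALONE).items():
        print(f"{name}: compose={left} standalone={right}")
```

For these excerpts the differing entries are custom (present only in the compose listing), log, and tmp.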

tomasr8 commented 2 weeks ago

What's different about building the image like this vs the whole stack?

I'd like to know that as well :smile: I'll try to investigate tomorrow

tomasr8 commented 2 weeks ago

cc @OmeGak if you have any ideas

tomasr8 commented 2 weeks ago

I deleted all Docker caches and reran docker compose up, and I'm still not able to reproduce this.

docker system prune -a
cd indico-prod
docker compose up

Both the web and celery images have the correct permissions:

troun@sg1 ~/dev/indico-containers/indico-prod (master ±) $ docker --version
Docker version 27.0.3, build 7d4bcd8
troun@sg1 ~/dev/indico-containers/indico-prod (master ±) $ docker run --rm -it prod-indico-web ls -al /opt/indico
total 84
drwxr-xr-x 1 indico indico 4096 Oct 18 07:51 .
drwxr-xr-x 1 root   root   4096 Oct 18 07:48 ..
-rw-r--r-- 1 indico indico  220 Mar 29  2024 .bash_logout
-rw-r--r-- 1 indico indico 3526 Mar 29  2024 .bashrc
drwxr-xr-x 1 indico indico 4096 Oct 18 07:50 .cache
-rw-r--r-- 1 indico indico  807 Mar 29  2024 .profile
drwxr-xr-x 1 indico indico 4096 Oct 18 07:51 .venv
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 archive
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 cache
-rwxr-xr-x 1 indico indico   62 Oct 17 08:36 docker_entrypoint.sh
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 etc
-rw-r--r-- 1 indico indico  281 Oct 18 07:51 indico.wsgi
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 log
-rwxr-xr-x 1 indico indico  288 Oct 17 08:36 run_celery.sh
-rwxr-xr-x 1 indico indico  581 Oct 17 08:36 run_indico.sh
lrwxrwxrwx 1 indico indico   64 Oct 18 07:51 static -> /opt/indico/.venv/lib/python3.12/site-packages/indico/web/static
drwxrwxrwx 2 indico indico 4096 Oct 18 07:50 tmp
troun@sg1 ~/dev/indico-containers/indico-prod (master ±) $ docker run --rm -it prod-indico-celery ls -al /opt/indico
total 84
drwxr-xr-x 1 indico indico 4096 Oct 18 07:51 .
drwxr-xr-x 1 root   root   4096 Oct 18 07:48 ..
-rw-r--r-- 1 indico indico  220 Mar 29  2024 .bash_logout
-rw-r--r-- 1 indico indico 3526 Mar 29  2024 .bashrc
drwxr-xr-x 1 indico indico 4096 Oct 18 07:50 .cache
-rw-r--r-- 1 indico indico  807 Mar 29  2024 .profile
drwxr-xr-x 1 indico indico 4096 Oct 18 07:51 .venv
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 archive
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 cache
-rwxr-xr-x 1 indico indico   62 Oct 17 08:36 docker_entrypoint.sh
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 etc
-rw-r--r-- 1 indico indico  281 Oct 18 07:51 indico.wsgi
drwxr-x--- 2 indico indico 4096 Oct 18 07:50 log
-rwxr-xr-x 1 indico indico  288 Oct 17 08:36 run_celery.sh
-rwxr-xr-x 1 indico indico  581 Oct 17 08:36 run_indico.sh
lrwxrwxrwx 1 indico indico   64 Oct 18 07:51 static -> /opt/indico/.venv/lib/python3.12/site-packages/indico/web/static
drwxrwxrwx 2 indico indico 4096 Oct 18 07:50 tmp

lawrenceleejr commented 2 weeks ago

So sad since this is exactly the kind of non-reproducibility containerization should prevent! 😄

Just confirming that our docker versions aren't wildly different or anything.

[leejr@physastro indico-prod]$ docker --version
Docker version 27.1.2, build d01f264

But maybe there's some small change that crept in. I'm checking on a machine (cc7) that has an older install of docker:

(base) [leejr@serf212a ~]$ docker --version
Docker version 26.1.4, build 5650f9b

And actually it looks like it works! I don't get that error about the permissions. Doesn't confirm that it's the docker version since this is a different host, but could be...

(base) [leejr@serf212a ~]$ docker run --rm -it prod-indico-celery ls -al /opt/indico
total 128
drwxr-xr-x. 1 indico indico 4096 Oct 11 14:15 .
drwxr-xr-x. 1 root   root   4096 Oct 11 14:05 ..
-rw-r--r--. 1 indico indico  220 Mar 29  2024 .bash_logout
-rw-r--r--. 1 indico indico 3526 Mar 29  2024 .bashrc
drwxr-xr-x. 1 indico indico 4096 Oct 11 14:11 .cache
-rw-r--r--. 1 indico indico  807 Mar 29  2024 .profile
drwxr-xr-x. 1 indico indico 4096 Oct 11 14:13 .venv
drwxr-x---. 2 indico indico 4096 Oct 11 14:10 archive
drwxr-x---. 2 indico indico 4096 Oct 11 14:10 cache
-rwxr-xr-x. 1 indico indico   62 Oct 11 14:03 docker_entrypoint.sh
drwxr-x---. 2 indico indico 4096 Oct 11 14:10 etc
-rw-r--r--. 1 indico indico  281 Oct 11 14:15 indico.wsgi
drwxr-x---. 2 indico indico 4096 Oct 11 14:10 log
-rwxr-xr-x. 1 indico indico  288 Oct 11 14:03 run_celery.sh
-rwxr-xr-x. 1 indico indico  581 Oct 11 14:03 run_indico.sh
lrwxrwxrwx. 1 indico indico   64 Oct 11 14:15 static -> /opt/indico/.venv/lib/python3.12/site-packages/indico/web/static
drwxrwxrwx. 2 indico indico 4096 Oct 11 14:10 tmp

I do need it to run on the original machine, but even on this other machine with the older docker, I get an indico page served at 9090 that says to go to 8080, which then gives no response. I've checked that the port is open on the firewall, but maybe this part of the problem is for another ticket...

tomasr8 commented 2 weeks ago

My Docker version is 27.0.3, so that should be close enough to yours that we shouldn't see such a difference in behaviour.

I do need it to run on the original machine, but even on this other machine with the older docker, I get an indico page served at 9090 that says to go to 8080, which then gives no response. I've checked that the port is open on the firewall, but maybe this part of the problem is for another ticket...

Just to make sure there isn't anything else getting in the way, can you post the output of git diff HEAD? (If it contains sensitive data, you can censor it.) I want to make sure I am testing with the same configs.

lawrenceleejr commented 2 weeks ago

Sure thing. The diff is below and it's just the secret key change, which I've redacted.

[leejr@physastro indico-containers]$ git diff HEAD
diff --git a/indico-prod/indico.conf b/indico-prod/indico.conf
index b3dad6d..dc79462 100644
--- a/indico-prod/indico.conf
+++ b/indico-prod/indico.conf
@@ -4,7 +4,7 @@ SQLALCHEMY_DATABASE_URI = f'postgresql://{os.environ["PGUSER"]}:{os.environ["PGP
 del os

 # Change this to something long and random
-SECRET_KEY = ''
+SECRET_KEY = 'XXXXXXXXXXXX'
 BASE_URL = 'http://localhost:8080'
 USE_PROXY = True

Does it seem important that in my Docker 26 test, the permissions were fine? Or do you think that version-dependence is a red herring?

Thanks!

lawrenceleejr commented 2 weeks ago

So on the problematic machine, I did a dnf downgrade to 27.0.3 (from 27.1.2), cleared everything out, and started from scratch just to check it. Ended up with the same behavior (as expected).

PermissionError: [Errno 13] Permission denied: '/opt/indico/log/celery.log'