WASABI Sandbox: Permission denied on Result delete

Tointoin commented 4 years ago

On sandbox instance @yomguy: While deleting a Result from admin in browser, I got this Permission error since add of pre_delete signal:

Traceback:
...
File "/usr/local/lib/python3.7/dist-packages/django/dispatch/dispatcher.py" in <listcomp>
  175.             for receiver in self._live_receivers(sender)

File "/srv/lib/timeside/timeside/server/models.py" in result_pre_delete
  936.         os.remove(instance.hdf5.path)

Exception Type: PermissionError at /admin/timeside_server/result/
Exception Value: [Errno 13] Permission denied: '/srv/media/results/07836d4f-72d0-4a66-a076-f6f06cce572c/ca605c05-0d1e-49c8-92ca-d4f0bd0bac01.hdf5'

Is their a way to give the needed permission on Result's file creation ? Is API's linux user is www-data?

Tointoin commented 4 years ago

After triing a chmod -R 666 on /srv/media/results/<item_uuid>/ directory, the error on a result deletion does not appear anymore but the newly created result hdf5 file is not readable anymore by the API: https://sandbox.wasabi.telemeta.org/timeside/api/items/dadd3e87-b585-4650-98ff-48a76fc4d71a/waveform/ This URL returns also a Permission denied

gnuletik commented 4 years ago

I see three way of dealing with this:

[Change in worker code / all processors ?] The processor that is creating the file set the 666 permission on the file.
[Change in server's run.sh] The django app is run as root (not recommended for security reasons).
[Change in worker and server Dockerfiles] The worker container (creating the file) and the server container (reading the file) should share the same uid (should be done in Dockerfile).

The first solution seems the easiest one but I don't know if this implies changes in every processors. If yes, we may share the same uid for both containers.

Tointoin commented 4 years ago

[Change in worker code / all processors ?] The processor that is creating the file set the 666 permission on the file.

HDF5 files are created in this line of timeside.core.analyzer with h5py package.
Encoded audio file output are directly passed as encoder's __init__ arg.
Grapher file path are passed in render method's arguments

Given the multiplicity of cases and the fact that core should leave without server, it does not seem appropriate to set permission there.

[Change in server's run.sh] The django app is run as root (not recommended for security reasons).

[Change in worker and server Dockerfiles] The worker container (creating the file) and the server container (reading the file) should share the same uid (should be done in Dockerfile).

Worker and app uid are both set to www-data using app_base.sh. I don't see any run.sh.

www-data has already the right permissions: I don't see why it raises this error :bow:

gnuletik commented 4 years ago

Hi @Tointoin,

I gave it another look and:

Yes, files are created at different places. It can be cumbersome to change it everywhere. Let's drop this option.
The worker and server containers use the same Dockerfile. So they share the same user.

However, the lines you mentionned inside app_base.sh, have a comment on top of them: # uwsgi params. I guess uwsgi is only used by the server.

The startup script for the server, uses the uid and gid variables.

However, the celery startup script do not use these parameters. It seems that you can pass a uid and gid parameters to celery to set the user.

Can you try passing these parameters to celery inside worker.sh (using the already-defined variables) ? If that works, we should edit the comment inside app_base.sh to state that uid and gid are also used by the worker.

Thanks!

Tointoin commented 4 years ago

Alright, thanks for your help @gnuletik.

After applying this patch on sandbox:

diff --git a/app/bin/worker.sh b/app/bin/worker.sh
index 93b21870..217ace6f 100755
--- a/app/bin/worker.sh
+++ b/app/bin/worker.sh
@@ -4,8 +4,8 @@ SCRIPT_DIR="$(dirname "$0")"
 source "$SCRIPT_DIR/app_base.sh"

 if [ $DEBUG = True ]; then
-    python3 manage.py timeside-celery-worker --loglevel $loglevel --logfile $worker_logfile
+    python3 manage.py timeside-celery-worker --loglevel $loglevel --logfile $worker_logfile --uid $uid --gid $gid
 else
-    celery -A worker worker --loglevel=$loglevel --logfile=$worker_logfile
+    celery -A worker worker --loglevel=$loglevel --logfile=$worker_logfile --uid $uid --gid $gid
 fi

I fixed this error on docker-compose restart worker: PermissionError: [Errno 13] Permission denied: '/var/log/celery/worker.log'

with:

chown -R www-data var/log/celery/
chgrp -R www-data var/log/celery/

After that I got other errors such as:

OSError: ('Error: gst-resource-error-quark: Could not open file "/srv/media/results/219e7cde-d997-4d17-ba1d-f7e1728abf62/f3f03345-fbb7-4706-9695-d37e744da07f.ogg" for writing. (6)', 'gstfilesink.c(431): gst_file_sink_open_file (): /GstPipeline:pipeline123/GstFileSink:filesink123:\nsystem error: Permission denied')

Should I apply:

chown -R www-data var/media/
chgrp -R www-data var/log/media/

???

@yomguy, does it seem ok for you to set worker's uid as www-data?

As @gnuletik mentioned, var/media/ is owned by root, it is not recommended to use a worker as root, is it?

Was it necessary to ensure Celery auto-reloader in dev mode for closed issue #113?

Tointoin commented 4 years ago

NB:

[x] check on a new instantiation of Timeside if log files are created with right permissions.

gnuletik commented 4 years ago

Hi @Tointoin, To sum up our small meeting about this issue :

The --uid and --gid seems to work as expected on new items (no more Permission Denied issues). You made the tests on the sandbox.
We should check that the issue is also gone on new containers / instances with empty data.
If there is no issues, we should migrate old data with chown / chgrp on the sandbox.

I gave a closer look on @yomguy's commit about live reload and found something: The celery worker is launched from a python script in this file: https://github.com/Parisson/TimeSide/blob/5877ddb27f0e0fa59890bdf68c34cf60abf78e8a/timeside/server/management/commands/timeside-celery-worker.py#L16

So, we may update the string to add the --uid and --gid parameters here. However, we may lose trace of this second start command as it is not in the same place as the startup script. It would be more clean if we could keep the celery's parameters in one place.

I can see two solutions:

Create a variable like celery_args that would be used by timeside-celery-worker.py and worker.sh
Refactor timeside-celery-worker.py to call worker.sh

The second one may be better but I'm not aware of all the implications.

Tointoin commented 4 years ago

Hi @gnuletik,

I gave a closer look on @yomguy's commit about live reload and found something: The celery worker is launched from a python script in this file:

https://github.com/Parisson/TimeSide/blob/5877ddb27f0e0fa59890bdf68c34cf60abf78e8a/timeside/server/management/commands/timeside-celery-worker.py#L16

So, we may update the string to add the --uid and --gid parameters here. However, we may lose trace of this second start command as it is not in the same place as the startup script. It would be more clean if we could keep the celery's parameters in one place.

Before keeping celery's parameters in one place, I also tried to modify this file to test on starting a new instance.

Passing www-data as uid and gid of worker raise this error: worker_1 | PermissionError: [Errno 13] Permission denied: '/var/log/app/timeside_debug.log'

This appears in both cases:

celery run in a Django command ($DEBUG = True)
celery run directly in worker.sh instead

The concerned file is defined in settings and should be created by Django logger.

yomguy commented 4 years ago

Hi @gnuletik and @Tointoin

During the 2 last devs of the celery worker start command, I have decided to not use the uid and gid options because it has indeed a important implication on media and log access rights. But this was mainly to be able to switch between the PROD and DEBUG mode without having permission issue between root and www-data users. The fact is that for some parts, the app and the worker containers can sometimes use the same dirs and then need to be run by the same user.

Of course, at the end, we should have the 2 containers using the same permissions and then managing access rights of each directory by hand. I propose to improve this now using the your both propositions. I will commit this during the day and update the sandbox. After that, you will have to manage access rights on your own if switching between the modes.

Tointoin commented 4 years ago

we have already worked on a patch yesterday with @gnuletik.

I will submit it to you @yomguy as a pr.

yomguy commented 4 years ago

OK @Tointoin then please merge my last commit.

Tointoin commented 4 years ago

OK @Tointoin then please merge my last commit.

I already did same changes so I rebased my branch on your commit before the PR.

Parisson / TimeSide

WASABI Sandbox: Permission denied on Result delete #180