jonaswinkler / paperless-ng

A supercharged version of paperless: scan, index and archive all your physical documents
https://paperless-ng.readthedocs.io/en/latest/
GNU General Public License v3.0

Permission error with NFS volumes in docker #365

Closed FleischKarussel closed 3 years ago

FleischKarussel commented 3 years ago

Hello everyone,

I have been fiddling with this issue for quite some time and have no clue why it happens only with paperless-ng, as I run a bunch of other services in containers side by side in the same manner without problems. As shown below, the permissions look fine when the container starts for the first time and initializes the volume, but the check fails afterwards. I even tried mode 777 for the data and media folders on the NFS server. Using local volumes works fine.

Is this an issue on my side, or a bug in the directory hopping (the /../ in the path)?

Best regards FK

docker logs:

Hit:1 http://deb.debian.org/debian buster InRelease
Hit:2 http://deb.debian.org/debian buster-updates InRelease
Hit:3 http://security.debian.org/debian-security buster/updates InRelease
Reading package lists...
package tesseract-ocr-deu already installed!
creating directory ../data/index
creating directory ../media/documents
creating directory ../media/documents/originals
creating directory ../media/documents/thumbnails
Waiting for PostgreSQL to start...
SystemCheckError: System check identified some issues:

ERRORS:
?: PAPERLESS_DATA_DIR is not writeable
        HINT: Set the permissions of /usr/src/paperless/src/../data to be writeable by the user running the Paperless services
?: PAPERLESS_MEDIA_ROOT is not writeable
        HINT: Set the permissions of /usr/src/paperless/src/../media to be writeable by the user running the Paperless services

Filesystem permission in container:

docker exec -it paperlessng_webserver ls -ln /usr/src/paperless/
total 32
drwxr-xr-x 2 1000 1000 4096 Jan 15 12:27 consume
drwxrwxrwx 1 1000 1000   38 Jan 15 12:34 data
drwxr-xr-x 2 1000 1000 4096 Jan 15 12:27 export
-rw-r--r-- 1 1000 1000 1241 Dec  9 16:52 gunicorn.conf.py
drwxrwxrwx 1 1000 1000   18 Jan 15 12:34 media
-rw-r--r-- 1 1000 1000 2252 Dec  9 16:51 requirements.txt
drwxr-xr-x 1 1000 1000 4096 Dec 22 14:56 src
drwxr-xr-x 1 1000 1000 4096 Dec 22 14:56 static

Filesystem permissions on NFS server:

ls -l paperlessng-data/* paperlessng-media/*
-rw-r--r-- 1 1000 1000  0 Jan 15 13:37 paperlessng-data/migration_lock

paperlessng-data/index:
total 0

paperlessng-media/documents:
total 0
drwxr-xr-x 1 1000 1000 0 Jan 15 13:34 originals
drwxr-xr-x 1 1000 1000 0 Jan 15 13:34 thumbnails

ls -ld paperlessng-data/ paperlessng-media/
drwxrwxrwx 1 1000 1000 38 Jan 15 13:34 paperlessng-data/
drwxrwxrwx 1 1000 1000 18 Jan 15 13:34 paperlessng-media/

jonaswinkler commented 3 years ago

Please try to write to these directories from within the paperless container as the paperless user. If this works, this is an issue with how paperless tries to determine whether or not it can write to a given directory. If not, see below.

Paperless initializes the volumes as root, which is why that step works. It also changes ownership of these files to the paperless system user so that paperless can write there.

FleischKarussel commented 3 years ago

Hey @jonaswinkler,

thanks for your reply! In fact, I am running with UID and GID mapping to 1000:1000. I manually created the directory that is being mounted/mapped to the volume on the docker host with 1000:1000, just to be sure. Do the mount options matter in that regard?

The test you described looked like this, and imho worked as expected. I hope I got it right.

docker exec -u 1000 -it paperlessng_webserver touch /usr/src/paperless/data/test.datei

docker exec -it paperlessng_webserver ls -ln /usr/src/paperless/data
total 0
drwxr-xr-x 1 1000 1000 0 Jan 15 12:34 index
-rw-r--r-- 1 1000 1000 0 Jan 15 16:23 migration_lock
-rw-r--r-- 1 1000 1000 0 Jan 15 16:20 test.datei

docker exec -it paperlessng_webserver ls -l /usr/src/paperless/data
total 0
drwxr-xr-x 1 paperless paperless 0 Jan 15 12:34 index
-rw-r--r-- 1 paperless paperless 0 Jan 15 16:23 migration_lock
-rw-r--r-- 1 paperless paperless 0 Jan 15 16:20 test.datei

docker exec -it paperlessng_webserver df -h /usr/src/paperless/data
Filesystem                                         Size  Used Avail Use% Mounted on
:/volume2/container-data/develop/paperlessng-data  855G   93G  762G  11% /usr/src/paperless/data

jonaswinkler commented 3 years ago

In fact, I am running with UID and GID mapping to 1000:1000.

That's the default. No need to configure that, it will change nothing.

The test you described looked like this, and imho worked as expected. I hope I got it right.

That means the Python process in paperless is either unable to write there, or paperless fails to correctly determine whether it can write there. Please run the following tests.

Bring up a python shell inside the container:

$ docker exec -u 1000 -it paperlessng_webserver /usr/local/bin/python3
Python 3.7.9 (default, Oct 13 2020, 21:10:49) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Run the following commands and post the output:

>>> import os
>>> os.access('../data', os.W_OK | os.X_OK)
[Output I'm interested in]
>>> os.stat('../data')
[Output I'm interested in]

Enter exit() or hit CTRL+D to exit.

FleischKarussel commented 3 years ago

Hello again,

thanks for pointing out the UID/GID Mapping settings. It took me some time to figure out how to prevent the container from restarting. ;)

>>> import os
>>> os.access('../data', os.W_OK | os.X_OK)
True
>>> os.stat('../data')
os.stat_result(st_mode=16895, st_ino=2090429, st_dev=1048660, st_nlink=1, st_uid=1000, st_gid=1000, st_size=58, st_atime=1610739618, st_mtime=1610727603, st_ctime=1610739618)
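
For anyone reading along, the st_mode value above can be decoded with Python's standard stat module. A small sketch (the variable names are only illustrative):

```python
import stat

mode = 16895  # st_mode from the os.stat('../data') output above

is_dir = stat.S_ISDIR(mode)           # True if the entry is a directory
perm_bits = oct(stat.S_IMODE(mode))   # permission bits only, without the file type
ls_style = stat.filemode(mode)        # same notation that ls -l prints

print(is_dir, perm_bits, ls_style)    # True 0o777 drwxrwxrwx
```

So the mode reported by Python matches the drwxrwxrwx that ls shows for the data directory.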

jonaswinkler commented 3 years ago

I'm puzzled.

os.access('../data', os.W_OK | os.X_OK)

That's exactly the same check that paperless performs to see whether it can write to that directory. When executed manually, it apparently works, but when paperless runs it, it fails and yields the error messages you're seeing.

https://github.com/jonaswinkler/paperless-ng/blob/d8637ff4b16a0e583632af53c193b1c3aac06ee1/src/paperless/checks.py#L24
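
One way to narrow this down might be to compare the os.access() result with an actual write attempt, run as the same user as the paperless process. The Python documentation notes that I/O operations can fail even when access() indicates they would succeed, particularly on network filesystems, and access() checks against the real (not the effective) UID/GID. The following is only a sketch: check_writable is a hypothetical helper and not code from paperless.

```python
import os
import tempfile


def check_writable(path):
    """Return (access_ok, write_ok, error) for a directory.

    access_ok is the os.access() verdict, write_ok is the result of
    really creating and deleting a temporary file in the directory.
    """
    access_ok = os.access(path, os.W_OK | os.X_OK)
    try:
        # Create a real file in the directory, then clean it up again.
        fd, name = tempfile.mkstemp(dir=path)
        os.close(fd)
        os.unlink(name)
        return access_ok, True, None
    except OSError as exc:
        return access_ok, False, exc


if __name__ == "__main__":
    # Show which identity the check runs under, then probe the data dir.
    print("real uid/gid:", os.getuid(), os.getgid())
    print("effective uid/gid:", os.geteuid(), os.getegid())
    print(check_writable("../data"))
```

If the two verdicts disagree, or the reported uid/gid differ from what docker exec -u 1000 uses, that would point at the environment rather than at the check itself.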

FleischKarussel commented 3 years ago

I replaced the entrypoint with /usr/bin/sleep 600 to have time for the tests. Might I have broken the tests by doing this?

oliver-la commented 3 years ago

I can confirm this happens with NFS shares. (why is NFS always the culprit if a container chokes on a volume?)

Manually setting PAPERLESS_CONSUMPTION_DIR to /consume does not make a difference. So the paperless permission check shouldn't be having issues with the ../ in the path.

What strikes me as odd is that my installation only had issues with the consumption directory:

webserver_1  | Paperless-ng docker container starting...
webserver_1  | Waiting for PostgreSQL to start...
webserver_1  | Apply database migrations...
webserver_1  | SystemCheckError: System check identified some issues:
webserver_1  |
webserver_1  | ERRORS:
webserver_1  | ?: PAPERLESS_CONSUMPTION_DIR is not writeable
webserver_1  |  HINT: Set the permissions of /usr/src/paperless/src/../consume to be writeable by the user running the Paperless services

PAPERLESS_DATA_DIR and PAPERLESS_MEDIA_ROOT appear to be perfectly writable.

In my case, I have a webdav container to add files to the consume directory.

  webdav:
    image: xama/nginx-webdav
    volumes:
      - ./consume:/var/webdav/public
    environment:
      - "WEBDAV_USERNAME=webdav"
      - "WEBDAV_PASSWORD=webdav"
      - "TZ=Europe/Zurich"
    restart: unless-stopped
    expose:
      - 80

Maybe the error only appears if two containers share the same directory and perhaps put a lock on it?

As a workaround, I switched to a named volume (default driver). This made the error disappear.

This issue started to appear when I updated my container from 0.9.14 to latest. Reverting to 0.9.14 did not fix the issue.

I hope my findings are of some help.

Edit: Here's my fstab. Maybe nolock is causing issues?

192.168.0.165:/volume1/docker /volume1/docker nfs _netdev,nolock,nfsvers=3,proto=udp 0 0

I remember that my Plex installation also had lots of issues with the NFS share because locks do not work properly with NFS (even with the option removed).

jonaswinkler commented 3 years ago

I checked for related changes between 0.9.14 and 1.0.0 and found nothing.

PAPERLESS_DATA_DIR and PAPERLESS_MEDIA_ROOT appear to be perfectly writable.

If these are not stored on NFS volumes, that would explain it.

FleischKarussel commented 3 years ago

@oliver-la It looks like you may also be using a Synology NAS (Xpenology in my case)? Just guessing from the path /volume1/...

Just checked: I also have nolock enabled. I tried without nolock and got a different error ;)

Get:1 http://deb.debian.org/debian buster InRelease [121 kB]
Get:2 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:4 http://security.debian.org/debian-security buster/updates/main amd64 Packages [270 kB]
Get:5 http://deb.debian.org/debian buster/main amd64 Packages [7907 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [7860 B]
Fetched 8424 kB in 2s (3732 kB/s)
Reading package lists...
package tesseract-ocr-deu already installed!
Waiting for PostgreSQL to start...
flock: 200: No locks available
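
"No locks available" is the message for errno ENOLCK, so the lock request itself is being refused here. A rough way to probe whether a given directory supports flock-style locks, as a sketch only (can_flock is a hypothetical helper, not part of paperless):

```python
import errno
import fcntl
import os
import tempfile


def can_flock(directory):
    """Return True if an exclusive flock() succeeds on a file in directory.

    Returns False when the filesystem answers with ENOLCK
    ("No locks available"); any other error is re-raised.
    """
    fd, name = tempfile.mkstemp(dir=directory)
    try:
        # Non-blocking exclusive lock, released again right away.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOLCK:
            return False
        raise
    finally:
        # Always remove the probe file.
        os.close(fd)
        os.unlink(name)
```

Calling can_flock("/usr/src/paperless/data") inside the container, once with and once without nolock in the mount options, should show whether the NFS mount accepts locks at all.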