sismics / docs

Lightweight document management system packed with all the features you can expect from big expensive solutions
https://teedy.io
GNU General Public License v2.0

importer: File with 0640 permission causes new document on every poll, but no files #707

Open madduck opened 1 year ago

madduck commented 1 year ago

I just put a PDF file into the importer directory (which has permissions 1777), but the file itself only had 0640. Since both the importer and Jetty run as root (ugh!), I didn't think this was going to be a problem. But it is: only after changing the file's permissions to 0644 does it all work.

However, here is what happened while permissions were at 0640:

  1. Every 40 seconds, a new document would be created (presumably just based on the filename), but without any files (as they weren't readable, despite running as root?)
  2. The file in the import directory would not get removed, hence goto 1.

Here are the logs from the server:

Sep 07 13:04:19 paperless docker-compose[54300]: teedy-server_1   | 07 Sep 2023 13:04:19,788 INFO com.sismics.docs.core.listener.async.DocumentCreatedAsyncListener.on(DocumentCreatedAsyncListener.java:35) Document created event: DocumentCreatedAsyncEvent{documentId=3b4cea2c-0d13-4844-a07c-c3c67493a091}
Sep 07 13:04:49 paperless docker-compose[54300]: teedy-server_1   | 07 Sep 2023 13:04:49,369 INFO com.sismics.docs.core.listener.async.DocumentCreatedAsyncListener.on(DocumentCreatedAsyncListener.java:35) Document created event: DocumentCreatedAsyncEvent{documentId=d1ac2767-e23d-4ad8-a3f3-691690e085e2}
Sep 07 13:05:19 paperless docker-compose[54300]: teedy-server_1   | 07 Sep 2023 13:05:19,477 INFO com.sismics.docs.core.listener.async.DocumentCreatedAsyncListener.on(DocumentCreatedAsyncListener.java:35) Document created event: DocumentCreatedAsyncEvent{documentId=178de1ae-27af-4d67-9dcc-8eb90030e4e8}

The importer did not log anything.

And here is what the result looks like after 2 minutes (3 polls):

image

jendib commented 1 year ago

Your file is probably unreadable by Teedy. I will need a reproducer if possible (other people use the importer with no issue).

madduck commented 1 year ago

It's readable, assuming the importer is actually running as root and doesn't drop privileges, which I can't see the code doing:

~ # ps aux
PID   USER     TIME  COMMAND
    1 root      0:10 /teedy-importer -d
   43 root      0:00 ash
   75 root      0:00 ps aux
~ # ls -ld /proc/1
dr-xr-xr-x    9 root     root             0 Sep  7 07:12 /proc/1
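
The `ps` output only shows the user name. A root process inside a container can still run with a reduced capability set, so it may help to also look at the effective IDs and the capability mask the kernel reports. This is a minimal sketch, assuming a Linux `/proc` filesystem; `proc_creds` is a hypothetical helper name, not part of Teedy or its importer:

```shell
#!/bin/sh
# Hypothetical helper: print the UID, GID and effective capability mask
# of a process, as reported by the kernel in /proc/<pid>/status.
# Defaults to the calling process ("self") when no PID is given.
proc_creds() {
    grep -E '^(Uid|Gid|CapEff):' "/proc/${1:-self}/status"
}

# Inspect the current shell here; inside the importer container,
# `proc_creds 1` would inspect the importer process itself.
proc_creds "$$"
```

If the Uid line shows 0 in every column, the process did not drop privileges. Whether the CapEff mask includes the bit that bypasses file permission checks is a separate question from the user name.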

As root myself, I can read the file and manipulate the directory just fine:

~ # cd /import/
/import # ls -l
total 412
-rw-r-----    1 1000     root        420317 Sep  7 16:16 2023.07.25 Saturn 143054.pdf
/import # id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
/import # md5sum 2023.07.25\ Saturn\ 143054.pdf 
592df2ef2c2fa9f21e6744714469a531  2023.07.25 Saturn 143054.pdf
/import # rm 2023.07.25\ Saturn\ 143054.pdf 
/import # ls -l
total 0
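
For anyone who wants to reproduce the setup, the permissions described above can be recreated roughly as follows. This is a sketch under assumptions: it uses a scratch directory from `mktemp` instead of the real import directory, a placeholder file instead of a real PDF, and `stat -c` as provided by GNU coreutils and BusyBox. The names `import_dir` and `sample` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for the real import directory.
import_dir=$(mktemp -d)
chmod 1777 "$import_dir"

# Placeholder content; the original report used a real PDF.
sample="$import_dir/sample.pdf"
printf '%%PDF-1.4\n' > "$sample"

# The mode that triggered the problem in the report.
chmod 0640 "$sample"

# Read the octal modes back to confirm the setup.
dir_mode=$(stat -c '%a' "$import_dir")
file_mode=$(stat -c '%a' "$sample")
echo "directory mode: $dir_mode"
echo "file mode: $file_mode"
```

In the report the file was also owned by UID 1000 with group root. Matching that needs an extra `chown 1000:0` on the file, which requires root and is left out here.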