Closed Chavell3 closed 5 months ago
Also interesting: if I try to manually run the OCR detection, nothing is logged about a new task on the worker node... I would have expected to see a new task...
BUT on the WEB service I see the following log... and something seems to be wrong there... PM_web_log.txt
@Chavell3
Does it happen for all the PDF and JPG files you've tried? Or only for some of them? Would you mind attaching one problematic file (one pdf and one jpg) to this ticket so that I can troubleshoot it?
Check to see if the file uploaded completely. Also check to see that the file is actually pdf or jpg. I had one file rejected because it had the wrong extension (three letters after the dot in the file name), and a few that didn't completely upload when I tried uploading 30 at a time.
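For what it's worth, the "is it actually a pdf or jpg" part can be checked without trusting the extension by looking at the first bytes of the file. Below is a minimal sketch, assuming only the well-known signatures (`%PDF-` for PDF, `FF D8 FF` for JPEG). The function names `sniff_mime` and `sniff_file` are made up for illustration and are not part of Papermerge, and this is not necessarily how the worker decides what is supported. It also only looks at the start of the file, so it will not catch an upload that was cut off at the end.

```python
from typing import Optional

# Leading bytes ("magic numbers") for the two formats discussed in this thread.
MAGIC = {
    "application/pdf": b"%PDF-",
    "image/jpeg": b"\xff\xd8\xff",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes; return None if unrecognized."""
    for mime, prefix in MAGIC.items():
        if data.startswith(prefix):
            return mime
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and guess its type (hypothetical helper)."""
    with open(path, "rb") as fh:
        return sniff_mime(fh.read(8))
```

If `sniff_file` returns `None` for something named `.pdf` or `.jpg`, the extension is probably wrong or the file is damaged at the start.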
I don't think it's a matter of a specific file. I have now uploaded about 6 additional files (PDFs and JPEGs) and none of them is scanned... Any idea what else I could do to troubleshoot that?
Small side note: although I entered the volumes for media and database in the compose file, those folders still stay empty... I added a picture of that. If you would still like to have some files, just give me a shout... but I don't think it's file related...
Thanks for the help.
Run the following command in the worker container:
/usr/bin/file --mime-type -b <path-to-pdf-or-jpg-file-you-have-uploaded>
e.g.
/usr/bin/file --mime-type -b /core_app/media/docvers/67/88/67883da5-2626-4d8a-9cbd-e861abce863c/1706648353180682379972750067052.jpg.pdf
and tell me the result here
Okay... the folder "media" does not even exist under /core_app
But my fault... wait let me test something...
I have now added the Docker volumes manually so that they mount to the folders I want.
But it still does not seem to mount those volumes correctly. I did find the issue, though: the worker and web nodes are somehow not mounting the media volume correctly, although it is listed when running "df".
Worker-Node:
Web-Node:
Somehow the web node has access to that volume but the worker node does not... What is interesting is that /dev/md0 is my RAID device, where I want the files to be saved, but I want to use a subfolder on it, DMS/papermerge/..
The storage configuration within docker compose configuration looks like that:
But even there, plain /dev/md0 is not defined anywhere... it's always some subfolder (either "docker" or "DMS")...
It seems like "/dev/md0" is just the device name that gets displayed, and it is in fact correctly mounted to my subfolder within the directory, because when I browse the container's FS and compare it with the local FS where it should be located, the files match.
That might mean the worker node is not able to mount the volume "MEDIA" because of some permission issue... and the same for the WEB node, because it does not create any file within that folder...
I think the issue is, that the folder "media" under /core_apps does not exist. So it cannot mount that volume under that directory. When I start the WEB node and login into the container, the folder "media" also does not exist.
BUT the difference is that the WEB node seems to create that folder when the first file is uploaded, while the WORKER node tries to read from a directory that just does not exist, because it was never created or correctly mounted...
I created that folder "media" for the WEB and WORKER nodes and stopped and started them again, but unfortunately it still did not mount the volume by the looks of it, because I still can't see the data that has now been created under /core_apps/media
But even if I create that folder, build a new image from that running container (with the "media" folder) and rebuild my whole Papermerge environment, it does not seem to work, because the files created in those volumes are still not visible, or do not exist, on the host....
OKAY, shame on me... all my fault... First I tried to mount the host folders directly and messed up the config there. After I fixed that, I wrote "/core_apps" instead of "/core_app"... After I corrected that, everything now works as expected.
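A typo like "/core_apps" versus "/core_app" is easy to miss by eye, so here is a rough sketch of how the container-side mount paths of each service could be compared against the path the application expects. It assumes the compose file has already been loaded into a plain dict of service name to short-syntax volume strings (`SOURCE:TARGET[:MODE]`) with POSIX paths. The volume name `media_root` and both function names are hypothetical, chosen only for the example; `/core_app/media` is the media path mentioned earlier in this thread.

```python
from typing import Dict, List


def container_targets(volumes: List[str]) -> List[str]:
    """Extract container-side paths from short-syntax 'SOURCE:TARGET[:MODE]' entries."""
    targets = []
    for entry in volumes:
        parts = entry.split(":")
        # A single-part entry is an anonymous volume: the part itself is the target.
        targets.append(parts[1] if len(parts) > 1 else parts[0])
    return targets


def services_missing_target(services: Dict[str, List[str]], expected: str) -> List[str]:
    """Return the names of services that mount nothing at the expected container path."""
    return sorted(
        name for name, vols in services.items()
        if expected not in container_targets(vols)
    )


# Hypothetical example data mirroring the mistake described above.
services = {
    "web": ["media_root:/core_app/media"],
    "worker": ["media_root:/core_apps/media"],  # note the extra "s"
}
print(services_missing_target(services, "/core_app/media"))
```

With the example data, only the worker is reported, because its target path does not match the expected one.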
Hi Team,
After a bunch of tries I could now successfully set up Papermerge. By the looks of it, all connections between the instances are working... but when I upload a file and try to run OCR manually, it fails. In the logs of the worker node I see an "unsupported format" message, whether the file is a PDF or a JPG.
But according to the documentation, both PDF and JPG files should work.
docker compose.txt PM_worker_log.txt
Any idea what I could change to make it work?
Info:
Thanks!