Open muppie opened 3 years ago
Yes, I face a similar issue with my scanner/printer. I'm trying to scan multiple pages, and the printer writes and appends each page to the same PDF file. However, before I can scan and append the 2nd page, Paperless has already consumed the PDF, so my printer writes a new PDF.
Suggested features: 1) An option to set the minutes, hours, day, or time of day at which the consume task is triggered. 2) A button (on the dashboard?) to manually trigger the consume task.
For the polling implementation, there already exists a mechanism that checks if the "file has settled": _consume_wait_unmodified
As a quick workaround I've switched to the polling mode for now.
Oh, my problem is a bit different, although still fixed by using the poller. I expose the consume folder via samba and use Finder (macOS) to copy in the files. For some reason this will first create an empty file, which will trigger the inotify part, before actually filling this file.
Not sure if my problem is actually something to be fixed or worked around in Paperless...
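For illustration, the empty-file case described above could be guarded against with a check like the following. This is only a sketch, not something Paperless is known to do; the helper name `is_placeholder` is hypothetical.

```python
import os


def is_placeholder(path):
    """Return True if path is an existing regular file of zero bytes.

    Hypothetical helper: a consumer could skip such files and look
    again later, once the copy has actually written some data.
    """
    try:
        return os.path.isfile(path) and os.path.getsize(path) == 0
    except OSError:
        # The file vanished or is unreadable; it is not a placeholder we can act on.
        return False
```

A size check alone would not catch a file that is only partly copied, so it complements a "file has settled" check rather than replacing it.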
For the polling implementation, there already exists a mechanism that checks if the "file has settled": _consume_wait_unmodified
As a quick workaround I've switched to the polling mode for now.
May I know how you switch to polling mode?
By using PAPERLESS_CONSUMER_POLLING
I would try playing around with PAPERLESS_CONSUMER_POLLING_RETRY_COUNT and PAPERLESS_CONSUMER_POLLING_DELAY in addition to PAPERLESS_CONSUMER_POLLING. I found the delay env variable when sifting through the code and it doesn't look like it's documented anywhere but it did seem like it delayed attempting to ingest files by x number of seconds after it found a new one in the directory.
I had trouble with this too. My scanner (Canon MB5450) creates an empty file first, then scans the page and appends it to the file; rinse and repeat for each page.
Without polling, the import is completely borked. With polling, it accepts the empty file as "unchanged" too early, before the scanner manages to save the first page. I settled on these parameters:
PAPERLESS_CONSUMER_POLLING: "5"
PAPERLESS_CONSUMER_POLLING_DELAY: "30"
While this makes importing anything but instant, importing documents is 100% stable for me now. As far as I can see, PAPERLESS_CONSUMER_POLLING_DELAY specifies how long the importer waits after each change to the file's modification timestamp (mtime). In my case, if the mtime doesn't change for 30 seconds, Paperless assumes the file is finished. If it does change, it waits another 30 seconds and repeats this process.
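To make the described behaviour concrete, here is a minimal sketch of a "wait until the mtime stops changing" loop. It is not the actual `_consume_wait_unmodified` code from Paperless; the function name `wait_until_settled` and its parameters `delay`, `max_checks` and `sleep` are hypothetical and only loosely mirror the delay and retry-count settings mentioned above.

```python
import os
import time


def wait_until_settled(path, delay=30.0, max_checks=5, sleep=time.sleep):
    """Return True once the file's mtime is unchanged across one delay.

    Returns False if the file disappears, or if the mtime is still
    changing after max_checks looks at the file.
    """
    last_mtime = None
    for _ in range(max_checks):
        try:
            mtime = os.stat(path).st_mtime
        except FileNotFoundError:
            return False  # file was removed while we were waiting
        if mtime == last_mtime:
            return True  # no modification during the last delay
        last_mtime = mtime
        sleep(delay)  # give the scanner time to append the next page
    return False
```

With a scanner that needs more than `delay` seconds between pages, a loop like this would still report the file as settled too early, which is why a fairly generous delay such as 30 seconds helps.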
Hi, is there a way to add a delay before the consume folder kicks in? I upload all documents via Nextcloud and then consume them into Paperless-ng, but sometimes a file is only partly uploaded when Paperless-ng finds it.
If it is not possible yet, I would like to add this as a feature :)
/M