paperless-ngx / paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents
https://docs.paperless-ngx.com
GNU General Public License v3.0

[BUG] original consume path lost while splitting PDF scans at patch pages #8392

Closed gsdeejay closed 2 days ago

gsdeejay commented 2 days ago

Description

A workflow that should trigger on the name of a consume subdirectory does not trigger when the consumed PDF contains patch pages. The separation works perfectly, as does the rest of the archiving process, but the workflow never matches because the separated files are written to a temp directory instead of the original consume subdirectory. See the relevant paperless logs below.

Steps to reproduce

  1. Scan multiple pages with patch pages between them
  2. Document is separated, but parts are written to a temp directory
  3. Workflow not triggered because of the different directory name
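
The mismatch in step 3 can be reproduced outside of paperless with a plain glob check. This is only a minimal sketch: it assumes the workflow path filter behaves like a shell-style glob, which is what the "does not match" debug message in the logs suggests. `path_matches` is a hypothetical helper, not a paperless-ngx function, and the paths are copied verbatim from the logs (including the redacted `<username>` placeholder).

```python
from fnmatch import fnmatch

# Filter pattern as it appears in the debug log (placeholder kept verbatim).
FILTER_PATH = "/opt/paperless/consume/<username>/*"

# Path of the scan as it was dropped into the consume directory.
original = "/opt/paperless/consume/<username>/20241130123638.pdf"

# Path of the first split part, as reported in the debug log.
split_part = (
    "/tmp/paperless/paperless-barcode-split-ukbdnefe/"
    "20241130123638_document_0.pdf"
)


def path_matches(document_path: str, pattern: str = FILTER_PATH) -> bool:
    """Hypothetical stand-in for a glob-based workflow path filter."""
    return fnmatch(document_path, pattern)


print(path_matches(original))    # True
print(path_matches(split_part))  # False
```

The original file satisfies the filter, while the split part does not, because only the temp path is checked once the split documents are queued as new tasks.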

Webserver logs

[...]
[INFO] [paperless.management.consumer] Adding /opt/paperless/consume/<username>/20241130123638.pdf to the task queue.
[DEBUG] [paperless.tasks] Skipping plugin CollatePlugin
[DEBUG] [paperless.tasks] Executing plugin BarcodePlugin
[DEBUG] [paperless.barcodes] Scanning for barcodes using PYZBAR
[DEBUG] [paperless.barcodes] PDF has 13 pages
[DEBUG] [paperless.barcodes] Processing page 0
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/0055ee44-0255-4ce2-a034-969139493bdf-01.ppm
[DEBUG] [paperless.barcodes] Processing page 1
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/3b12f568-8b6a-4522-9dcc-cc3966daa480-02.ppm
[DEBUG] [paperless.barcodes] Processing page 2
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/9d98ef4f-6ac2-4626-8740-8dd9fb2dc950-03.ppm
[DEBUG] [paperless.barcodes] Processing page 3
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/b9c5d1ab-2011-4542-9a2e-25992b7daca4-04.ppm
[DEBUG] [paperless.barcodes] Processing page 4
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/3d9888fa-ef6e-4b44-841f-a059829476c9-05.ppm
[DEBUG] [paperless.barcodes] Barcode of type CODE39 found: PATCHT
[DEBUG] [paperless.barcodes] Processing page 5
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/e0272563-968c-4849-94f3-d13e52a77796-06.ppm
[DEBUG] [paperless.barcodes] Processing page 6
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/18152b14-deec-4dfc-84f9-eb942cded688-07.ppm
[DEBUG] [paperless.barcodes] Processing page 7
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/0ab33663-8150-4184-ac51-d20180ec0c9c-08.ppm
[DEBUG] [paperless.barcodes] Barcode of type CODE39 found: PATCHT
[DEBUG] [paperless.barcodes] Processing page 8
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/14dd28d1-af92-47fd-a54b-ba97539ea5d5-09.ppm
[DEBUG] [paperless.barcodes] Processing page 9
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/59eb384f-1215-4405-a607-3723b016bb91-10.ppm
[DEBUG] [paperless.barcodes] Processing page 10
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/e3197da0-613c-437a-b8e1-b94591bbcc32-11.ppm
[DEBUG] [paperless.barcodes] Processing page 11
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/531f87b0-953b-4009-889e-a6d1ba7a310b-12.ppm
[DEBUG] [paperless.barcodes] Processing page 12
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmpgmkiy024/barcoderhwvsp37/acf35ed3-b237-44fc-bb54-3791c0d4601f-13.ppm
[DEBUG] [paperless.barcodes] Starting new document at idx 4
[DEBUG] [paperless.barcodes] Starting new document at idx 7
[DEBUG] [paperless.barcodes] Split into 3 new documents
[DEBUG] [paperless.barcodes] pdf no:0 has 4 pages
[DEBUG] [paperless.barcodes] pdf no:1 has 2 pages
[DEBUG] [paperless.barcodes] pdf no:2 has 5 pages
[INFO] [paperless.barcodes] Created new task fe9fadae-571c-4538-bbe1-bb92a5a3661e for 20241130123638_document_0.pdf
[INFO] [paperless.barcodes] Created new task caa4c911-e26f-4fc8-ab42-42a6d5f3afc0 for 20241130123638_document_1.pdf
[INFO] [paperless.barcodes] Created new task a18bfb5f-6b0f-48e1-b44d-9ddce5af183b for 20241130123638_document_2.pdf
[INFO] [paperless.tasks] BarcodePlugin requested task exit: Barcode splitting complete!
[DEBUG] [paperless.tasks] Skipping plugin CollatePlugin
[DEBUG] [paperless.tasks] Executing plugin BarcodePlugin
[DEBUG] [paperless.barcodes] Scanning for barcodes using PYZBAR
[DEBUG] [paperless.barcodes] PDF has 4 pages
[DEBUG] [paperless.barcodes] Processing page 0
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmp8w5_5fbz/barcodewr4s3jf0/b573d4af-e8fc-4a50-8378-03bcb465d5b4-1.ppm
[DEBUG] [paperless.barcodes] Processing page 1
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmp8w5_5fbz/barcodewr4s3jf0/1fb25ee9-6268-4020-bf0a-8e611503b175-2.ppm
[DEBUG] [paperless.barcodes] Processing page 2
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmp8w5_5fbz/barcodewr4s3jf0/be850e94-7427-4114-9c60-5106390c262b-3.ppm
[DEBUG] [paperless.barcodes] Processing page 3
[DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmp8w5_5fbz/barcodewr4s3jf0/c94d35f6-0e34-4292-aff5-d0cf9711fbc2-4.ppm
[INFO] [paperless.tasks] BarcodePlugin completed with no message
[DEBUG] [paperless.tasks] Executing plugin WorkflowTriggerPlugin
[INFO] [paperless.matching] Document did not match Workflow: <Username>s Files
*** here we fail [DEBUG] [paperless.matching] ('Document path /tmp/paperless/paperless-barcode-split-ukbdnefe/20241130123638_document_0.pdf does not match /opt/paperless/consume/<username>/*',)
[INFO] [paperless.tasks] WorkflowTriggerPlugin completed with: 
[DEBUG] [paperless.tasks] Executing plugin ConsumeTaskPlugin
[INFO] [paperless.consumer] Consuming 20241130123638_document_0.pdf
[...]
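
For reference, the page counts in the log above are consistent with separator pages being dropped from the output. The sketch below is an illustration of that arithmetic only, not the BarcodePlugin code; `split_pages` is a hypothetical helper and the assumption that separator pages are discarded is inferred from the logged counts (13 pages, separators at index 4 and 7, parts of 4, 2 and 5 pages).

```python
def split_pages(total_pages: int, separator_pages: list[int]) -> list[list[int]]:
    """Group zero-based page indices into documents, dropping separator pages.

    Assumption: each separator page ends the current document and is not
    included in any output document.
    """
    separators = set(separator_pages)
    documents: list[list[int]] = []
    current: list[int] = []
    for page in range(total_pages):
        if page in separators:
            # Close the document collected so far, if it has any pages.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    # Flush the trailing document after the last separator.
    if current:
        documents.append(current)
    return documents


parts = split_pages(13, [4, 7])
print([len(p) for p in parts])  # [4, 2, 5]
```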

Browser logs

No response

Paperless-ngx version

2.13.5

Host OS

Debian 6.1.119 bookworm

Installation method

Bare metal

System status

No response

Browser

No response

Configuration changes

No response

shamoon commented 2 days ago

This is a known limitation, unfortunately.

Feel free to open a feature request.