ciur / papermerge

Open Source Document Management System for Digital Archives (Scanned Documents)
https://papermerge.com
Apache License 2.0
2.48k stars, 261 forks

Configure Workers to annotate on Import #382

Open jwalzer opened 3 years ago

jwalzer commented 3 years ago

Is your feature request related to a problem? Please describe. I would like to have multiple workers running, each monitoring a different IMPORTER_DIR (probably on different hosts). The imported documents should then be classified depending on their input location.

Describe the solution you'd like An optimal solution could be that a worker tags a document automatically with some meta-information, like:

optionally:

Describe alternatives you've considered I thought about abandoning the idea of the IMPORTER_DIR completely and solving it via shell scripting, as in https://github.com/Ryther/papermerge-importer, but I see value in having this implemented in Papermerge.

Additional context Having importer metadata that can be queried in the automates allows for much more sophisticated workflows and use cases.

ciur commented 3 years ago

Actually, you can run multiple workers, each with a different importer directory. Note that IMPORTER_DIR is a worker-specific configuration, i.e. it can differ from worker to worker.

What is not there, and what I consider a good idea, is that the "imported Documents should then be classified depending on their Input-Location". At the very least, tagging docs differently depending on their origin would be nice.
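As a rough illustration of a per-worker setting, the fragment below shows what one worker's configuration might contain. It assumes the worker reads a Python-style settings file; the file name and both directory paths are hypothetical and only meant to show that the value can differ between hosts.

```python
# Settings file for worker A (file name and paths are illustrative assumptions).
# Each worker watches only the directory named in its own configuration.
IMPORTER_DIR = "/srv/papermerge/import-invoices"

# A second worker, possibly on another host, would carry its own value, e.g.:
# IMPORTER_DIR = "/srv/papermerge/import-contracts"
```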

Thank you for opening this feature request.

jwalzer commented 3 years ago

Yes, my main request is the tagging. Because the workers communicate via queue/redis, there shouldn't be any issue concerning synchronisation. Multiple workers make it possible to have multiple ingestion points. Allowing every worker to be configured with some dedicated tags (maybe even freeform) would also make it possible to delegate worker setup in a highly distributed environment: different people can set up their own workers with the tags they use. The only thing missing would be a secure way to determine the user into whose account the document is injected.

mutax commented 3 years ago

My scenario is a scanner that has three quick-scan buttons that put the document into different directories on my samba share. This way, with tagging based on the source directory, I can already pre-classify the document. Would be very useful here!
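The directory-based pre-classification described here could be sketched roughly as follows. This is not Papermerge code: `DIR_TAGS`, `tags_for`, and the share paths are hypothetical names chosen for illustration, and the sketch only shows how a source directory might be mapped to a list of tags.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from scan directory to tags; these names and
# paths are assumptions, not Papermerge settings.
DIR_TAGS = {
    "/srv/samba/scan/invoices": ["invoice"],
    "/srv/samba/scan/letters": ["letter"],
    "/srv/samba/scan/receipts": ["receipt"],
}


def tags_for(path, dir_tags=DIR_TAGS):
    """Return the tags of the first configured directory that contains path."""
    parents = PurePosixPath(path).parents
    for directory, tags in dir_tags.items():
        # Match files in the directory itself or in any of its subdirectories.
        if PurePosixPath(directory) in parents:
            return list(tags)
    # Files outside every configured directory get no tags.
    return []
```

With this mapping, a file dropped by the "invoices" quick-scan button would come back with the `invoice` tag, while a file from an unknown location would stay untagged.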

ciur commented 3 years ago

Hi @jwalzer, thank you for your kind donation. The feature you are asking for will make its way into Papermerge 2.1. However, please keep in mind that Papermerge 2.1 is scheduled for December 2021. I intentionally decided to spend more time developing the next release to address all the accumulated technical debt. In any case, I assure you that it is worth the wait :wink:

jwalzer commented 3 years ago

No problem. The donation is for the job done so far ;) If you need a friendly tester for the features, drop me a note.