[Feature] - Templates for OCR (Zonal OCR) using KULL

I already saw your pinned post about automation. Looks great! I need exactly that kind of solution: something that automatically sorts and tags my PDFs according to their content.

Now I have a feature request: many of the expensive solutions offer Zonal OCR. They implement something called OCR templates. A template is nothing more than a file that defines several boxes (zones) in which OCR searches for text. One way to select such zones is this project:

https://jsoma.github.io/kull https://github.com/jsoma/kull

I have also recorded a short GIF of the template selection (templateSelection).

Now the trick: there are multiple zones defined:

unamed0
unamed1 and so on.
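
To make the idea a bit more concrete, here is a rough sketch of how such a template could be represented and applied. This is not how kull or paperless actually work; the `Zone` and `Word` classes, the `text_per_zone` function, and the coordinates are all made up for illustration. It assumes the OCR step already returns words together with their positions on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """One box of a (hypothetical) OCR template, in page coordinates."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # True if the point lies inside the box (borders included).
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass(frozen=True)
class Word:
    """One OCR'd word with the position of its center (assumed input format)."""
    text: str
    x: float
    y: float


def text_per_zone(zones, words):
    """Collect the words that fall into each zone, in reading order."""
    result = {}
    for zone in zones:
        inside = [w for w in words if zone.contains(w.x, w.y)]
        # Sort top-to-bottom, then left-to-right.
        inside.sort(key=lambda w: (w.y, w.x))
        result[zone.name] = " ".join(w.text for w in inside)
    return result


# Example template with two zones, named like the ones listed above.
TEMPLATE = [
    Zone("unamed0", x=50, y=40, width=300, height=60),
    Zone("unamed1", x=400, y=40, width=150, height=60),
]

# Example OCR output; the last word lies outside every zone and is ignored.
WORDS = [
    Word("ACME", 60, 50),
    Word("GmbH", 120, 50),
    Word("Invoice", 410, 55),
    Word("Dear", 60, 300),
]

if __name__ == "__main__":
    print(text_per_zone(TEMPLATE, WORDS))
```

The point is only that a template is a small list of named boxes, and everything outside those boxes is ignored.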

If we could run a RegEx or any other string comparison function on each zone individually, we would have an extremely powerful detection engine.

AND:

If we could also use some of the content of these zones as metadata, we would have one of the most powerful intelligent classification engines out there...
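
A minimal sketch of what I mean, assuming the text of every zone is already available as a plain dictionary. The `RULES` table, the `classify` function, and the sample texts are hypothetical and not part of paperless. A document only counts as detected if every zone matches its RegEx, and named groups in the patterns become metadata.

```python
import re

# Hypothetical rules: one regular expression per zone name.
# Named groups (?P<name>...) mark the parts that should become metadata.
RULES = {
    "unamed0": re.compile(r"ACME\s+GmbH", re.IGNORECASE),
    "unamed1": re.compile(r"Invoice\s+No\.?\s*(?P<invoice_number>\d+)"),
}


def classify(zone_texts, rules):
    """Return extracted metadata if every zone matches its rule, else None."""
    metadata = {}
    for zone_name, pattern in rules.items():
        match = pattern.search(zone_texts.get(zone_name, ""))
        if match is None:
            # One zone failed, so the template does not apply to this document.
            return None
        # Collect the named groups of this zone as metadata fields.
        metadata.update(match.groupdict())
    return metadata


if __name__ == "__main__":
    zone_texts = {"unamed0": "Acme GmbH", "unamed1": "Invoice No. 2024"}
    print(classify(zone_texts, RULES))
```

Returning `None` for a non-matching document keeps "template does not apply" separate from "template applies but defines no metadata", which would be an empty dictionary.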

What do you think? I have not looked through the code yet, but if possible, I would like to help implement this feature.

Thank you very much.
