osrjv closed this 1 year ago
Closing this as won't fix. It was decided that for image-based PDFs (and other documents where text search is not enough), we should use a more accurate AI-trained model through RPA.DocumentAI and extract structured information from it.
Let's sync with @tonnitommi first on whether we still think we should support this ourselves with local processing through Tesseract and rpaframework-recognition, as the issue implies.
Currently, template matching and OCR are only supported in a desktop automation context, but a library (probably RPA.Images) should be made that supports the same functionality with a user-given document instead.
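To make the request concrete, here is a minimal sketch of template matching against a user-given image rather than the screen. It is not the API of RPA.Images or rpaframework-recognition; `find_template`, `max_diff`, and the list-of-lists image type are hypothetical names chosen for illustration, and the exhaustive sum-of-absolute-differences search stands in for whatever matcher a real library would use.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a decoded grayscale page: rows of pixel values.
Image = List[List[int]]


def find_template(
    page: Image, template: Image, max_diff: int = 0
) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the best template match in page, or None.

    Slides the template over every position in the page and scores each
    position by the sum of absolute pixel differences (lower is better).
    A match is accepted only if its score is at most max_diff.
    """
    page_h, page_w = len(page), len(page[0])
    tmpl_h, tmpl_w = len(template), len(template[0])

    best_pos = None
    best_score = None
    for r in range(page_h - tmpl_h + 1):
        for c in range(page_w - tmpl_w + 1):
            score = sum(
                abs(page[r + i][c + j] - template[i][j])
                for i in range(tmpl_h)
                for j in range(tmpl_w)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score

    # No candidate positions (template larger than page) or no close match.
    if best_score is None or best_score > max_diff:
        return None
    return best_pos


# Usage: the "page" comes from a user-given document, not a screenshot.
page = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
match = find_template(page, template)
```

The point of the sketch is the input: the matcher takes an image supplied by the user, so the same logic could run on a rasterized PDF page without any desktop session.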