Support image-based automation with PDFs (and others)

robocorp / rpaframework

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python

https://www.rpaframework.org/

Apache License 2.0

1.17k stars, 227 forks

Support image-based automation with PDFs (and others) #103

Closed osrjv closed 1 year ago

osrjv commented 3 years ago

Currently, template matching and OCR are supported only in a desktop automation context. A library (probably RPA.Images) should provide the same functionality for a user-supplied document instead of the screen.
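To make the request concrete, here is a minimal sketch of template matching against a document page rather than a screenshot. It is an illustration only, not the rpaframework implementation: it assumes the page has already been rendered to a grayscale pixel grid, and `find_template`, `page`, and `template` are hypothetical names. The scoring is a naive sum of absolute differences in pure Python; a real implementation would more likely rely on an image library.

```python
from typing import List, Optional, Tuple

# Assumption: a page is a grayscale image stored as rows of 0-255 integers.
Image = List[List[int]]


def find_template(
    page: Image, template: Image, tolerance: int = 0
) -> Optional[Tuple[int, int]]:
    """Return the (x, y) top-left corner of the best match, or None.

    Every window position is scored with the sum of absolute differences
    (SAD). The best position is accepted only if its score is at most
    `tolerance`.
    """
    page_h, page_w = len(page), len(page[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score = None
    best_pos = None
    for y in range(page_h - tpl_h + 1):
        for x in range(page_w - tpl_w + 1):
            score = sum(
                abs(page[y + dy][x + dx] - template[dy][dx])
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            # Keep the first position with the lowest score seen so far.
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    if best_score is not None and best_score <= tolerance:
        return best_pos
    return None


# Hypothetical 5x4 "rendered page" containing a 2x2 pattern.
page = [
    [0, 0, 0, 0, 0],
    [0, 0, 255, 200, 0],
    [0, 0, 180, 255, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [255, 200],
    [180, 255],
]
match = find_template(page, template)
print(match)
```

The point of the sketch is that the matching step itself does not care where the pixels come from, so the same logic could in principle run on a rendered PDF page as easily as on a desktop screenshot.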

cmin764 commented 1 year ago

Closing this as "won't fix": it was decided that image-based PDFs (and other documents where text search is not enough) should instead go through a more accurate, AI-trained model via RPA.DocumentAI, which returns structured information.

Let's first sync with @tonnitommi on whether we still think we should support this effort ourselves, locally, with help from Tesseract and rpaframework-recognition, as implied.