Open m-prodan opened 1 year ago
Check whether your own OCR library can highlight search terms in the redaction app.
Try scanned PDFs and handwritten notes.
I have uploaded my analysis notes above. Moving this task to Review; we will wait for the PDFTron demo before making a final decision.
Need to analyze PDFTron's backend OCR library and other Tesseract front-end options. Need to re-estimate when we start this task; moving it back to Product Backlog. cc: @nkan-aot , @m-prodan
New thought, as discussed: #1, use the scanner hardware with any available software to scan directly to PDF (a must!), and #2, if possible, convert those PDFs to searchable PDFs so that we can use them in our DocReviewer app. We need to try this at the FISGARD office on a machine that has access to the Scanning Team's scanners, preferably their exact machine so that the options are tested with their system resources. In other words, to do this my IDIR needs access to those scanners and workstations. cc: @lmullane , @m-prodan
Here is additional analysis after the meeting with PDFTron: OCR comparison.xlsx. Python POCs have been uploaded to the MS Teams dev channel folder. The tesseract.py POC has been pushed to branch dev-NK-4238.
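For anyone picking this up, here is a rough sketch of how search-term highlighting could work on top of Tesseract output. It is not the tesseract.py POC from the branch. It assumes the `pytesseract` and Pillow packages plus a local Tesseract install, and the function names `find_term_boxes` and `ocr_image` are made up for illustration. The idea is that `pytesseract.image_to_data` returns one bounding box per recognized word, so the redaction app could draw highlights over the boxes whose text matches the search term.

```python
from typing import Dict, List, Tuple

# (left, top, width, height) in image pixels
Box = Tuple[int, int, int, int]


def find_term_boxes(ocr_data: Dict[str, list], term: str) -> List[Box]:
    """Return boxes of OCR'd words containing `term` (case-insensitive).

    `ocr_data` is expected to have the parallel-list shape that
    pytesseract.image_to_data(..., output_type=Output.DICT) returns,
    with at least the keys: text, left, top, width, height.
    """
    needle = term.strip().lower()
    if not needle:
        return []
    boxes: List[Box] = []
    for i, word in enumerate(ocr_data["text"]):
        if needle in str(word).lower():
            boxes.append((
                ocr_data["left"][i],
                ocr_data["top"][i],
                ocr_data["width"][i],
                ocr_data["height"][i],
            ))
    return boxes


def ocr_image(path: str) -> Dict[str, list]:
    """Run Tesseract on one image file and return per-word data."""
    # Imported lazily so find_term_boxes works without Tesseract installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_data(
            img, output_type=pytesseract.Output.DICT
        )
```

Usage would look like `find_term_boxes(ocr_image("scan.png"), "invoice")`, where `scan.png` is a placeholder file name. This only matches single words; multi-word phrases and handwritten notes would need more work, and handwriting accuracy in particular is something to measure rather than assume.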
Title of ticket:
Description
This is a key dependency for #4120. We know PDFTron uses a third-party OCR library; can we do the same and make it work to OCR image files?
Dependencies
Are there any dependencies?
DOD