invoice-x / invoice2data

Extract structured data from PDF invoices
MIT License
1.8k stars 476 forks

Use DocTR & PaddleOCR for OCR #526

Open princeharish opened 1 year ago

princeharish commented 1 year ago

DocTR and PaddleOCR have become very popular, and they are accurate as well. In most cases they don't even need the images or PDFs to be preprocessed before OCR. Combined with invoice2data, that would be a powerful combination.

bosd commented 1 year ago

Using some form of trained text detection is very interesting!