naiveHobo / InvoiceNet

Deep neural network to extract intelligent information from invoice documents.
MIT License

Extracting line/tabular data from Invoices #62

Open Vedant-Tibrewal opened 3 years ago

Vedant-Tibrewal commented 3 years ago

@naiveHobo All the fields targeted for extraction contain a single value only (like total_amount, vendor_name, etc.). Is there any way to capture tabular data?

lucky-verma commented 3 years ago

@Vedant-Tibrewal some useful links:

  1. https://github.com/tomassosorio/OCR_tablenet
  2. https://arxiv.org/abs/2001.01469

I think these may solve your issue.

r-toroxel commented 2 years ago

@Vedant-Tibrewal Though not a complete solution for tabular data, you can try the OPTIONAL data type if you know that lists/tables won't be longer than a (small) limit.
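A rough sketch of one way to read this suggestion, in plain Python and independent of InvoiceNet's own code: give each table cell its own single-value field, up to a fixed row limit, and leave unused slots empty. The names `MAX_ROWS`, `COLUMNS`, `flatten_line_items`, `unflatten_line_items`, and the `line_<n>_<column>` naming scheme are all hypothetical. How such fields would be registered as OPTIONAL in InvoiceNet is not covered in this thread and is not shown here.

```python
from typing import Dict, List

MAX_ROWS = 3  # assumed small upper bound on line items per invoice
COLUMNS = ("description", "quantity", "amount")  # hypothetical column names


def flatten_line_items(rows: List[Dict[str, str]],
                       max_rows: int = MAX_ROWS) -> Dict[str, str]:
    """Map up to max_rows table rows onto fixed, single-value field names.

    Slots with no matching row are filled with empty strings, which is the
    role an optional field would play.
    """
    if len(rows) > max_rows:
        raise ValueError(f"table has {len(rows)} rows, limit is {max_rows}")
    flat: Dict[str, str] = {}
    for i in range(max_rows):
        row = rows[i] if i < len(rows) else {}
        for col in COLUMNS:
            flat[f"line_{i + 1}_{col}"] = row.get(col, "")
    return flat


def unflatten_line_items(flat: Dict[str, str],
                         max_rows: int = MAX_ROWS) -> List[Dict[str, str]]:
    """Rebuild the table from the flat fields, dropping fully empty rows."""
    rows = []
    for i in range(max_rows):
        row = {col: flat.get(f"line_{i + 1}_{col}", "") for col in COLUMNS}
        if any(row.values()):
            rows.append(row)
    return rows
```

With these assumed settings, a one-row table becomes nine flat fields (three per slot), six of them empty. The field count grows as rows times columns, so this only stays manageable for small limits.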