invoice-x / invoice2data

Extract structured data from PDF invoices
MIT License

Extract summary lines from table groups #567

Open daniele123321 opened 1 week ago

daniele123321 commented 1 week ago

Hello, I am trying to extract data from the following invoice:

Date       Time      Product            Location        Price
25/04/2024 06:22     TYPE2      MILAN       9.54
26/04/2024 23:37     TYPE2      ROME        10.67
Total for product : TYPE2 20.21
20/06/2024 07:07     TYPE1      MILAN       497.33
21/06/2024 12:25     TYPE1      ROME        572.39
21/06/2024 12:42     TYPE1      ROME        289.83
Total for product : TYPE1 1,369.55
TOTAL FOR CLIENT1 1,379.76

17/07/2024 09:45     TYPE1      MILAN       732.56
17/07/2024 09:59     TYPE1      MILAN       462.37
Total for product : TYPE1 1,194.93
TOTALE FOR CLIENT2 1,194.93

Using the lines parser, I can correctly extract every individual item with its date, product type, location, and price. However, I also need to extract the client name for each group (CLIENT1, CLIENT2, ... in the example above) and ideally add it to each line of that group, but I can't find a way to do this. Can it be done with a template alone, or do I need a custom plugin?

Thank you for your time

bosd commented 1 week ago

Thanks for your interest in invoice2data. The feature you are looking for is not currently implemented and would need to be developed. It is a reasonable use case, so if a PR is opened for it, I will accept it.
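
Until such a feature exists, one possible workaround is to post-process the raw invoice text outside the template. The following is only a minimal sketch in plain Python using the standard `re` module; it does not use or extend the invoice2data template API. The names `ITEM_RE`, `CLIENT_RE`, and `group_lines_by_client` are hypothetical, and the regular expressions assume the text layout shown in the question (including the `TOTALE FOR` spelling on the second group), so they would likely need adjusting for real invoices.

```python
import re

# Sample text copied from the invoice in the question.
SAMPLE = """\
Date       Time      Product            Location        Price
25/04/2024 06:22     TYPE2      MILAN       9.54
26/04/2024 23:37     TYPE2      ROME        10.67
Total for product : TYPE2 20.21
20/06/2024 07:07     TYPE1      MILAN       497.33
21/06/2024 12:25     TYPE1      ROME        572.39
21/06/2024 12:42     TYPE1      ROME        289.83
Total for product : TYPE1 1,369.55
TOTAL FOR CLIENT1 1,379.76

17/07/2024 09:45     TYPE1      MILAN       732.56
17/07/2024 09:59     TYPE1      MILAN       462.37
Total for product : TYPE1 1,194.93
TOTALE FOR CLIENT2 1,194.93
"""

# Hypothetical pattern for one item line: date, time, product, location, price.
ITEM_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{2}:\d{2})\s+"
    r"(?P<product>\S+)\s+(?P<location>\S+)\s+(?P<price>[\d,]+\.\d{2})$"
)

# Hypothetical pattern for the client summary line that closes a group.
# Case-sensitive on purpose, so "Total for product : ..." is not matched.
CLIENT_RE = re.compile(
    r"^TOTALE?\s+FOR\s+(?P<client>\S+)\s+(?P<total>[\d,]+\.\d{2})$"
)


def group_lines_by_client(text):
    """Return item rows as dicts, each tagged with the client of its group."""
    rows = []      # finished rows, already tagged with a client
    pending = []   # item rows seen since the last client summary line
    for raw in text.splitlines():
        line = raw.strip()
        item = ITEM_RE.match(line)
        if item:
            pending.append(item.groupdict())
            continue
        client = CLIENT_RE.match(line)
        if client:
            # The summary line comes after its items, so tag them retroactively.
            for row in pending:
                row["client"] = client.group("client")
            rows.extend(pending)
            pending = []
    return rows


for row in group_lines_by_client(SAMPLE):
    print(row["client"], row["date"], row["product"], row["location"], row["price"])
```

Because the client name appears only after the lines it applies to, the sketch buffers item rows and assigns the client once the summary line is reached. Item lines that are never followed by a client summary line are dropped in this version.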