invoice-x / invoice2data

Extract structured data from PDF invoices
MIT License
1.8k stars 476 forks

Added parsing for vat_rate and address fields. Added static partner name field #556

Closed by esteve 4 months ago

esteve commented 4 months ago

This PR adds fields for VAT rate, address and partner name for Pepephone invoices.

esteve commented 4 months ago

@bosd pdftotext extracts it all in one line like this:

PEPEMOBILE S.L.Registro Mercantil de Madrid, Tomo 24.019, Libro 0, Folio 108, Sección 8, Hoja M-431409 e Inscripción 10ª. CIF: ESB85033470 Avda de Bruselas, num. 38, 28108 Alcobendas, Madrid. España

I've used the CIF to find that line, but the capture group only takes the address.
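
For illustration, here is a minimal sketch of that idea in plain Python. The pattern and the `extract_address` helper are hypothetical and are not taken from the PR's template. The sketch only shows how the `CIF:` label can anchor the match while the single capture group keeps the address that follows it.

```python
import re

# Sample line as quoted in the comment above (pdftotext output on one line).
LINE = (
    "PEPEMOBILE S.L.Registro Mercantil de Madrid, Tomo 24.019, Libro 0, "
    "Folio 108, Sección 8, Hoja M-431409 e Inscripción 10ª. "
    "CIF: ESB85033470 Avda de Bruselas, num. 38, 28108 Alcobendas, Madrid. España"
)

# Hypothetical pattern (an assumption, not the regex used in the PR):
# "CIF: <id>" locates the line, and the only capture group takes the
# text that comes after the tax ID, i.e. the address.
ADDRESS_RE = re.compile(r"CIF:\s*\w+\s+(.+)")


def extract_address(text):
    """Return the text following 'CIF: <id>', or None if there is no match."""
    match = ADDRESS_RE.search(text)
    return match.group(1).strip() if match else None


address = extract_address(LINE)
print(address)
```

Everything before the capture group, including the company name and the registry details, is matched or skipped but never returned, so the field value contains only the address.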

bosd commented 4 months ago

Thanks, now I understand