Closed: vapinv closed this issue 1 year ago.
Your invoice contains the date 08/07/23: three values separated with /.
You tell it to try to parse that using the %m-%d-%y format, which is made of three values separated with -.
That hint clearly can't be used, because the suggested format doesn't match the format of the parsed value. Fix the separator in the suggested format to match the separator used in the actual invoice.
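The separator mismatch can be reproduced with a minimal sketch using Python's standard-library datetime.strptime. This is only a stand-in to illustrate the point: invoice2data may hand the format to a different parsing library, and the helper name try_parse is hypothetical.

```python
from datetime import datetime


def try_parse(value, fmt):
    """Return the parsed datetime, or None when value does not match fmt.

    Hypothetical helper for illustration; not part of invoice2data.
    """
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        # strptime raises ValueError when the text does not fit the format
        return None


invoice_date = "08/07/23"  # the value matched by the date regex

# Dashes in the format, slashes in the data: no match
mismatched = try_parse(invoice_date, "%m-%d-%y")

# Slashes in both: parses as 7 August 2023
matched = try_parse(invoice_date, "%m/%d/%y")

print(mismatched)
print(matched)
```

The format string has to mirror the literal characters in the invoice text, not only the order of month, day, and year.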
Thank you for your assistance. Changing the hint does not fix the issue. I tried both '%m/%d/%y' and '%D' as seen below:
DEBUG:invoice2data.extract.invoice_template: END optimized_str ==========================
DEBUG:invoice2data.extract.invoice_template: Date parsing: languages=[] date_formats=['%m/%d/%y']
DEBUG:invoice2data.extract.invoice_template: Float parsing: decimal separator=[.]
DEBUG:invoice2data.extract.invoice_template: keywords=['Rents Stuff LLC']
DEBUG:invoice2data.extract.invoice_template: {'remove_whitespace': False, 'remove_accents': False, 'lowercase': False, 'currency': 'USD', 'date_formats': ['%m/%d/%y'], 'languages': [], 'decimal_separator': '.', 'replace': []}
DEBUG:invoice2data.extract.parsers.regex: field=amount | regex=Total:\s+(\d+.\d+\.\d+) | matches=['1,978.23']
DEBUG:invoice2data.extract.parsers.regex: field=invoice_number | regex=Invoice\s+Num\s+(SI\-\d+) | matches=['SI-68749']
DEBUG:invoice2data.extract.parsers.regex: field=date | regex=Invoice\s+Date:\s+(\d{2}\/\d{2}\/\d{2}) | matches=['08/07/23']
DEBUG:tzlocal: /etc/timezone found, contents:
America/Los_Angeles
DEBUG:tzlocal: /etc/localtime found
DEBUG:tzlocal: 2 found:
{'/etc/timezone': 'America/Los_Angeles', '/etc/localtime is a symlink to': 'America/Los_Angeles'}
DEBUG:invoice2data.extract.invoice_template: result of date parsing=2023-08-07 00:00:00
DEBUG:invoice2data.extract.invoice_template:
{ 'amount': 1978.23,
'currency': 'USD',
'date': datetime.datetime(2023, 8, 7, 0, 0),
'desc': 'Invoice from Rents',
'invoice_number': 'SI-68749',
'issuer': 'Rents'}
INFO:root: {'issuer': 'Rents', 'amount': 1978.23, 'invoice_number': 'SI-68749', 'date': datetime.datetime(2023, 8, 7, 0, 0), 'currency': 'USD', 'desc': 'Invoice from Rents'}
and
DEBUG:invoice2data.extract.invoice_template: Date parsing: languages=[] date_formats=['%D']
DEBUG:invoice2data.extract.invoice_template: Float parsing: decimal separator=[.]
DEBUG:invoice2data.extract.invoice_template: keywords=['Rents Stuff LLC']
DEBUG:invoice2data.extract.invoice_template: {'remove_whitespace': False, 'remove_accents': False, 'lowercase': False, 'currency': 'USD', 'date_formats': ['%D'], 'languages': [], 'decimal_separator': '.', 'replace': []}
DEBUG:invoice2data.extract.parsers.regex: field=amount | regex=Total:\s+(\d+.\d+\.\d+) | matches=['1,978.23']
DEBUG:invoice2data.extract.parsers.regex: field=invoice_number | regex=Invoice\s+Num\s+(SI\-\d+) | matches=['SI-68749']
DEBUG:invoice2data.extract.parsers.regex: field=date | regex=Invoice\s+Date:\s+(\d{2}\/\d{2}\/\d{2}) | matches=['08/07/23']
DEBUG:tzlocal: /etc/timezone found, contents:
America/Los_Angeles
DEBUG:tzlocal: /etc/localtime found
DEBUG:tzlocal: 2 found:
{'/etc/timezone': 'America/Los_Angeles', '/etc/localtime is a symlink to': 'America/Los_Angeles'}
DEBUG:invoice2data.extract.invoice_template: result of date parsing=2023-08-07 00:00:00
DEBUG:invoice2data.extract.invoice_template:
{ 'amount': 1978.23,
'currency': 'USD',
'date': datetime.datetime(2023, 8, 7, 0, 0),
'desc': 'Invoice from Rents',
'invoice_number': 'SI-68749',
'issuer': 'Rents'}
INFO:root: {'issuer': 'Rents', 'amount': 1978.23, 'invoice_number': 'SI-68749', 'date': datetime.datetime(2023, 8, 7, 0, 0), 'currency': 'USD', 'desc': 'Invoice from Rents'}
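In both runs the log reports "result of date parsing=2023-08-07 00:00:00" and the final output holds datetime.datetime(2023, 8, 7, 0, 0), which carries no timezone. That looks like a successfully parsed date shown in Python's default datetime text form rather than in the invoice's own style. A minimal sketch of that difference, using the standard-library datetime.strptime as a stand-in (an assumption; invoice2data may parse dates with another library):

```python
from datetime import datetime

# Stand-in for the parsing step: same value and format as in the logs above
parsed = datetime.strptime("08/07/23", "%m/%d/%y")

# Default text form of a datetime, as it appears in the debug line
default_text = str(parsed)

# The same value rendered back in the invoice's MM/DD/YY style
invoice_style = parsed.strftime("%m/%d/%y")

print(default_text)
print(invoice_style)
print(parsed.tzinfo)  # None: the parsed value is a naive datetime
```

If the original MM/DD/YY text is what is needed downstream, it can be produced from the datetime with strftime after extraction.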
I'm using a custom template to extract data from an invoice. The template finds the correct date, but then a local timezone is found and the parser replaces the regex data and fails to format it.
I'm new to coding so I could very well have set the date_formats wrong, but everything I find online and in the template folders seems to indicate it is correct. I tried adding the parser and type to the date field, but it still didn't work.
I haven't set up a Python script yet; this is just me trying to ensure everything works by running it in a bash terminal first. I've checked a ton of documentation on dateutils, utils, pyty, tz, and dateparser, and I'm no closer to solving this on my own. Any assistance with fixing this would be greatly appreciated.
My template:
Results of debugging: