The process_transactions() and extract_summary_values_and_confidence() functions contain logic to convert string values to numbers if they are amounts:
https://github.com/emdeh/pdf-document-processor/blob/cb5414a78a2193739ad979a60229cb0f8fb3e90e/src/csv_utils.py#L44-L53
https://github.com/emdeh/pdf-document-processor/blob/cb5414a78a2193739ad979a60229cb0f8fb3e90e/src/csv_utils.py#L97-L111
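For context, string-to-number conversion of this kind usually looks something like the sketch below. This is not the code at the permalinks above: parse_amount() is a hypothetical helper, and the handling of commas, dollar signs and accounting-style parentheses is an assumption about what amount strings might look like.

```python
import math
from typing import Optional


def parse_amount(text: str) -> Optional[float]:
    """Return a float for strings that look like amounts, else None.

    Hypothetical helper for illustration only; the formats handled here
    (thousands separators, "$", parentheses for negatives) are assumptions.
    """
    cleaned = text.strip().replace(",", "").replace("$", "")
    # Treat accounting-style parentheses as a negative amount.
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    try:
        value = float(cleaned)
    except ValueError:
        return None
    # float() also accepts "nan" and "inf"; those are not amounts.
    return value if math.isfinite(value) else None
```

If the models already return numeric values, a helper like this would only ever see non-string input and could be dropped.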
Statement models trained from 4/4/24 onwards set the value within the Custom Extraction Models themselves, which means this logic may no longer be required.
However, it has to be checked that the values are indeed written as the correct data type without this logic. The way in which the extracted data is returned by analyse_document() may need to be reviewed as well.
If the logic is indeed redundant, the Amex model will need to be retrained with the correct data types set on the labels.
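One way to run the check described above is to disable the conversion temporarily and assert on the types of the extracted rows. The sketch below assumes each extracted row is available as a plain dict of field name to value; non_numeric_amounts() and the field names in the usage example are hypothetical and do not reflect the actual return shape of analyse_document().

```python
from typing import Any, Dict, Iterable, List


def non_numeric_amounts(row: Dict[str, Any], amount_fields: Iterable[str]) -> List[str]:
    """Return the names of amount fields whose values are present but not int/float.

    Hypothetical helper; assumes a row is a plain dict of field name -> value.
    """
    bad = []
    for name in amount_fields:
        value = row.get(name)
        if value is None:
            # Missing or empty fields are a separate concern; skip them here.
            continue
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            bad.append(name)
    return bad


# Usage with made-up field names: "Amount" is still a string, so it is reported.
row = {"Date": "01/05/24", "Amount": "12.50", "Balance": 100.0}
print(non_numeric_amounts(row, ["Amount", "Balance"]))
```

An empty list for every row from every model would suggest the conversion logic is redundant. Any model that still reports string amounts would be a candidate for retraining.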