mincong-h / finance-toolkit

Finance Toolkit

Fix accountBalance format and file encoding #73

Closed mincong-h closed 2 years ago

mincong-h commented 2 years ago

Account Balance

TL;DR: format(amount)=FR, format(accountbalance)=ISO

Previously, the column "accountbalance" was written in French format (comma as the decimal separator, space as the thousands separator).

Now it uses the ISO format without a thousands separator.

The other column, "amount", remains unchanged, i.e. it is still in French format. To fix this, we read the column "accountbalance" as a string when loading the CSV file and cast it to a number ourselves.
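
A minimal sketch of this approach, not the actual change in finance_toolkit/boursorama.py. The sample rows are invented for illustration, and the sketch assumes that "ISO format" means a dot as the decimal separator:

```python
import io

import pandas as pd

# Hypothetical sample data: "amount" is in French format,
# "accountbalance" is assumed to use a dot as decimal separator.
csv_text = (
    "dateOp;label;amount;accountbalance\n"
    "2022-06-10;CARD PAYMENT;-1 234,56;3210.99\n"
    "2022-06-11;SALARY;2 000,00;5210.99\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),
    delimiter=";",
    decimal=",",      # French decimal separator, applies to "amount"
    thousands=" ",    # French thousands separator, applies to "amount"
    dtype={"accountbalance": "str"},  # keep the raw text, no French parsing
    parse_dates=["dateOp"],
)

# Cast the balance ourselves, independently of the French settings above.
df["accountbalance"] = df["accountbalance"].astype(float)
```

Reading the column as a string keeps the `decimal` and `thousands` options from being applied to values that do not follow the French convention.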

Encoding

Previously the encoding of the file was ISO-8859-1; it has now been changed to UTF-8. When the file is read as ISO-8859-1, the first header is no longer recognized as dateOp, which produces this error:

finance_toolkit.pipeline.PipelineDataError: Failed to read new Boursorama data. Details:
  path=/data/source/export-operations-11-06-2022_09-52-55.csv
  headers=dateOp;dateVal;label;category;categoryParent;amount;comment;accountNum;accountLabel;accountbalance
  pandas_kwargs={'decimal': ',', 'delimiter': ';', 'dtype': {'accountNum': 'str'}, 'encoding': 'ISO-8859-1', 'parse_dates': ['dateOp', 'dateVal'], 'skipinitialspace': True, 'thousands': ' '}
  pandas_error=Missing column provided to 'parse_dates': 'dateOp'
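
One plausible cause, which this description does not confirm, is a UTF-8 byte order mark (BOM) at the start of the exported file. Decoded as ISO-8859-1, the three BOM bytes become three visible characters in front of the first header, so that column is no longer named dateOp. The sketch below uses invented sample data and shows the effect under that assumption:

```python
import io

import pandas as pd

# Assumption: the export starts with a UTF-8 BOM ("utf-8-sig" adds one).
raw = "dateOp;dateVal;amount\n2022-06-10;2022-06-10;1,00\n".encode("utf-8-sig")

# Read with the old encoding: the BOM bytes are decoded as ordinary
# characters and end up in front of the first column name.
wrong = pd.read_csv(io.BytesIO(raw), delimiter=";", encoding="ISO-8859-1")
first_header_latin1 = wrong.columns[0]

# Read with UTF-8: the first column is named "dateOp", so parse_dates works.
right = pd.read_csv(
    io.BytesIO(raw),
    delimiter=";",
    decimal=",",
    encoding="utf-8",
    parse_dates=["dateOp", "dateVal"],
)
first_header_utf8 = right.columns[0]
```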

codecov-commenter commented 2 years ago

Codecov Report

Merging #73 (4c360d8) into master (bc55bb8) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #73   +/-   ##
=======================================
  Coverage   94.23%   94.23%           
=======================================
  Files          11       11           
  Lines         607      607           
  Branches       97       97           
=======================================
  Hits          572      572           
  Misses         20       20           
  Partials       15       15           

Impacted Files                   Coverage Δ
finance_toolkit/boursorama.py    94.20% <100.00%> (ø)

mincong-h commented 2 years ago

It's related to https://github.com/mincong-h/finance-toolkit/issues/40