mincong-h / finance-toolkit

Finance Toolkit

BNP: ValueError: 6 columns passed, passed data had 9 columns #92

Closed: mincong-h closed this issue 1 year ago

mincong-h commented 1 year ago

When trying to move the data of BNP Paribas, the command failed on the "Livret Dév. Durable et Solidaire" account with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/finance-toolkit", line 33, in <module>
    sys.exit(load_entry_point('finance-toolkit==0.1.0', 'console_scripts', 'finance-toolkit')())
  File "/usr/local/lib/python3.7/site-packages/finance_toolkit-0.1.0-py3.7.egg/finance_toolkit/__main__.py", line 71, in main
  File "/usr/local/lib/python3.7/site-packages/finance_toolkit-0.1.0-py3.7.egg/finance_toolkit/tx.py", line 240, in move
  File "/usr/local/lib/python3.7/site-packages/finance_toolkit-0.1.0-py3.7.egg/finance_toolkit/pipeline.py", line 30, in run
  File "/usr/local/lib/python3.7/site-packages/finance_toolkit-0.1.0-py3.7.egg/finance_toolkit/bnp.py", line 112, in read_new_transactions
  File "/usr/local/lib/python3.7/site-packages/finance_toolkit-0.1.0-py3.7.egg/finance_toolkit/bnp.py", line 57, in read_raw
  File "/usr/local/lib/python3.7/site-packages/pandas-1.2.0-py3.7-linux-x86_64.egg/pandas/core/frame.py", line 1855, in from_records
    arrays, arr_columns = to_arrays(data, columns, coerce_float=coerce_float)
  File "/usr/local/lib/python3.7/site-packages/pandas-1.2.0-py3.7-linux-x86_64.egg/pandas/core/internals/construction.py", line 528, in to_arrays
    return _list_to_arrays(data, columns, coerce_float=coerce_float, dtype=dtype)
  File "/usr/local/lib/python3.7/site-packages/pandas-1.2.0-py3.7-linux-x86_64.egg/pandas/core/internals/construction.py", line 571, in _list_to_arrays
    raise ValueError(e) from e
ValueError: 6 columns passed, passed data had 9 columns

After some analysis with @jingwen-z, we found that it happens when we:

  1. Go to download my operations
  2. Select the account ID of "Livret Dév. Durable et Solidaire"
  3. Download
  4. The download request gets stuck (no response even after 10s or longer)
  5. Use cmd+click to trigger a new request
  6. The new request succeeds, but the header has changed

Below are the different headers we see after repeating the download request (1, 2, 3 times):

"Livret D&eacute;v. Durable et Solidaire";"Livret D&amp;eacute;v. Durable et Solidaire";
"Livret D&amp;amp;eacute;v. Durable et Solidaire";"Livret D&amp;amp;amp;eacute;v. Durable et Solidaire";...
"Livret D&amp;amp;amp;amp;eacute;v. Durable et Solidaire";"Livret D&amp;amp;amp;amp;amp;eacute;v. Durable et Solidaire";

It's stateful: BNP escapes the header again (and again) for each new request in the same session 🤣 To mitigate this, we can:

  1. Wait longer, or disconnect the session and start again.
  2. Repeat the unescape operation we do in bnp.py until it becomes a no-op, i.e. until s == unescape(s): https://github.com/mincong-h/finance-toolkit/blob/50367adef4d22db319de328521e168c3b9e4b329/finance_toolkit/bnp.py#L46
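
The "unescape until no-op" idea could look roughly like the sketch below. It assumes the unescape in bnp.py behaves like the standard library's `html.unescape`. The helper name `unescape_fully` and the `max_rounds` safety limit are hypothetical and not part of the toolkit.

```python
from html import unescape


def unescape_fully(s: str, max_rounds: int = 20) -> str:
    """Apply html.unescape repeatedly until the string stops changing.

    Hypothetical helper: `max_rounds` is only a guard against looping
    forever on unexpected input.
    """
    for _ in range(max_rounds):
        decoded = unescape(s)
        if decoded == s:
            # Fixed point reached: one more unescape would be a no-op.
            return s
        s = decoded
    raise ValueError(f"still changing after {max_rounds} unescape passes")


# A header escaped three times, as seen after repeated download requests.
header = "Livret D&amp;amp;amp;eacute;v. Durable et Solidaire"

# A single pass only peels off one layer of escaping.
print(unescape(header))
# The loop peels off all layers.
print(unescape_fully(header))
```

One trade-off: this also decodes text that was legitimately meant to stay escaped (for example a literal `&amp;lt;`), which is probably acceptable for account labels but may not be for free-text fields.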

mincong-h commented 1 year ago

OK, I modified the header locally.