Closed AlexTheWizardL closed 5 months ago
Thanks for reporting.
Here's the same example for direct copy-pasting into the Python console:
from io import StringIO
import pandas as pd
from cashctrl_ledger import nest
csv = """
date,account,counter_account,currency,amount,base_currency_amount,vat_code,text,document
2024-05-27,1020,1000,CHF,2.0,,VAT 2.6%,Single 1,
2024-05-27,1020,1000,CHF,20.0,,VAT 2.6%,single2,
2024-05-28,1000,,CHF,-5.0,,Test_VAT_code,,
,1020,,,5.0,,Test_VAT_code,,
,1000,,,-5.0,,Test_VAT_code,,
,1020,,,5.0,,Test_VAT_code,,
,1020,,,-10.0,,Test_VAT_code,collective333,
,1000,,,10.0,,Test_VAT_code,collective4444,
"""
df = pd.read_csv(StringIO(csv), skipinitialspace=True)
nested = nest(df, columns=[col for col in df.columns if col != 'date'], key='txn')
print(nested)
>>> ## date txn
>>> ## 0 2024-05-27 account counter_account currency amount ...
>>> ## 1 2024-05-28 account counter_account currency amount ...
Dropping NA grouping values is the default behaviour of df.groupby(), which does most of the actual work inside nest. We need to change to df.groupby(..., dropna=False).
See https://stackoverflow.com/questions/18429491/pandas-groupby-columns-with-nan-missing-values
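To illustrate the groupby behaviour in isolation, here is a minimal sketch that uses plain pandas only and does not call nest. The small frame and the variable names are made up for illustration; the only thing it relies on is the dropna parameter of DataFrame.groupby, available since pandas 1.1.

```python
import pandas as pd

# Hypothetical toy frame: two dated rows and two rows with a missing date.
df = pd.DataFrame({
    "date": ["2024-05-27", "2024-05-27", None, None],
    "amount": [2.0, 20.0, 5.0, -5.0],
})

# Default behaviour: rows whose grouping key is NA are silently dropped.
default_groups = df.groupby("date").size()

# With dropna=False the NA key forms its own group, so no rows are lost.
all_groups = df.groupby("date", dropna=False).size()

print(len(default_groups), int(default_groups.sum()))
print(len(all_groups), int(all_groups.sum()))
```

Note that with dropna=False all rows with a missing key end up in one shared NA group; whether that is the right grouping for nest is a separate question.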
The nest function doesn't recognize NA as a unique value and drops it.
[Image: DF example]
[Image: Code usage]
It will drop all rows that have date as NA, but it should identify those rows and create a nested df for them.