beancount / beangulp

Importers framework for Beancount
GNU General Public License v2.0

iconfig in CSV Importer categorizer call #103

Open Jorge1o1 opened 2 years ago

Jorge1o1 commented 2 years ago

In dd50f53 the row argument was added to the CSV Importer's categorizer function. With this second argument, it became very easy to add extra Postings onto an already created Transaction.

However, this way of adding Postings depends on knowing the column indices in advance. You have to know that the "payee" column is row[2], etc. This is especially problematic if the CSV files being imported don't have columns in a deterministic order (e.g. if they're created by another Python script) or if your bank/financial institution suddenly adds or removes a column.
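To make the fragility concrete, here is a minimal sketch using only the standard library. The two CSV snippets and the helper names are made up for illustration and are not part of beangulp. It assumes an institution that adds a leading column between two exports.

```python
import csv
import io

# Two hypothetical exports of the same transaction; the newer one gained
# a leading "id" column, which shifts every other column by one.
OLD_EXPORT = "date,amount,payee\n2021-01-05,-4.50,Coffee Shop\n"
NEW_EXPORT = "id,date,amount,payee\n17,2021-01-05,-4.50,Coffee Shop\n"


def payee_by_position(text, index=2):
    """Read the payee from a fixed column index (the fragile approach)."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[1][index]


def payee_by_header(text):
    """Resolve the payee column from the header row, then index the data row."""
    rows = list(csv.reader(io.StringIO(text)))
    columns = {name: i for i, name in enumerate(rows[0])}
    return rows[1][columns["payee"]]


old_by_position = payee_by_position(OLD_EXPORT)
new_by_position = payee_by_position(NEW_EXPORT)  # silently reads the amount
old_by_header = payee_by_header(OLD_EXPORT)
new_by_header = payee_by_header(NEW_EXPORT)
```

The positional lookup does not fail after the layout change; it just returns the wrong field, which is why a mapping from column type to index is useful inside the categorizer.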

With this PR, the iconfig dictionary that maps column type -> index would be included as a third parameter.

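from pprint import pformat
from beangulp.importers.csv import Col

def categorizer(txn, row, iconfig):
    # Look up the payee column by type instead of by a hard-coded index.
    txn = txn._replace(payee=row[iconfig[Col.PAYEE]])
    txn.meta['source'] = pformat(row)
    return txn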

Caveat: workarounds exist. I know that we can use partial functions or callable class instances to bind the iconfig dictionary to our categorizer callable ahead of time. But it chafes me to have to open the file twice, call normalize_config twice, etc.
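For reference, the partial-function workaround could look roughly like the sketch below. Txn and Col here are hypothetical stand-ins so the snippet runs without beangulp installed, and the iconfig dictionary is written by hand; with the real importer it would have to be computed ahead of time, which is the duplication complained about above.

```python
import enum
from collections import namedtuple
from functools import partial

# Hypothetical stand-ins for beancount's Transaction and the importer's Col
# enum, reduced to the fields this sketch needs.
Txn = namedtuple("Txn", "payee narration meta")


class Col(enum.Enum):
    PAYEE = "payee"
    NARRATION = "narration"


def categorizer(txn, row, iconfig):
    """Three-argument categorizer that looks columns up by type."""
    return txn._replace(payee=row[iconfig[Col.PAYEE]])


# Assumed mapping of column type -> index, built before the importer runs.
iconfig = {Col.NARRATION: 1, Col.PAYEE: 2}

# Bind iconfig so the result matches the existing two-argument (txn, row) call.
bound_categorizer = partial(categorizer, iconfig=iconfig)

row = ["2021-01-05", "Latte", "Coffee Shop"]
txn = bound_categorizer(Txn(payee=None, narration="Latte", meta={}), row)
```

A callable class instance that stores iconfig in its constructor and implements `__call__(self, txn, row)` would work the same way.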

dnicolodi commented 2 years ago

This CSV importer was born as an example with minimal functionality and has slowly grown into a piece of code that has more options than lines of code. It is kept in beangulp only to ease the upgrade from the old importers framework. It will definitely go away after the first release of beangulp. Users should look into the new beangulp.csvbase.Importer, which implements something similar with better interfaces, or use petl to read and manipulate the CSV and beangulp.petl_utils to turn a petl table into a list of directives. Since the intention is to direct users to better alternatives, I am not very keen on adding yet another kludge to this CSV importer.