tarioch / beancounttools

Beancount Tools
MIT License

Nordigen importer + smart importer give duplicate asset postings #101

Closed jbrok closed 1 year ago

jbrok commented 1 year ago

Hi, first of all, thanks for providing this great package. I've been moving all the family's small accounts to the Nordigen API and have built some additional scripts to automatically generate new links when an account link expires and to extend max_historical_days beyond 90 days, so I'm not in trouble if I forget to import data every 90 days. This has worked great.

However, there's a problem that I'm struggling to debug. Specifically, about 30% of the Nordigen transactions (this issue doesn't occur with csvs, xls, etc.) end up with three postings when using smart_importer. Here's an example:

2023-08-19 * "amazon.co.uk"
  creditorName: "Amazon.co.uk*1f37b5qz4"
  nordref: "64e135f0-75fa-XXXX-XXXXXX-XXXXXX"
  Expenses:Shopping
  Assets:Person1:Bank:Revolut:GBP <--- Randomly added
  Assets:Person2:Bank:Revolut:GBP   -5.99 GBP

It always seems to add an extra, seemingly random Assets: posting. While researching this a while ago I came across a smart_importer caching issue, but that issue has since been fixed.

My importer looks like this:

# Nordigen API
apply_hooks(nordigen.Importer(), [categories, PredictPostings(), DuplicateDetector(comparator=ReferenceDuplicatesComparator('nordref'), window_days=10)])

Removing PredictPostings() from here gives me the right results.

I call bean-extract like this:

# filter only .yaml files to debug the Nordigen issue
bean-extract config.py ./import-files/*.yaml -e main.beancount > tmp.beancount && code tmp.beancount 

For the last few months I've been removing the extra postings with a regex find-and-replace, but I recently found out that the problem also affects deduplication: the affected transactions are not deduplicated. I'm not sure whether this is caused by how the API calls are made or whether it's a smart_importer issue. I'm importing 5 accounts/references through the Nordigen importer.
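For reference, this kind of cleanup can be sketched in plain Python instead of an editor regex. This is only an illustration, not the exact find-and-replace used here: the function name `drop_extra_asset_postings` and the rule it applies are assumptions. The rule is that when a transaction has two or more postings without an amount, the amountless `Assets:` postings are dropped, as long as at least one amountless posting remains. It parses the text very loosely and does not use the Beancount parser.

```python
import re

# Loose patterns (assumptions, not the official Beancount grammar).
TXN_HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+(?:[*!]|txn)(?:\s|$)")
POSTING = re.compile(
    r"^\s+((?:Assets|Liabilities|Equity|Income|Expenses):\S+)(.*)$"
)


def _clean_block(block):
    """Drop amountless Assets: postings from one transaction's lines."""
    amountless = []
    for index, line in enumerate(block):
        match = POSTING.match(line)
        # Treat a posting as amountless if nothing before a ';' comment
        # contains a digit.
        if match and not re.search(r"\d", match.group(2).split(";")[0]):
            amountless.append((index, match.group(1)))
    if len(amountless) < 2:
        return block  # zero or one elided amount is valid, leave untouched
    drop = {i for i, account in amountless if account.startswith("Assets:")}
    if len(drop) == len(amountless):
        return block  # ambiguous case, do not guess
    return [line for i, line in enumerate(block) if i not in drop]


def drop_extra_asset_postings(ledger_text):
    """Return ledger_text with the extra amountless Assets: postings removed."""
    cleaned, block = [], []
    for line in ledger_text.splitlines():
        if TXN_HEADER.match(line):
            cleaned.extend(_clean_block(block))
            block = [line]
        elif block and (not line.strip() or line[0].isspace()):
            block.append(line)  # metadata, posting, or blank line
        else:
            cleaned.extend(_clean_block(block))
            block = []
            cleaned.append(line)
    cleaned.extend(_clean_block(block))
    return "\n".join(cleaned)
```

Applied to the contents of tmp.beancount, this would keep `Expenses:Shopping` and the `Assets:Person2:...` posting in the example above and drop the `Assets:Person1:...` line. It only hides the symptom, so the output still needs a manual review.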

Any ideas that can point me in the right direction to a solution? Much appreciated!

tarioch commented 1 year ago

Hi @jbrok, I'm also using this importer together with smart-importer and haven't noticed this behavior at all. Is it possible that you have real transactions involving this Person1 account in the mix as well, and smart-importer is now suggesting it? Smart importer may suggest more than one other account if that pattern occurs in its training data. I'm pretty sure this has nothing to do with the nordigen importer: either it's "caused" by the training data or there is something weird going on in smart-importer. It might be worth looking at some of the "old" transactions to see whether it could have picked this up from there; otherwise I would suggest raising this against the smart-importer project.
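One way to check the "old" transactions is to scan the existing ledger for entries that already contain postings to more than one Assets account, since those are the ones smart-importer could learn the extra posting from. The sketch below is a rough text scan under stated assumptions: the function name `find_multi_asset_transactions` is made up for illustration, the patterns are a loose approximation of the Beancount syntax, and it does not use the Beancount parser or smart-importer itself.

```python
import re

# Loose patterns (assumptions, not the official Beancount grammar).
TXN_HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+(?:[*!]|txn)(?:\s|$)")
POSTING = re.compile(
    r"^\s+(?:[*!]\s+)?((?:Assets|Liabilities|Equity|Income|Expenses):\S+)"
)


def find_multi_asset_transactions(ledger_text, prefix="Assets:"):
    """Return (header, accounts) for transactions that post to more than
    one distinct account starting with `prefix`."""
    matches = []
    header, accounts = None, []

    def flush():
        wanted = [a for a in accounts if a.startswith(prefix)]
        if header is not None and len(set(wanted)) > 1:
            matches.append((header, wanted))

    for line in ledger_text.splitlines():
        if TXN_HEADER.match(line):
            flush()
            header, accounts = line.strip(), []
        elif header is not None:
            posting = POSTING.match(line)
            if posting:
                accounts.append(posting.group(1))
            elif line and not line[0].isspace():
                # A non-indented line (another directive) ends the transaction.
                flush()
                header, accounts = None, []
    flush()
    return matches
```

Calling it on the text of main.beancount and printing each returned header gives a list of candidates to inspect by hand. Transfers between the two Revolut accounts would show up here, and those are plausible sources for the suggested second Assets posting.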