jbms / beancount-import

Web UI for semi-automatically importing external data into beancount
GNU General Public License v2.0

How does deduplication work? #20

Closed bobobo1618 closed 5 years ago

bobobo1618 commented 5 years ago

I have a simple issue. In my transactions.beancount (which is included by my journal), I have this:

2018-07-26 * "Narration"
  Assets:Cash:Bank1                               -1547.95 USD
  Assets:Cash:Bank2                                1547.95 USD

In my importer's output, I have this:

2018-07-26 * "Payee" "Narration"
  txn_id: "ID" 
  Assets:Cash:Bank1                               -1547.95 USD 
    source_desc: "Other bank name" 
  Expenses:FIXME                                   1547.95 USD

However, beancount-import doesn't suggest the existing journal transaction as a match for the imported one, so the duplicate isn't detected.

I had a look around the code to see if I could figure out how to fix it myself, but I couldn't find where deduplication is implemented. How can I debug this?

bobobo1618 commented 5 years ago

Possibly never mind: I found https://github.com/jbms/beancount-import/blob/0456402e58d526ea06b8f95d0fc0b70a1680f0a3/beancount_import/matching.py through the fuzzy_match_days flag. I'll close this for now and reopen it if I can't debug my problem.

bobobo1618 commented 5 years ago

For the record, it appears the problem was the "cleared" status of the transactions.

I hadn't read up on what exactly this meant and just added the method because I thought it couldn't hurt.
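
As a rough illustration of how a "cleared" check could hide an existing transaction from matching, here is a toy Python sketch. It is not beancount-import's actual code: `Posting`, `find_match_candidates`, and both predicates are hypothetical names made up for this example, and the assumption that cleared journal postings are skipped when looking for matches is inferred from this thread rather than confirmed against matching.py.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, Dict, List


@dataclass
class Posting:
    """Hypothetical stand-in for a journal posting (not beancount's type)."""
    account: str
    amount: Decimal
    currency: str
    meta: Dict[str, str] = field(default_factory=dict)


def find_match_candidates(
    journal_postings: List[Posting],
    imported: Posting,
    is_cleared: Callable[[Posting], bool],
) -> List[Posting]:
    """Return journal postings that could be merged with `imported`.

    Assumption: postings reported as cleared are treated as already
    accounted for and are never offered as match candidates.
    """
    return [
        p for p in journal_postings
        if not is_cleared(p)
        and p.account == imported.account
        and p.amount == imported.amount
        and p.currency == imported.currency
    ]


# The hand-written posting from the journal (no source metadata).
existing = Posting("Assets:Cash:Bank1", Decimal("-1547.95"), "USD")
# The posting produced by the importer.
imported = Posting("Assets:Cash:Bank1", Decimal("-1547.95"), "USD",
                   {"source_desc": "Other bank name"})


def always_cleared(posting: Posting) -> bool:
    # Over-eager check: claims every posting is cleared.
    return True


def cleared_if_has_source_desc(posting: Posting) -> bool:
    # Narrower check: only postings carrying importer metadata are cleared.
    return "source_desc" in posting.meta


hidden = find_match_candidates([existing], imported, always_cleared)
visible = find_match_candidates([existing], imported, cleared_if_has_source_desc)
print(len(hidden), len(visible))
```

Under these assumptions, the over-eager predicate leaves no candidates, so the imported transaction looks new, while the narrower predicate keeps the hand-written posting eligible for matching.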

addisonklinke commented 1 year ago

@bobobo1618 could you explain what you mean by the "cleared" status of the transaction - why does beancount-import not realize that the candidate transaction generated by the importer already exists in the journal?