jbms / beancount-import

Web UI for semi-automatically importing external data into beancount
GNU General Public License v2.0

beancount_import matching broken for `securities_and_cash` ofx account? #113

Open falsifian opened 3 years ago

falsifian commented 3 years ago

I just tried beancount-import with an investment account for the first time, and I'm having trouble getting it to match any of my existing transactions.

After some fiddling, I learned that if I take a transaction generated by beancount-import and manually delete the metadata, beancount-import no longer recognizes it as the same transaction and tries to generate it again.

(Some numbers below are replaced with XXX for privacy.)

As an example, with the below stripped-down beancount file and run_beancount_import.py, beancount-import generates the following new directives:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
    date: 2020-12-30
    ofx_fitid: "XXX"
    ofx_memo: "BANK MONTREAL QUEBEC"
    ofx_type: "SELLSTOCK"
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
    ofx_fitid: "XXX"
  Expenses:Fees               0.02 USD

2020-12-30 open Assets:Brokerage:BMO                            BMO

2020-12-30 open Income:Capital-gains:BMO                        USD

2020-12-30 open Assets:Brokerage:Cash                           USD

If I edit that transaction so that it instead reads:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
  Expenses:Fees               0.02 USD

then suddenly beancount-import won't match it any more, and wants to add another copy of the transaction. I was not able to get into a situation where beancount-import is willing to augment a transaction I've already entered by adding in the appropriate metadata.

Here is my main.beancount with account ID censored:

2000-01-01 open Income:Capital-gains
2000-01-01 open Income:Dividends
2000-01-01 open Income:Interest
2000-01-01 open Expenses:Fees

1792-04-02 commodity USD
  cusip: "9999101"

2000-01-01 open Assets:Brokerage
  ofx_org: ""
  ofx_broker_id: "Wells Fargo Advisors"
  ofx_account_type: "securities_and_cash"
  account_id: "XXX"
  capital_gains_account: Income:Capital-gains
  fees_account: Expenses:Fees
  div_income_account: Income:Dividends
  interest_income_account: Income:Interest

run_beancount_import.py contains:

import beancount_import.webserver

def run_reconcile():
    data_sources = [
        {
            "module": "beancount_import.source.ofx",
            "ofx_filenames": ("export.ofx",)
        },
    ]

    beancount_import.webserver.main(
        argv = (),
        journal_input = "main.beancount",
        ignored_journal = "main.beancount",
        default_output = "main.beancount",
        open_account_output = "main.beancount",
        balance_account_output = "main.beancount",
        data_sources = data_sources,
    )

if __name__ == "__main__":
    run_reconcile()

If any details from export.ofx would be useful, let me know.

falsifian commented 3 years ago

I should mention: I've been using beancount-import on my bank accounts without trouble for a while now. Matching seems to work fine.

m-d-brown commented 3 years ago

I'm also experiencing what I believe you're describing. When I import transactions that look like

2021-01-01 * "TRANSFER - PRETAX - Investment Expense"
  Assets:Vanguard:401k:PreTax:VGI007        -0.01 VGI007 {} @ 100.00 USD
    date: 2021-01-01
    ofx_fitid: "941930491093409139410394f"
    ofx_memo: "Investment Expense"
    ofx_type: "TRANSFER"
  Income:CapitalGains:Vanguard:401k:VGI007
  Expenses:Fees:Vanguard:401k                    1.00 USD

from the OFX importer and restart beancount-import, the same transaction is produced again as a duplicate, with an identical ofx_fitid, memo, and type.

jbms commented 3 years ago

I don't think pull request #126 addresses this same issue.

The issue here is due to how matching.py works (independent of OFX): matching is currently based on posting weight, but because the pending transaction is matched without booking, the weight is unknown for these two postings:

  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
  Income:Capital-gains:BMO

Consequently, those postings don't participate in matching at all at the moment. Additionally, the remaining postings:

  Assets:Brokerage:Cash     988.37 USD
  Expenses:Fees               0.02 USD

don't balance (since we are missing the weight from the unknown-weight postings), and matching currently attempts to find a set of match groups that balance each currency.
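To make the explanation above concrete, here is a minimal Python sketch of the situation. It is not the code in matching.py; the `Posting` class, the `cost_spec_empty` flag, and `posting_weight` are hypothetical names invented for illustration. It only shows that, before booking, two of the four postings have no usable weight and the other two leave a nonzero USD residual.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple


@dataclass
class Posting:
    # Hypothetical, simplified stand-in for a beancount posting.
    account: str
    units: Optional[Decimal]        # None when the amount is elided
    currency: Optional[str]
    cost_spec_empty: bool = False   # True for "{}": cost is filled in by booking


def posting_weight(p: Posting) -> Optional[Tuple[Decimal, str]]:
    """Return (number, currency), or None if the weight is unknown before booking."""
    if p.units is None or p.cost_spec_empty:
        return None
    return (p.units, p.currency)


postings = [
    Posting("Assets:Brokerage:BMO", Decimal("-13"), "BMO", cost_spec_empty=True),
    Posting("Income:Capital-gains:BMO", None, None),
    Posting("Assets:Brokerage:Cash", Decimal("988.37"), "USD"),
    Posting("Expenses:Fees", Decimal("0.02"), "USD"),
]

# Only postings with a known weight can take part in matching.
known = [w for w in map(posting_weight, postings) if w is not None]

# Sum the known weights per currency; a balanced group would sum to zero.
residual = {}
for number, currency in known:
    residual[currency] = residual.get(currency, Decimal(0)) + number

print(len(known))  # number of postings that can participate
print(residual)    # leftover per currency
```

In this sketch only the Cash and Fees postings have a weight, and together they leave 988.39 USD unaccounted for, so no set of match groups built from them alone can balance.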

To make this work the following improvement would be needed:

I'm looking into implementing that.

jbms commented 3 years ago

I pushed out a commit that solves part of the problem here.

However, it still does not actually make this case work. The problem is that matching is done against "partially booked" transactions, which still don't include resolved costs. Using fully-booked transactions is problematic because then the posting might actually have multiple lots and we will end up with multiple postings. Instead, I think the solution is to modify journal_editor to store the weight of each posting in the journal, which is already computed by booking, I believe. Then we can use that weight in matching.py, rather than deriving the weight from the posting itself (and not being able to determine the weight if the cost is not explicit).
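A rough Python sketch of the idea of storing a booked weight per posting follows. It is an assumption-laden illustration, not the journal_editor or matching.py implementation: `JournalPosting`, `booked_weight`, and `weights_balance` are hypothetical names, and the 70.00 USD per-share cost basis is invented purely so the numbers add up.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional, Tuple

Weight = Tuple[Decimal, str]


@dataclass
class JournalPosting:
    # Hypothetical record: the posting as written, plus the weight that
    # booking computed for it (None if no weight was recorded).
    account: str
    text: str
    booked_weight: Optional[Weight]


def weights_balance(postings: List[JournalPosting],
                    tolerance: Decimal = Decimal("0.005")) -> bool:
    """True if every posting has a stored weight and each currency sums to ~0."""
    totals: Dict[str, Decimal] = {}
    for p in postings:
        if p.booked_weight is None:
            return False
        number, currency = p.booked_weight
        totals[currency] = totals.get(currency, Decimal(0)) + number
    return all(abs(total) <= tolerance for total in totals.values())


# Assumed cost basis of 70.00 USD per share (illustrative only):
# -13 * 70.00 = -910.00 USD, so the gain is 988.39 - 910.00 = 78.39 USD.
entry = [
    JournalPosting("Assets:Brokerage:BMO", "-13 BMO {} @ 76.03 USD",
                   (Decimal("-910.00"), "USD")),
    JournalPosting("Income:Capital-gains:BMO", "",
                   (Decimal("-78.39"), "USD")),
    JournalPosting("Assets:Brokerage:Cash", "988.37 USD",
                   (Decimal("988.37"), "USD")),
    JournalPosting("Expenses:Fees", "0.02 USD",
                   (Decimal("0.02"), "USD")),
]

print(weights_balance(entry))
```

With a stored weight on every posting, all four postings could participate in matching and the group balances, without the matcher having to derive a weight from the posting text and without splitting the posting into multiple lots.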

m-d-brown commented 3 years ago

I've dereferenced PR #126 from this issue. Thanks for the explanations; I'll have to read more to better understand how matching works. Hopefully PR #126 is still worth submitting on its own.