redstreet / reds-ramblings-comments

personal-finance/transaction-builders/ #12

Open savingsandloan opened 2 years ago

savingsandloan commented 2 years ago

Transaction Builders — Red's Rants

Transaction builders specialize in putting together the set of postings for each entry. I’ve found three classes where this specialization is handy, discussed...

https://reds-rants.netlify.app/personal-finance/transaction-builders/

savingsandloan commented 2 years ago

Question - investments.py: How to approach support for multiple transfer accounts

Hi, I'm currently looking into migrating a Fidelity CSV importer to fit the conventions of beancount_reds_importers (reason: I find it easier to download CSV files than OFX, and the CSV covers some additional transaction cases).

One thing I've observed is that I have transfers between multiple other Fidelity accounts (e.g. between checking, brokerage, hsa) where seemingly the transfer config should be a dictionary akin to fund_info. Should I look into modifying transfer in investments.py or is there another option?

To have a rough example, in my current personal importer I have a dictionary for transfer accounts where the key is a piece of identifying data like:

'transfer_accounts' : {"X99999991": "Assets:Fidelity:Brokerage", "299999996": "Assets:HSA:Fidelity"},

The identifying data in this case is found via parsing the line in the csv file.
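
A minimal sketch of that lookup, assuming the identifying account number appears somewhere in a description column of the CSV row. The helper name find_transfer_account and the sample description text are hypothetical, and none of this is part of beancount_reds_importers:

```python
# Hypothetical helper, not part of beancount_reds_importers.
TRANSFER_ACCOUNTS = {
    "X99999991": "Assets:Fidelity:Brokerage",
    "299999996": "Assets:HSA:Fidelity",
}


def find_transfer_account(description, transfer_accounts, default=None):
    """Return the account whose identifier occurs in the description."""
    for account_id, account in transfer_accounts.items():
        if account_id in description:
            return account
    return default


# The description text is made up; real Fidelity rows may look different.
print(find_transfer_account("TRANSFERRED FROM X99999991", TRANSFER_ACCOUNTS))
```

Passing a default keeps unmatched rows from failing outright; they can then be reviewed by hand.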

redstreet commented 2 years ago

Any transfer is going to show up on two accounts. I.e., there is going to be a transaction in your checking account showing a received transfer, and one more transaction in your brokerage account showing the sent transfer for the same transfer.

This needs to be deduped. Deduping transfer transactions is a common pattern.

So the answer to your question depends on how you dedupe. For myself, rather than solving each transfer case in each account individually, usually with code, I simply have all the transfers point to an intermediate transfer account. This actually simplifies things, while buying me reconciliation (of both of the imported transfer transactions) at the same time. See these for more: Deduping, Zerosum, Zerosum README with example.

If you do want to have custom transfer accounts like you showed above, then logic is needed. You can code that in generate_transfer_entry in investments.py. But I would suggest examining exactly what this complexity is buying you and whether it is worth it.

savingsandloan commented 2 years ago

Thanks for the clarifications so far. Will read your posts and continue learning about how you organized the code.

Good point on deduping. So far my deduping pattern has generally assumed a convention where, in most cases, incoming transfers for a given account importer were generated as commented-out transactions by setting meta['__duplicate__'] = True before passing the meta dictionary into data.Transaction. This was a hardcoded convention in my personal importers, but I have been starting to rethink things as I try to develop importers with your ideas and open-sourcing in mind.

"But I would suggest examining exactly what this complexity is buying you and whether it is worth it."

Fair point. At the moment it's coming up just from realizing that I end up with these transfers involving multiple accounts, but I will note this is only happening on one account importer so far, so it may be better to handle it by hand or to set the transfer account to a stub like "asset:fidelity:transferTODO".

redstreet commented 2 years ago

Your approach of commenting out one side is one way to do it, but it runs into exactly what you ran into: determining the target account is messy, based on a set of rules that are prone to breakage. Every time you open a new real world account, you would have to specify the new set of transfer rules for that account. Every time you make a type of transfer that you had not anticipated, you will have to update your rules. And so forth. Just some things I ran into repeatedly, that I'm mentioning for your consideration as you design a system that works for you.

If you simply set the transfer account on both sides to asset:fidelity:transferTODO, and don't comment out either side, they will neatly cancel each other out. Handling by hand won't be necessary with this approach.
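
To see why the two sides cancel, here is a small sketch that sums postings per account. It uses plain tuples rather than Beancount objects, and the amounts and the checking/brokerage account names are made up for illustration:

```python
from collections import defaultdict
from decimal import Decimal

TRANSFER = "asset:fidelity:transferTODO"  # the stub account named above

# One imported transaction per side; each posting is (account, amount in USD).
checking_side = [
    ("Assets:Fidelity:Checking", Decimal("-500.00")),
    (TRANSFER, Decimal("500.00")),
]
brokerage_side = [
    ("Assets:Fidelity:Brokerage", Decimal("500.00")),
    (TRANSFER, Decimal("-500.00")),
]


def balances(*transactions):
    """Sum the posting amounts per account across all transactions."""
    totals = defaultdict(Decimal)
    for postings in transactions:
        for account, amount in postings:
            totals[account] += amount
    return dict(totals)


result = balances(checking_side, brokerage_side)
print(result[TRANSFER])  # nets to zero once both sides are imported
```

A non-zero balance on the transfer account then points at a transfer whose other half has not been imported yet.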

savingsandloan commented 2 years ago

"Every time you open a new real world account, you would have to specify the new set of transfer rules for that account." Fair point, I've been thinking about this more and still keep coming back to:

With that in mind do you have any opposition to having an optional transfer_info argument for individual transfers in addition to the transfer argument serving as a default value? Or merge it all in one thing with a default_transfer key/value? i.e. one could have

transfer_info = {
    'transfer_accounts': (
        'X99999999',
        'Assets:Fidelity:Checking',
        TransferAccountDedupeStyle.COMMENT_INCOMING_TRANSACTIONS,
    ),
}

and in libtransactionbuilder/common.py perhaps an enum for the dedupe style

    from enum import Enum

    class TransferAccountDedupeStyle(Enum):
        COMMENT_INCOMING_TRANSACTIONS = 1
        COMMENT_OUTGOING_TRANSACTIONS = 2
        COMMENT_ALL_TRANSACTIONS = 3
        COMMENT_NO_TRANSACTIONS = 4

Again, this would be designed to be an optional complement to having a default transfer account that would get inserted without any configured dedupe handling.
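
A rough sketch of how the default and the per-account overrides could fit together. The config shape here (a transfer_info dictionary keyed by account number), the resolve_transfer helper, and the default account name are all assumptions for illustration, not existing library behavior:

```python
from enum import Enum


class TransferAccountDedupeStyle(Enum):
    COMMENT_INCOMING_TRANSACTIONS = 1
    COMMENT_OUTGOING_TRANSACTIONS = 2
    COMMENT_ALL_TRANSACTIONS = 3
    COMMENT_NO_TRANSACTIONS = 4


config = {
    # Default transfer account, used without any dedupe handling.
    # The account name is a placeholder.
    "transfer": "Assets:Zero-Sum-Accounts:Transfers",
    # Optional overrides, keyed by the identifying account number.
    "transfer_info": {
        "X99999999": (
            "Assets:Fidelity:Checking",
            TransferAccountDedupeStyle.COMMENT_INCOMING_TRANSACTIONS,
        ),
    },
}


def resolve_transfer(config, account_id):
    """Return (account, dedupe_style), falling back to the default account."""
    default = (config["transfer"], TransferAccountDedupeStyle.COMMENT_NO_TRANSACTIONS)
    return config.get("transfer_info", {}).get(account_id, default)


print(resolve_transfer(config, "X99999999"))
print(resolve_transfer(config, "unknown"))
```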

Second question...

investments.py: How to approach support for bond purchases

The Fidelity CSV file format includes some additional information that can be parsed to provide metadata for bond purchases, such as bond term yield, annual yield, expiration date, and duration. Is investments.py the appropriate place to add support for these things, assuming a given account importer does the initial parsing and column preparation to include this data?

redstreet commented 2 years ago

  1. The functionality you are asking for already exists:

    • You can pass in arbitrary structures into the config dictionary
    • You can override get_target_acct_custom to determine the other side of the transfer (from which you can access self.config)
    • You can override skip_transactions if needed. I'd be open to a PR adding a comment_out_transaction method if there is some clear value to it. IIRC, bean-extract will kick out commented transactions anyway, so I'm not sure if this is useful, though I might not be remembering correctly.
  2. Agree, better metadata inclusion would be valuable. Metadata should ideally come from a build_meta() method that is overridable in individual importers. Modifications to all the transaction builders (investment, banking, paycheck) to support this would be welcome. I'd imagine it's just a couple lines.
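
To make both points concrete, here is a self-contained sketch. The base class is a stand-in, NOT the library's importer, and the signature of get_target_acct_custom is an assumption to check against investments.py. build_meta() is only the proposed hook and does not exist yet. The memo and coupon_rate fields are hypothetical:

```python
from types import SimpleNamespace


class InvestmentImporterStandIn:
    """Stand-in for the library's investments importer (not the real class)."""

    def __init__(self, config):
        self.config = config

    def get_target_acct_custom(self, transaction, ticker=None):
        return None  # no custom target by default (assumed signature)

    def build_meta(self, transaction):
        return {}  # proposed hook; hypothetical


class MyFidelityImporter(InvestmentImporterStandIn):
    def get_target_acct_custom(self, transaction, ticker=None):
        # Use the arbitrary structure stored in self.config to pick the
        # other side of the transfer.
        accounts = self.config.get("transfer_accounts", {})
        for account_id, account in accounts.items():
            if account_id in transaction.memo:
                return account
        return None

    def build_meta(self, transaction):
        # Attach bond metadata only when the reader supplied it.
        meta = {}
        coupon = getattr(transaction, "coupon_rate", None)
        if coupon is not None:
            meta["bond-coupon-rate"] = coupon
        return meta


importer = MyFidelityImporter(
    {"transfer_accounts": {"X99999991": "Assets:Fidelity:Brokerage"}}
)
ot = SimpleNamespace(memo="TRANSFER X99999991", coupon_rate="2.75000")
print(importer.get_target_acct_custom(ot), importer.build_meta(ot))
```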

Does that help?

redstreet commented 2 years ago

Before you get too deep into solving transfers, might I suggest considering that Beancount v3 approaches these far better and solves a bunch of these problems neatly, so your efforts here might not port over.

See the deduping article:

Deduping Transactions with Counterparts in Other Accounts [...] Beancount v3 has a proposal to render all this moot by allowing the two halves to be declared separately.

savingsandloan commented 2 years ago

Ah, thanks. Didn't know about get_target_acct_custom and will think about build_meta(). Still a ways to go in migrating my importer over.

redstreet commented 2 years ago

"I've had a personal concern that having a zerosum account in my setup may impact future visualization/charting of account transfers."

Btw, I'm curious about the above. Would you mind expanding on it so I can understand better?

redstreet commented 2 years ago

"Still a ways to go in migrating my importer over."

The first one, of course, will involve the most effort. Subsequent ones should be far easier. If not, do let me know. Good luck!

savingsandloan commented 2 years ago

"I've had a personal concern that having a zerosum account in my setup may impact future visualization/charting of account transfers."

"Btw, I'm curious about the above. Would you mind expanding on it so I can understand better?"

Oh, part of why I've been attracted to double-entry accounting is that in theory one could visualize a more accurate animated scrub-over-time of accounts, sort of like how things like https://gource.io/ work. Another example, though it has nothing to do with transactional entries, might be https://www.chartfleau.com/ . For personal accounting, it may not be practical compared to a simple stacked bar chart, but it could be useful for more complex accounting (e.g. organizational) scenarios.

In such visualizations, I assume the 'pont' matters as much as the account points, hence my concern around having intermediary accounts to settle transfers.

savingsandloan commented 2 years ago

Another question (related more to bond purchases but could relate to other areas), in addition to generating bond-related metadata I had also generated commodity info. So, for a transaction like

2020-01-01 * " YOU BOUGHT -  UNITED STATES TREAS NTS NOTE 2.75000% 01/01/2020"
  bond-accrued-interest: 0.23
  bond-exp-date: 2022-01-01
  bond-coupon-rate: 2.75000
  bond-term-months: 24
  Assets:Fidelity:Brokerage:Cash             -998.66 USD
  Assets:Fidelity:Brokerage:USTBill:CUSIP9999999B2  1000 CUSIP9999999B2 {0.9985 USD}
  Equity:Fidelity:Brokerage:RoundingError        0.16 USD

It would also generate this:

2020-01-01 open Income:Fidelity:Brokerage:USTBill:CUSIP9999999B2
2020-01-01 open Assets:Fidelity:Brokerage:USTBill:CUSIP9999999B2
2020-01-01 commodity CUSIP9999999B2
  name: "US Treasury Note 2-year - CUSIP 9999999B2"
  other_meta_info: "asset type, asset_allocation etc."

Largely, having the commodity auto-generated with more descriptive info in the name and other metadata saved me the time of manually figuring this out afterward. Any idea how to approach generating at least the commodity entry within a custom importer while still following the conventions of beancount_reds_importers? It feels like a question akin to needing a build_meta function.

savingsandloan commented 2 years ago

Just to answer my own question above, I realized we can just add something along the lines of includes_commodities and includes_accounts with a similar pattern to how extract_balances_and_prices is called depending on includes_balances.

Won't lead to items being grouped together in the file itself but that's okay.
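
A minimal sketch of that flag pattern. The class is a stand-in that returns strings instead of Beancount directives. includes_commodities, includes_accounts, extract_commodities, and extract_accounts are the hypothetical additions, and the way extract() combines them is an assumption about how the library is laid out:

```python
class ImporterSketch:
    """Stand-in importer; entries are plain strings for illustration."""

    includes_balances = True
    includes_commodities = True  # hypothetical flag
    includes_accounts = False    # hypothetical flag

    def extract_transactions(self):
        return ["transaction"]

    def extract_balances_and_prices(self):
        return ["balance"]

    def extract_commodities(self):  # hypothetical method
        return ["commodity"]

    def extract_accounts(self):  # hypothetical method
        return ["open"]

    def extract(self):
        # Each optional group is appended only if its flag is set, so the
        # groups appear after the transactions rather than next to them.
        entries = list(self.extract_transactions())
        if self.includes_balances:
            entries += self.extract_balances_and_prices()
        if self.includes_commodities:
            entries += self.extract_commodities()
        if self.includes_accounts:
            entries += self.extract_accounts()
        return entries


print(ImporterSketch().extract())
```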

redstreet commented 2 years ago

Missed seeing your question, sorry! Yes, absolutely, that's the way I would do it. Also:

savingsandloan commented 2 years ago

Hi, I went quiet but have been spending a bunch of time on this. I'm doing some more ETL to resolve issues, and was wondering if there's an explanation of the conventions you have around the ot.type values and how you see the differences between amount/units/total.

Particularly around the lines in investments.py where for generate_transfer_entry there's:

        try:
            if ot.type in ['transfer']:
                units = ot.units
            elif ot.type in ['other', 'credit', 'debit', 'dep', 'cash']:
                units = ot.amount
            else:
                units = ot.total

Does this imply transfer are for asset-transfers only (i.e. not cash) then? What exactly do the dep and cash types mean? (EDIT: okay, I realized dep means deposit and I'm assuming cash is for cash transfers; I was just thrown off a bit by transfer implying investment shares.)

And what is the difference between amount and total? I know that generate_trade_entry uses total rather than amount.
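
To try out which field each branch picks, here is a standalone rendering of the excerpt above. A namedtuple stands in for the reader's transaction object, the function name pick_units is made up, and the sample values are invented:

```python
from collections import namedtuple

# Stand-in for the transaction object; field names follow the excerpt.
Ot = namedtuple("Ot", "type units amount total")


def pick_units(ot):
    """Mirror the branching in the excerpt from generate_transfer_entry."""
    if ot.type in ["transfer"]:
        return ot.units
    elif ot.type in ["other", "credit", "debit", "dep", "cash"]:
        return ot.amount
    else:
        return ot.total


for sample in (
    Ot("transfer", 10, None, None),   # picks units
    Ot("dep", None, 25, None),        # picks amount
    Ot("buymf", 150, None, -150),     # anything else picks total
):
    print(sample.type, pick_units(sample))
```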

redstreet commented 2 years ago

"Does this imply transfer are for asset-transfers only (i.e. not cash) then?"

Correct.

"And what is the difference between amount and total? I know that generate_trade_entry uses total rather than amount."

I don't remember for sure off the top of my head. I believe total is primarily for transactions involving non-cash, and amount for those involving cash. The OFX spec is here. However, I've found that banks make deviations, at least in my reading. The code reflects what I've found to work well with most OFX files that I download. Do let me know if you see something amiss.

savingsandloan commented 2 years ago

Thanks, I'll roll with it for now. The only change I think I'm skating towards is breaking out the transaction mapping (rdr = rdr.convert('type', self.transaction_type_map) within convert_columns in csvreader.py) into another method (that can be overridden), as I'm reaching a point where I need more information than just what's in the type column to assign the appropriate action.
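
A sketch of what that could look like, using plain dictionaries for rows instead of the table object that csvreader.py works with. The method name map_transaction_type, the memo column, and the type strings are assumptions for illustration:

```python
class CsvReaderSketch:
    """Stand-in reader; rows are dicts rather than the library's table."""

    transaction_type_map = {
        "YOU BOUGHT": "buystock",
        "DIVIDEND RECEIVED": "dividends",
    }

    def map_transaction_type(self, row):
        """Default: look only at the type column. Override for more."""
        return self.transaction_type_map.get(row["type"], row["type"])

    def convert_columns(self, rows):
        # Every row goes through the overridable hook.
        return [dict(row, type=self.map_transaction_type(row)) for row in rows]


class BondAwareReader(CsvReaderSketch):
    def map_transaction_type(self, row):
        # Uses a second column to tell bond purchases from stock purchases.
        if row["type"] == "YOU BOUGHT" and "TREAS" in row.get("memo", ""):
            return "buydebt"
        return super().map_transaction_type(row)


rows = [
    {"type": "YOU BOUGHT", "memo": "UNITED STATES TREAS NTS"},
    {"type": "DIVIDEND RECEIVED", "memo": ""},
]
print([r["type"] for r in BondAwareReader().convert_columns(rows)])
```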

redstreet commented 2 years ago

Sure, that sounds like a good idea. Feel free to send a PR.

redstreet commented 2 years ago

@savingsandloan writes: I tend to see Vanguard transaction events where two dividends accumulate and sweep into a money market account.

i.e. let's say you have two dividend transactions:

<INCOME><INVTRAN>
<FITID>123888456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>DIVIDEND PAYMENTDIVIDEND PAYMENT</INVTRAN>
<SECID><UNIQUEID>92206C300<UNIQUEIDTYPE>CUSIP</SECID>
<INCOMETYPE>DIV<TOTAL>50.00<SUBACCTSEC>CASH<SUBACCTFUND>CASH</INCOME>
<INCOME><INVTRAN>
<FITID>123777456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>DIVIDEND PAYMENTDIVIDEND PAYMENT</INVTRAN>
<SECID><UNIQUEID>92206C821<UNIQUEIDTYPE>CUSIP</SECID>
<INCOMETYPE>DIV<TOTAL>100.00<SUBACCTSEC>CASH<SUBACCTFUND>CASH</INCOME>

These pool together to result in this purchase of a money-market asset:

<BUYTYPE>BUY</BUYMF><BUYMF><INVBUY><INVTRAN>
<FITID>123999456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>MONEY FUND PURCHASE</INVTRAN>
<SECID><UNIQUEID>922906300<UNIQUEIDTYPE>CUSIP</SECID>
<UNITS>150.00<UNITPRICE>1.0<TOTAL>-150.00<SUBACCTSEC>CASH<SUBACCTFUND>

Currently it results in:

2021-03-01 * "MONEY FUND PURCHASE" "[VMFXX] Vanguard Federal Money Market Fund"
  file_account: "Assets:Vanguard:Brokerage"
  Assets:Vanguard:Brokerage:VMFXX   150.00 VMFXX {1.0 USD}
  Assets:Vanguard:Brokerage:USD    -150.00 USD            

2021-03-01 * "DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VSBSX] Vanguard Short-Term Treasury Index - Admiral Shares"
  Assets:Vanguard:Brokerage:USD               50.00 USD
  Income:Vanguard:Brokerage:VSBSX:Dividends  -50.00 USD

2021-03-01 * "DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VLGSX] Vanguard Long-Term Treasury Index - Admiral Shares"
  Assets:Vanguard:Brokerage:USD               100.00 USD
  Income:Vanguard:Brokerage:VLGSX:Dividends  -100.00 USD

This is functional, but it feels rather odd to have three transactions, and it requires an imaginary account, Assets:Vanguard:Brokerage:USD, for everything to work. Is it possible for the importer to be a bit smart and glob together these events based on the fact that they all have the same timestamp (20210301160000.000)? The FITIDs are also similar-ish (the first three and last three digits are usually the same between these events, as shown above). Ideally, I'm thinking the generated result should look like:

2021-03-01 * "MONEY FUND PURCHASE & DIVIDEND PAYMENTDIVIDEND PAYMENT & DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VMFXX] Vanguard Federal Money Market Fund & [VSBSX] Vanguard Short-Term Treasury Index - Admiral Shares & [VLGSX] Vanguard Long-Term Treasury Index - Admiral Shares"
  file_account: "Assets:Vanguard:Brokerage"
  Assets:Vanguard:Brokerage:VMFXX   150.00 VMFXX {1.0 USD}
  Income:Vanguard:Brokerage:VSBSX:Dividends  -50.00 USD
  Income:Vanguard:Brokerage:VLGSX:Dividends  -100.00 USD

Feel free to close this idea if it sounds too risky/complicated, just trying to brainstorm anything that comes to mind.

redstreet commented 2 years ago

Hello again! Good question, this is definitely something I considered, and for a while had code to handle it. However, I found that I didn't get anything out of doing this. In general, this falls under the category of drawing "higher level" inferences based on a set of rules. I found this to break rather easily because:

    • there are always cases one hasn't run into yet and are therefore not encoded as rules in the code
    • institutions make surprising changes occasionally, which breaks code
    • rules vary across institutions, making it a pain to maintain these rules

The question I'd go back to is: is there a true benefit to drawing these inferences? The source looks a bit better, but I rarely look at my source or even journal for investments (I do, for expenses); my view of my transactions is either through BQL or fava, and there, I'm looking at aggregates and queries (for investments), which are all agnostic to how the source looks in these cases.

That said, having a post_process() API which calls a user function at the end of extract() would allow each user to write a few lines of code to maintain their own "inference rules" such as this one, if so desired, and I'd be very open to creating that (should be simple).
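
As a rough illustration of such a user function, the sketch below merges entries that share a timestamp when their cash legs net to zero. Entries are plain dictionaries rather than Beancount directives, and the post_process signature is an assumption since the API does not exist yet. The sample data mirrors the Vanguard example above:

```python
from collections import defaultdict
from decimal import Decimal


def post_process(entries, cash_account):
    """Merge same-timestamp entries whose cash postings cancel out."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["timestamp"]].append(entry)

    merged = []
    for timestamp, group in groups.items():
        postings = [p for entry in group for p in entry["postings"]]
        # Assumes all cash postings are in the same currency.
        cash_total = sum(
            (p[1] for p in postings if p[0] == cash_account), Decimal(0)
        )
        if len(group) > 1 and cash_total == 0:
            merged.append({
                "timestamp": timestamp,
                "narration": " & ".join(e["narration"] for e in group),
                "postings": [p for p in postings if p[0] != cash_account],
            })
        else:
            merged.extend(group)  # leave anything ambiguous untouched
    return merged


CASH = "Assets:Vanguard:Brokerage:USD"
TS = "20210301160000.000"
entries = [
    {"timestamp": TS, "narration": "MONEY FUND PURCHASE", "postings": [
        ("Assets:Vanguard:Brokerage:VMFXX", Decimal("150.00"), "VMFXX"),
        (CASH, Decimal("-150.00"), "USD")]},
    {"timestamp": TS, "narration": "DIVIDEND PAYMENT", "postings": [
        (CASH, Decimal("50.00"), "USD"),
        ("Income:Vanguard:Brokerage:VSBSX:Dividends", Decimal("-50.00"), "USD")]},
    {"timestamp": TS, "narration": "DIVIDEND PAYMENT", "postings": [
        (CASH, Decimal("100.00"), "USD"),
        ("Income:Vanguard:Brokerage:VLGSX:Dividends", Decimal("-100.00"), "USD")]},
]
result = post_process(entries, CASH)
print(len(result), [p[0] for p in result[0]["postings"]])
```

Entries that do not net to zero are passed through unchanged, which keeps the rule from guessing in cases it was not written for.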