Open savingsandloan opened 2 years ago
Hi, I'm currently looking into migrating a Fidelity CSV importer to fit the conventions of beancount_reds_importers (reason: I find it easier to download CSV files than OFX, and CSV lets me cover some other transaction cases).
One thing I've observed is that I have transfers between multiple other Fidelity accounts (e.g. between checking, brokerage, hsa) where seemingly the transfer config should be a dictionary akin to fund_info. Should I look into modifying transfer in investments.py or is there another option?
To have a rough example, in my current personal importer I have a dictionary for transfer accounts where the key is a piece of identifying data like:
'transfer_accounts' : {"X99999991": "Assets:Fidelity:Brokerage", "299999996": "Assets:HSA:Fidelity"},
The identifying data in this case is found via parsing the line in the csv file.
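To make the idea concrete, here is a minimal sketch of that lookup using only the standard library. The sample CSV rows, the regular expression for the account ID, and the fallback account name are all assumptions for illustration; real Fidelity exports may look different.

```python
import csv
import io
import re

# Mapping copied from the 'transfer_accounts' example above.
TRANSFER_ACCOUNTS = {
    "X99999991": "Assets:Fidelity:Brokerage",
    "299999996": "Assets:HSA:Fidelity",
}
# Assumed fallback account name, used when no known ID is found.
DEFAULT_TRANSFER = "Assets:Zero-Sum-Accounts:Transfers"

# Assumed ID shape: one letter or digit followed by eight digits.
ACCOUNT_ID = re.compile(r"\b([A-Z0-9]\d{8})\b")


def transfer_account_for(description):
    """Return the mapped account for the first known ID in the description."""
    for candidate in ACCOUNT_ID.findall(description):
        if candidate in TRANSFER_ACCOUNTS:
            return TRANSFER_ACCOUNTS[candidate]
    return DEFAULT_TRANSFER


# Made-up rows; the column names are hypothetical.
SAMPLE = """date,description,amount
2021-03-01,TRANSFERRED TO VS X99999991,-100.00
2021-03-02,TRANSFERRED FROM VS 299999996,25.00
2021-03-03,TRANSFERRED TO VS Z11111111,-5.00
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
accounts = [transfer_account_for(row["description"]) for row in rows]
```

The third row carries an ID that is not in the dictionary, so it falls back to the default account rather than failing.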
Any transfer is going to show up on two accounts. I.e., there is going to be a transaction in your checking account showing a received transfer, and one more transaction in your brokerage account showing the sent transfer for the same transfer.
This needs to be deduped. Deduping transfer transactions is a common pattern.
So the answer to your question depends on how you dedupe. For myself, I find that rather than solving each transfer case in each account individually, usually with code, I simply have all the transfers point to an intermediate transfer account. This actually simplifies things, while buying me reconciliation (of both of the imported transfer transactions) at the same time. See these for more: Deduping, Zerosum, Zerosum README with example.
If you do want to have custom transfer accounts like you showed above, then logic is needed. You can code that in generate_transfer_entry in investments.py. But I would suggest examining exactly what this complexity is buying you and whether it is worth it.
Thanks for the clarifications so far. Will read your posts and continue learning about how you organized the code.
Good point on deduping. So far my deduping pattern has generally assumed a convention where, in most cases, incoming transfers for a given account importer were generated as commented-out transactions by setting meta['__duplicate__'] = True before passing the meta dictionary into data.Transaction. This was a hardcoded convention in my personal importers, but I have been starting to rethink things as I try to develop importers with your ideas and open-sourcing in mind.
"But I would suggest examining exactly what this complexity is buying you and whether it is worth it."
Fair point. At the moment it's coming up because I realized that I end up with these transfers involving multiple accounts, but I'll note this is only happening in one account importer at the moment, so it may be better to handle it by hand or to set the transfer account to a stub like "asset:fidelity:transferTODO".
Your approach of commenting out one side is one way to do it, but it runs into exactly what you ran into: determining the target account is messy, based on a set of rules that are prone to breakage. Every time you open a new real world account, you would have to specify the new set of transfer rules for that account. Every time you make a type of transfer that you had not anticipated, you will have to update your rules. And so forth. Just some things I ran into repeatedly, that I'm mentioning for your consideration as you design a system that works for you.
If you simply set the transfer account on both sides to asset:fidelity:transferTODO, and don't comment out either side, they will neatly cancel each other out. Handling by hand won't be necessary with this approach.
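A small sketch of why the two halves cancel out, assuming each importer emits its own half of the transfer and both use the same stub account. The amounts are made up, and the stub name is capitalized here only so it resembles a valid account name.

```python
from collections import defaultdict
from decimal import Decimal

# Stub transfer account, adapted from the name used in the discussion.
TRANSFER = "Assets:Fidelity:TransferTODO"

# Each importer produces one half of the same transfer as (account, amount) postings.
checking_side = [
    ("Assets:Fidelity:Checking", Decimal("-500.00")),
    (TRANSFER, Decimal("500.00")),
]
brokerage_side = [
    ("Assets:Fidelity:Brokerage", Decimal("500.00")),
    (TRANSFER, Decimal("-500.00")),
]

# Sum every posting per account, as a balance report would.
balances = defaultdict(Decimal)
for account, amount in checking_side + brokerage_side:
    balances[account] += amount

transfer_balance = balances[TRANSFER]
```

A nonzero balance left in the stub account would point at a transfer whose other half has not been imported yet.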
"Every time you open a new real world account, you would have to specify the new set of transfer rules for that account." Fair point, I've been thinking about this more and still keep coming back to:
the fund_info argument, which users already have to update every time they invest in a new asset. With that in mind, do you have any opposition to having an optional transfer_info argument for individual transfers, in addition to the transfer argument serving as a default value? Or merging it all into one thing with a default_transfer key/value? i.e. one could have
transfer_info = {
    'transfer_accounts': ('X99999999',
                          'Assets:Fidelity:Checking',
                          TransferAccountDedupeStyle.COMMENT_INCOMING_TRANSACTIONS),
}
and in libtransactionbuilder/common.py perhaps an enum for the dedupe style
from enum import Enum

class TransferAccountDedupeStyle(Enum):
    # Which half (or halves) of a transfer gets commented out on import
    COMMENT_INCOMING_TRANSACTIONS = 1
    COMMENT_OUTGOING_TRANSACTIONS = 2
    COMMENT_ALL_TRANSACTIONS = 3
    COMMENT_NO_TRANSACTIONS = 4
Again, this would be designed to be an optional complement to having a default transfer account that would get inserted without any configured dedupe handling.
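As a rough sketch of how such a style could be applied, the helper below decides whether one half of a transfer would be commented out. The should_comment_out function is hypothetical, and it assumes a positive amount means money arriving in the account being imported.

```python
from enum import Enum


class TransferAccountDedupeStyle(Enum):
    COMMENT_INCOMING_TRANSACTIONS = 1
    COMMENT_OUTGOING_TRANSACTIONS = 2
    COMMENT_ALL_TRANSACTIONS = 3
    COMMENT_NO_TRANSACTIONS = 4


def should_comment_out(style, amount):
    """Hypothetical helper: True if this half of the transfer is commented out.

    Assumes amount > 0 means the transfer is incoming for this account.
    """
    incoming = amount > 0
    if style is TransferAccountDedupeStyle.COMMENT_ALL_TRANSACTIONS:
        return True
    if style is TransferAccountDedupeStyle.COMMENT_NO_TRANSACTIONS:
        return False
    if style is TransferAccountDedupeStyle.COMMENT_INCOMING_TRANSACTIONS:
        return incoming
    return not incoming  # COMMENT_OUTGOING_TRANSACTIONS
```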
Second question...
The fidelity CSV file format includes some additional information that can be parsed for sake of having metadata for bond purchases - such as bond term yield, annual yield, expiration date, and duration. Is investments.py the appropriate place to modify how to include supporting these things, assuming a given account importer does the initial parsing & column preparation to include this data?
The functionality you are asking for already exists:
config dictionary
get_target_acct_custom, to determine the other side of the transfer (from which you can access self.config)
skip_transactions, if needed
I'd be open to a PR for a comment_out_transaction method if there is some clear value to it. IIRC, bean-extract will kick out commented transactions anyway, so I'm not sure if this is useful, though I might not be remembering correctly.
Agree, better metadata inclusion would be valuable. Metadata should ideally come from a build_meta() method that is overridable in individual importers. Modifications to all the transaction builders (investment, banking, paycheck) to support this would be welcome. I'd imagine it's just a couple of lines.
Does that help?
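For illustration, overriding these two hooks might look roughly like the sketch below. ImporterBase is a stand-in written for this example, not the library's real class, and the method signatures, the 'transfer_accounts' config key, and the 'counterparty' field are all assumptions. build_meta() is the hook proposed above and may not exist yet.

```python
class ImporterBase:
    """Stand-in for a library importer class; NOT the real base class."""

    def __init__(self, config):
        self.config = config

    def get_target_acct_custom(self, transaction):
        return None  # assumed default: no custom target account

    def build_meta(self, transaction):
        return {}  # hypothetical hook, as proposed in this thread

    def target_account(self, transaction):
        # Fall back to the single default 'transfer' account from config.
        return self.get_target_acct_custom(transaction) or self.config["transfer"]


class FidelityImporter(ImporterBase):
    def get_target_acct_custom(self, transaction):
        # 'transfer_accounts' is a user-defined config key, not a library setting.
        mapping = self.config.get("transfer_accounts", {})
        return mapping.get(transaction.get("counterparty"))

    def build_meta(self, transaction):
        # Copy only the bond-related fields that were parsed from the CSV.
        keys = ("bond-exp-date", "bond-coupon-rate")
        return {key: transaction[key] for key in keys if key in transaction}


importer = FidelityImporter({
    "transfer": "Assets:Zero-Sum-Accounts:Transfers",
    "transfer_accounts": {"X99999991": "Assets:Fidelity:Brokerage"},
})
known = importer.target_account({"counterparty": "X99999991"})
unknown = importer.target_account({"counterparty": "Z11111111"})
meta = importer.build_meta({"bond-coupon-rate": "2.75000", "memo": "ignored"})
```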
Before you get too deep into solving transfers, might I suggest considering that v3 approaches these way better, and solves a bunch of these problems neatly, and your efforts here might possibly not port over:
See the deduping article:
Deduping Transactions with Counterparts in Other Accounts [...] Beancount v3 has a proposal to render all this moot by allowing the two halves to be declared separately.
Ah, thanks. Didn't know about get_target_acct_custom and will think about build_meta(). Still a ways to go in migrating my importer over.
I've had a personal concern that having a zerosum account in my setup may impact future visualization/charting of account transfers.
Btw, I'm curious about the above. Would you mind expanding on it so I can understand better?
"Still a ways to go in migrating my importer over."
The first one, of course, will involve the most effort. Subsequent ones should be far easier. If not, do let me know. Good luck!
"I've had a personal concern that having a zerosum account in my setup may impact future visualization/charting of account transfers."
"Btw, I'm curious about the above. Would you mind expanding on it so I can understand better?"
Oh, part of why I've been attracted to dual-entry accounting is that in theory one could visualize a more accurate animated scrub-over-time of accounts, sort of like how things like https://gource.io/ work. Another example, though it has nothing to do with transactional entry, might be https://www.chartfleau.com/ . For personal accounting, it may not be practical compared to a simple stacked bar chart, but it could be useful for more complex accounting (e.g. organizational) scenarios.
In such visualizations, I assume the 'pont' matters as much as the account points, hence my concern around having intermediary accounts to settle transfers.
Another question (related more to bond purchases but could relate to other areas), in addition to generating bond-related metadata I had also generated commodity info. So, for a transaction like
2020-01-01 * " YOU BOUGHT - UNITED STATES TREAS NTS NOTE 2.75000% 01/01/2020"
  bond-accrued-interest: 0.23
  bond-exp-date: 2022-01-01
  bond-coupon-rate: 2.75000
  bond-term-months: 24
  Assets:Fidelity:Brokerage:Cash -998.66 USD
  Assets:Fidelity:Brokerage:USTBill:CUSIP9999999B2 1000 CUSIP9999999B2 {0.9985 USD}
  Equity:Fidelity:Brokerage:RoundingError 0.16 USD
It would also generate this:
2020-01-01 open Income:Fidelity:Brokerage:USTBill:CUSIP9999999B2
2020-01-01 open Assets:Fidelity:Brokerage:USTBill:CUSIP9999999B2
2020-01-01 commodity CUSIP9999999B2
  name: "US Treasury Note 2-year - CUSIP 9999999B2"
  other_meta_info: "asset type, asset_allocation etc."
Largely, having the commodity auto-generated with more descriptive info in the name and other metadata saved me time from trying to manually figure this out afterward. Any idea how to approach generating at least the commodity entry within a custom importer while still following the conventions of beancount_reds_importers? Feels like a question akin to needing a build_meta function.
Just to answer my own question above, I realized we can just add something along the lines of includes_commodities and includes_accounts with a similar pattern to how extract_balances_and_prices is called depending on includes_balances.
Won't lead to items being grouped together in the file itself but that's okay.
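A rough sketch of that flag pattern follows. The class is a self-contained stand-in, not the library's importer, and includes_accounts, includes_commodities, extract_accounts, and extract_commodities are the proposed (hypothetical) names. Directives are built as plain strings here to keep the example free of dependencies.

```python
class BondImporterSketch:
    """Hypothetical importer showing flag-gated extraction steps."""

    includes_balances = False
    includes_accounts = True      # proposed flag, not an existing library setting
    includes_commodities = True   # proposed flag, not an existing library setting

    def __init__(self, account, holdings):
        self.account = account
        self.holdings = holdings  # list of (date, ticker, descriptive name)

    def extract_balances_and_prices(self):
        return []  # placeholder for this sketch

    def extract_accounts(self):
        return [f"{date} open {self.account}:{ticker}"
                for date, ticker, _ in self.holdings]

    def extract_commodities(self):
        return [f'{date} commodity {ticker}\n  name: "{name}"'
                for date, ticker, name in self.holdings]

    def extract(self):
        entries = []
        if self.includes_balances:
            entries += self.extract_balances_and_prices()
        if self.includes_accounts:
            entries += self.extract_accounts()
        if self.includes_commodities:
            entries += self.extract_commodities()
        return entries


importer = BondImporterSketch(
    "Assets:Fidelity:Brokerage:USTBill",
    [("2020-01-01", "CUSIP9999999B2", "US Treasury Note 2-year - CUSIP 9999999B2")],
)
entries = importer.extract()
```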
Missed seeing your question, sorry! Yes, absolutely, that's the way I would do it. Also:
[...] bean-extract, which you can use to determine whether or not the account and commodity already exist or need to be created.
[...] price entries
Hi, went quiet but been spending a bunch of time on this. I'm doing some more ETL to resolve issues, and was wondering if there's an explanation of the conventions you have around the ot.type values and around how you see differences in amount/units/total.
Particularly around the lines in investments.py where for generate_transfer_entry there's:
try:
    if ot.type in ['transfer']:
        units = ot.units
    elif ot.type in ['other', 'credit', 'debit', 'dep', 'cash']:
        units = ot.amount
    else:
        units = ot.total
Does this imply transfer are for asset-transfers only (i.e. not cash) then? What exactly do the dep and cash types mean? (EDIT: okay, realized dep means deposit and assuming cash is for cash transfers; I was just thrown off a bit by transfer implying investment shares.)
And what is the difference between amount and total? I know that generate_trade_entry uses total rather than amount.
"Does this imply transfer are for asset-transfers only (i.e. not cash) then?"
Correct.
"And what is the difference between amount and total? I know that generate_trade_entry uses total rather than amount."
I don't remember for sure off the top of my head. I believe total is primarily for transactions involving non-cash, and amount for those involving cash. The OFX spec is here. However, I've found that banks make deviations, at least in my reading. The code reflects what I've found to work well with most OFX files that I download. Do let me know if you see something amiss.
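To see the branch from the excerpt in action, here is a small standalone sketch. The transfer_units function simply mirrors the quoted if/elif/else, and the sample records are made up. SimpleNamespace stands in for whatever transaction object the reader produces.

```python
from types import SimpleNamespace


def transfer_units(ot):
    """Mirror of the quoted branch: pick the field that carries the quantity."""
    if ot.type in ["transfer"]:
        return ot.units      # asset transfers: number of shares moved
    if ot.type in ["other", "credit", "debit", "dep", "cash"]:
        return ot.amount     # cash-like transactions
    return ot.total          # everything else


# Made-up records; only the field relevant to each type is meaningful.
share_move = SimpleNamespace(type="transfer", units=10, amount=None, total=None)
deposit = SimpleNamespace(type="dep", units=None, amount=250, total=None)
dividend = SimpleNamespace(type="income", units=None, amount=None, total=50)

picked = [transfer_units(t) for t in (share_move, deposit, dividend)]
```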
Thanks, I'll roll with it for now. The only change I think I'm skating towards is breaking out the transaction mapping (rdr = rdr.convert('type', self.transaction_type_map) within convert_columns in csvreader.py) into another method (that can be overridden), as I'm reaching a point where I need more information than just what's in the type column to assign the appropriate action.
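One way that refactor could look, sketched with plain dictionaries instead of the table library the reader actually uses. The class names, the map_transaction_type method, the description column, and the type values are all hypothetical.

```python
class CsvReaderSketch:
    """Stand-in reader: type mapping lives in its own overridable method."""

    # Illustrative values only; real maps depend on the institution.
    transaction_type_map = {
        "DIVIDEND RECEIVED": "dividends",
        "YOU BOUGHT": "buystock",
    }

    def map_transaction_type(self, row):
        # Default behaviour: look only at the type column.
        return self.transaction_type_map.get(row["type"], row["type"])

    def convert_columns(self, rows):
        return [dict(row, type=self.map_transaction_type(row)) for row in rows]


class FidelitySketch(CsvReaderSketch):
    def map_transaction_type(self, row):
        # Override: use a second column to tell bond purchases from stock purchases.
        if row["type"] == "YOU BOUGHT" and "TREAS" in row["description"]:
            return "buydebt"
        return super().map_transaction_type(row)


rows = [
    {"type": "YOU BOUGHT", "description": "UNITED STATES TREAS NTS NOTE"},
    {"type": "YOU BOUGHT", "description": "SOME INDEX FUND"},
    {"type": "DIVIDEND RECEIVED", "description": "SOME INDEX FUND"},
]
mapped = [row["type"] for row in FidelitySketch().convert_columns(rows)]
```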
Sure, that sounds like a good idea. Feel free to send a PR.
@savingsandloan writes: I tend to see vanguard transaction events where two dividends accumulate and sweep into a money market account.
i.e. let's say you have two dividend transactions:
<INCOME><INVTRAN>
<FITID>123888456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>DIVIDEND PAYMENTDIVIDEND PAYMENT</INVTRAN>
<SECID><UNIQUEID>92206C300<UNIQUEIDTYPE>CUSIP</SECID>
<INCOMETYPE>DIV<TOTAL>50.00<SUBACCTSEC>CASH<SUBACCTFUND>CASH</INCOME>
<INCOME><INVTRAN>
<FITID>123777456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>DIVIDEND PAYMENTDIVIDEND PAYMENT</INVTRAN>
<SECID><UNIQUEID>92206C821<UNIQUEIDTYPE>CUSIP</SECID>
<INCOMETYPE>DIV<TOTAL>100.00<SUBACCTSEC>CASH<SUBACCTFUND>CASH</INCOME>
That pool together to result in this purchase of a money-market asset:
<BUYTYPE>BUY</BUYMF><BUYMF><INVBUY><INVTRAN>
<FITID>123999456<DTTRADE>20210301160000.000[-5:EST]<DTSETTLE>20210301160000.000[-5:EST]
<MEMO>MONEY FUND PURCHASE</INVTRAN>
<SECID><UNIQUEID>922906300<UNIQUEIDTYPE>CUSIP</SECID>
<UNITS>150.00<UNITPRICE>1.0<TOTAL>-150.00<SUBACCTSEC>CASH<SUBACCTFUND>
Currently it results in:
2021-03-01 * "MONEY FUND PURCHASE" "[VMFXX] Vanguard Federal Money Market Fund"
  file_account: "Assets:Vanguard:Brokerage"
  Assets:Vanguard:Brokerage:VMFXX 150.00 VMFXX {1.0 USD}
  Assets:Vanguard:Brokerage:USD -150.00 USD

2021-03-01 * "DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VSBSX] Vanguard Short-Term Treasury Index - Admiral Shares"
  Assets:Vanguard:Brokerage:USD 50.00 USD
  Income:Vanguard:Brokerage:VSBSX:Dividends -50.00 USD

2021-03-01 * "DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VLGSX] Vanguard Long-Term Treasury Index - Admiral Shares"
  Assets:Vanguard:Brokerage:USD 100.00 USD
  Income:Vanguard:Brokerage:VLGSX:Dividends -100.00 USD
This is functional, but it feels rather odd to have three transactions, and it requires an imaginary account, Assets:Vanguard:Brokerage:USD, for everything to work. Is it possible for the importer to be a bit smart and glob together these events based on the fact that they all have the same timestamp (20210301160000.000)? The FITIDs are also similar-ish (the first three and last three digits are usually the same between these events, as shown above). Ideally, the generated result would look like:
2021-03-01 * "MONEY FUND PURCHASE & DIVIDEND PAYMENTDIVIDEND PAYMENT & DIVIDEND PAYMENTDIVIDEND PAYMENT" "[VMFXX] Vanguard Federal Money Market Fund & [VSBSX] Vanguard Short-Term Treasury Index - Admiral Shares & [VLGSX] Vanguard Long-Term Treasury Index - Admiral Shares"
  file_account: "Assets:Vanguard:Brokerage"
  Assets:Vanguard:Brokerage:VMFXX 150.00 VMFXX {1.0 USD}
  Income:Vanguard:Brokerage:VSBSX:Dividends -50.00 USD
  Income:Vanguard:Brokerage:VLGSX:Dividends -100.00 USD
Feel free to close this idea if it sounds too risky/complicated, just trying to brainstorm anything that comes to mind.
Hello again! Good question, this is definitely something I considered, and for a while had code to handle it. However, I found that I didn't get anything out of doing this. In general, this falls under the category of drawing "higher level" inferences based on a set of rules. I found this to break rather easily because:
there are always cases one hasn't run into yet and are therefore not encoded as rules in the code
institutions make surprising changes occasionally, which breaks code
rules vary across institutions, making it a pain to maintain these rules
The question I'd go back to is: is there a true benefit to drawing these inferences? The source looks a bit better, but I rarely look at my source or even journal for investments (I do, for expenses); my view of my transactions is either through BQL or fava, and there, I'm looking at aggregates and queries (for investments), which are all agnostic to how the source looks in these cases.
That said, having a post_process() API which calls a user function at the end of extract() would allow each user to write a few lines of code to maintain their own "inference rules" such as this one, if so desired, and I'd be very open to creating that (should be simple).
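As a sketch of what such a user hook could do for the Vanguard sweep case, the code below merges entries that share a timestamp and drops the cash posting when it nets to zero. Everything here is hypothetical: entries are plain dictionaries instead of beancount directives, extract() is a stand-in, cost basis is ignored, and the narrations are shortened.

```python
from decimal import Decimal


def merge_same_timestamp(entries, cash_account):
    """Merge entries sharing a timestamp; drop the cash posting if it nets to zero."""
    groups = {}
    for entry in entries:
        groups.setdefault(entry["timestamp"], []).append(entry)

    merged = []
    for timestamp, group in groups.items():
        if len(group) == 1:
            merged.append(group[0])
            continue
        totals = {}
        for entry in group:
            for account, amount, currency in entry["postings"]:
                key = (account, currency)
                totals[key] = totals.get(key, Decimal("0")) + amount
        postings = [
            (account, amount, currency)
            for (account, currency), amount in totals.items()
            if not (account == cash_account and amount == 0)
        ]
        merged.append({
            "timestamp": timestamp,
            "narration": " & ".join(entry["narration"] for entry in group),
            "postings": postings,
        })
    return merged


def extract(raw_entries, post_process=None):
    """Stand-in for an importer's extract(): run the optional user hook last."""
    entries = list(raw_entries)
    return post_process(entries) if post_process else entries


CASH = "Assets:Vanguard:Brokerage:USD"
TS = "20210301160000.000"
raw = [
    {"timestamp": TS, "narration": "MONEY FUND PURCHASE",
     "postings": [("Assets:Vanguard:Brokerage:VMFXX", Decimal("150.00"), "VMFXX"),
                  (CASH, Decimal("-150.00"), "USD")]},
    {"timestamp": TS, "narration": "DIVIDEND PAYMENT",
     "postings": [(CASH, Decimal("50.00"), "USD"),
                  ("Income:Vanguard:Brokerage:VSBSX:Dividends", Decimal("-50.00"), "USD")]},
    {"timestamp": TS, "narration": "DIVIDEND PAYMENT",
     "postings": [(CASH, Decimal("100.00"), "USD"),
                  ("Income:Vanguard:Brokerage:VLGSX:Dividends", Decimal("-100.00"), "USD")]},
]
result = extract(raw, post_process=lambda es: merge_same_timestamp(es, CASH))
```

Grouping on the timestamp alone is exactly the kind of rule that can break when an institution changes its export, so a hook like this is best kept in user code rather than in the shared importer.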
Transaction Builders — Red's Rants
https://reds-rants.netlify.app/personal-finance/transaction-builders/