smart_importer with bean-extract patch will post to first posting account despite 'filing_account'

redstreet / beancount_reds_importers

Simple ingesting tools for Beancount (plain text, double entry accounting software). More importantly, a framework to allow you to easily write your own importers.

GNU General Public License v3.0

104 stars 37 forks source link

smart_importer with bean-extract patch will post to first posting account despite 'filing_account' #97

Closed scanta2 closed 5 months ago

scanta2 commented 5 months ago

Filing_account has no effect for either banking or investment importers. Using smart_importer together with the patched bean-extract leads to file names related to the first posting in each transaction entry, which is not necessarily the asset account.

scanta2 commented 5 months ago

In addition, investment accounts get their file name from the ticker account, and not from the parent account.

redstreet commented 5 months ago

In addition, investment accounts get their file name from the ticker account, and not from the parent account.

Probably unrelated, would you mind opening a separate bug for this?

redstreet commented 5 months ago

From here.

See this comment. This line should be made the default for all transaction builders.

scanta2 commented 5 months ago

@redstreet That's the temporary change I tried in my local repo, for banking and investment accounts. Would it be better to change the parent class instead?

redstreet commented 5 months ago

Correct. Patch coming in a few minutes.

redstreet commented 5 months ago

Does that fix work?

awtimmering commented 5 months ago

I stepped through the smart_importer code, and it seems to be just a fluke of the prediction (maybe in situations where input data is not very rich yet). I verified the training data does not contain any reversed postings and it outputs it nonethless as result of predictor.py#L234.

We could consider proposing a change to update_postings(transaction, accounts) (entries.py#L6) to honor the original order (i.e. the single posting in transaction to always come in the first place, other accounts to follow), wdyt?

redstreet commented 5 months ago

Good to know nothing in particular caused this. Looks like we (reds-importers) were depending on the ordering so far, that we shouldn't have.

In general, ordering of postings in the Beancount ecosystem is not guaranteed AFAIK, so we shouldn't depend on that, or ask others to, as it would be prone to breakage.

I also like the idea of printing out a sorted order (even though it makes no functional difference), which we can only do if we have a different way of marking the file to write the transaction to.

The above patch I committed yesterday should resolve the problem in a dependable way. Have you tried it?

awtimmering commented 5 months ago

In general, ordering of postings in the Beancount ecosystem is not guaranteed AFAIK, so we shouldn't depend on that, or ask others to, as it would be prone to breakage.

Agree. I think regardless of the fix on your side (which is good) it might be a good suggestion for smart_importers, so will suggest & add PR. There might be other considerations, but the current result ends up with postings that are hard to read, even if functionally not different, as the "remainder" entry can come in first place.

For example, an import for a bank transaction might end up like this:

2024-04-04 * "Bank payment A"
  Expenses:Groceries
  Assets:MyBank:Savings         -100 USD

The above patch I committed yesterday should resolve the problem in a dependable way. Have you tried it?

Sorry to say I haven't, as for a mix of reasons I am not using your importers (yet, anyway) - even though I have drawn a lot of inspiration from your posts and scripts! (My Japanese bank CSVs don't have headers, then my Japanese Amex CSV does have headers but has multiple sections for family cards that require some custom parsing etc).

scanta2 commented 5 months ago

Good to know nothing in particular caused this. Looks like we (reds-importers) were depending on the ordering so far, that we shouldn't have.

In general, ordering of postings in the Beancount ecosystem is not guaranteed AFAIK, so we shouldn't depend on that, or ask others to, as it would be prone to breakage.

I also like the idea of printing out a sorted order (even though it makes no functional difference), which we can only do if we have a different way of marking the file to write the transaction to.

The above patch I committed yesterday should resolve the problem in a dependable way. Have you tried it?

Thank you for the fix, it works well on my side, and it also fixes #98.

redstreet commented 5 months ago

Thank you for the fix, it works well on my side, and it also fixes #98.

Great! Thanks for reporting.

redstreet commented 5 months ago

Agree. I think regardless of the fix on your side (which is good) it might be a good suggestion for smart_importers, so will suggest & add PR. There might be other considerations, but the current result ends up with postings that are hard to read, even if functionally not different, as the "remainder" entry can come in first place.

+1. I too find a lot of value in friendly posting ordering. After all, a core benefit of plain text accounting is the journal being human readable. Thanks for filing this with smart_importer.

redstreet commented 5 months ago

Sorry to say I haven't, as for a mix of reasons I am not using your importers (yet, anyway) - even though I have drawn a lot of inspiration from your posts and scripts! (My Japanese bank CSVs don't have headers, then my Japanese Amex CSV does have headers but has multiple sections for family cards that require some custom parsing etc).

Glad to hear the articles helped inspire! The process of writing those articles actually helped me get my 5-minute update down to just a few seconds, reliably. So it's well worth it IMHO.

Regardless of whether you use my importers: if you don't have headers, petl makes it trivial to inject them. Takes just a line of code.

Does your Amex contain multiple tables? If so, check out the multitable reader in my importers, either for use or for inspiration. Also based on petl.

awtimmering commented 5 months ago

Added issue & PR at https://github.com/beancount/smart_importer/issues/128 .

Thank you for the comments - I've been looking at petl and am keen to try it out, so will give it a try, as well as take another look at your importers. (And haven't started yet really undertanding the zsh scripts -- never touched that shell so far but seems I should give it a try!)