simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0

csv: use existing journal entries as a source of implicit rules #2172

Open simonmichael opened 4 months ago

simonmichael commented 4 months ago

As discussed at https://www.reddit.com/r/plaintextaccounting/comments/1arkzfg/can_hledger_import_use_account_mappings_from/ :

hledger's CSV conversion rules look at each record in isolation, with no memory (except what you have encoded in the static rules). ... I agree this (inferring accounts, and perhaps other characteristics, from similar-looking existing journal entries) is an interesting idea and could potentially work well, though it would also make conversion less predictable. Here is Ledger's convert command (https://ledger-cli.org/doc/ledger3.html#The-convert-command), which I assume you're referring to? I can imagine that feeling you must write a rule for every description/payee is a bit unpleasant. I have never done that myself; it has been a more incremental process of adding a few rules each time, according to the latest transactions and my motivation for detailed categories, clean descriptions, etc. If someone is bulk converting a huge backlog of CSV, with high goals for categorising and cleaning, I can see it's a bigger job. Though in such cases, past journal entries might be nonexistent, or too different to help much. hledger's if table rules, possibly programmatically generated, might be some help.
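As a rough illustration of "programmatically generated" if table rules, here is a minimal Python sketch. The function name `make_if_table` and the example patterns and accounts are hypothetical. The layout assumes the comma-separated `if,FIELD` header, one `MATCHER,VALUE` row per rule, and a terminating blank line described in the hledger CSV rules documentation.

```python
def make_if_table(mapping):
    """Render {description pattern: account} as an hledger `if` table block.

    Assumed layout (check against your hledger version's docs):
        if,account2
        PATTERN,ACCOUNT
        ...
        <blank line ends the table>
    """
    lines = ["if,account2"]
    for pattern, account in mapping.items():
        # The comma is the field separator here, so a literal comma in a
        # pattern or account name would break the table.
        if "," in pattern or "," in account:
            raise ValueError(f"comma not allowed in table cell: {pattern!r}")
        lines.append(f"{pattern},{account}")
    return "\n".join(lines) + "\n\n"


# Hypothetical mapping, e.g. collected by hand or from a spreadsheet.
rules = make_if_table({
    "whole foods": "expenses:groceries",
    "shell oil": "expenses:fuel",
})
print(rules, end="")
```

The generated text could be pasted into a rules file, or written to a separate file that the main rules file pulls in.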

Apparently Ledger does this and it's useful, so we should try it. Ideally, I guess, it means you write just minimal CSV rules, then manually fix the uncategorised transactions each time you import. Pretty soon you don't have to do that much any more; you're always doing that one thing rather than switching context to edit rules, and the results are good and sufficiently reliable, at least for straightforward categorising cases.
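To make the idea concrete, here is a minimal Python sketch of inferring an account from similar-looking past entries. It is not how Ledger or hledger implement anything. The function `infer_account`, the 0.6 similarity threshold, the `expenses:unknown` fallback, and the representation of the journal as plain (description, account) pairs are all assumptions for illustration. A real implementation would have to parse the journal and decide which posting to copy.

```python
from difflib import SequenceMatcher


def infer_account(description, history, threshold=0.6,
                  default="expenses:unknown"):
    """Return the account of the most similar past description.

    history: list of (description, account) pairs taken from existing
    journal entries (simplified; hypothetical representation).
    Falls back to `default` when nothing is similar enough.
    """
    best_score, best_account = 0.0, default
    for past_desc, account in history:
        # Case-insensitive similarity ratio between 0.0 and 1.0.
        score = SequenceMatcher(
            None, description.lower(), past_desc.lower()
        ).ratio()
        if score > best_score:
            best_score, best_account = score, account
    return best_account if best_score >= threshold else default


# Hypothetical past entries, already categorised by hand.
history = [
    ("WHOLE FOODS MARKET #123", "expenses:groceries"),
    ("SHELL OIL 5567", "expenses:fuel"),
]

print(infer_account("WHOLE FOODS MARKET #456", history))
print(infer_account("NETFLIX.COM", history))
```

The threshold is where the predictability trade-off mentioned above shows up: set it too low and unrelated records get confidently miscategorised; set it too high and most records fall through to the default and need manual fixing.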

Details to be clarified: