jbms / beancount-import

Web UI for semi-automatically importing external data into beancount
GNU General Public License v2.0
395 stars 103 forks

Pluggable/deterministic processing feature request #241

Open DarrenRiedlinger opened 4 weeks ago

DarrenRiedlinger commented 4 weeks ago

Would you be open to a feature or PR that would provide hooks for user-defined deterministic processing of matched transactions, thereby skipping the UI review? This would be similar to the need in #229 but extend it further.

In my use case, I have tons of transactions that are easily and reliably handled by a deterministic approach, but which take significant time to review. Many of these also end up as split transactions that aren't easily handled in the UI--but are easily handled in separate, deterministic code. However, they still benefit from the source importers and the existing/duplicate transaction removal provided by this project. I'm currently handling this with an initial pre-import step in my launch script. That pre-import script 1) directly instantiates the Reconciler object, 2) gets its computed list of pending_entries, 3) does some custom matching/processing on those pending_entries, and 4) writes just the transactions it matched/processed out to the ledger with appropriate postings and metadata. I can then restart everything using the normal web UI script, and any transactions I previously handled automatically get removed as existing transactions. This works, but seems a bit inelegant.
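For illustration, steps 3 and 4 above can be sketched without any beancount-import code at all. Everything here is an assumption made for the example: `PendingTxn` is a hypothetical stand-in for whatever the reconciler actually yields, and `split_paycheck` and `preprocess` are made-up names, not part of the project.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, List, Optional, Tuple

Posting = Tuple[str, Decimal]


@dataclass
class PendingTxn:
    """Hypothetical stand-in for one pending entry from the reconciler."""
    narration: str
    account: str
    amount: Decimal
    postings: List[Posting] = field(default_factory=list)


# A rule returns the balancing postings for a transaction it recognizes,
# or None to leave the transaction for manual review.
Rule = Callable[[PendingTxn], Optional[List[Posting]]]


def split_paycheck(txn: PendingTxn) -> Optional[List[Posting]]:
    """Example rule: split a known payroll deposit into salary and tax."""
    if "ACME PAYROLL" not in txn.narration:
        return None
    tax = Decimal("500.00")  # illustrative fixed withholding
    return [("Income:Salary", -(txn.amount + tax)), ("Expenses:Taxes", tax)]


def preprocess(
    pending: List[PendingTxn], rules: List[Rule]
) -> Tuple[List[PendingTxn], List[PendingTxn]]:
    """Apply the first matching rule to each transaction.

    Returns (handled, remaining): handled transactions carry complete,
    balanced postings; remaining ones still need review in the UI.
    """
    handled, remaining = [], []
    for txn in pending:
        for rule in rules:
            extra = rule(txn)
            if extra is None:
                continue
            postings = [(txn.account, txn.amount)] + extra
            if sum(amount for _, amount in postings) != 0:
                raise ValueError(f"rule produced unbalanced postings: {txn}")
            txn.postings = postings
            handled.append(txn)
            break
        else:
            remaining.append(txn)
    return handled, remaining
```

Only the `handled` list would be written to the ledger; the `remaining` list is what the web UI would still present for review.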

That said, even if you're open to a PR and I'm the one writing it, I'm a bit lost on how best to integrate that into the existing code, so I would need some pointers unless someone else takes it up. I.e., wrapping loaded_reconciler.pending_entries with a hook for some user-defined deterministic processing seems straightforward. But I'm a bit lost on how the change staging is done within journal_editor.py, and on how to cleanly write the matched/custom-processed transactions directly to the ledger without messing anything else up. I'm currently just printing directly to the output file selected from reconciler.entry_file_selector using beancount.parser.printer.print_entry(entry), but I'm not sure that would play nicely with the rest of the project code. Thanks.
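A minimal sketch of the direct-append approach described above, kept independent of beancount so it runs on its own. `append_entries` is a hypothetical helper, and the `render` callable is an assumption: in real use it could be something like `beancount.parser.printer.format_entry`, with the filename for each entry coming from the file selector mentioned above. Note that this writes straight to disk and does not go through any staging in journal_editor.py, which is exactly the open question.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple, TypeVar

E = TypeVar("E")


def append_entries(
    entries: Iterable[Tuple[str, E]],
    render: Callable[[E], str],
) -> Dict[str, int]:
    """Append rendered entries to their target files.

    `entries` yields (filename, entry) pairs. Entries are grouped so each
    file is opened once, and every entry is preceded by a blank line so
    consecutive entries stay visually separated in the ledger.
    Returns the number of entries written per file.
    """
    by_file: Dict[str, List[str]] = defaultdict(list)
    for filename, entry in entries:
        by_file[filename].append(render(entry))

    written: Dict[str, int] = {}
    for filename, chunks in by_file.items():
        with open(filename, "a", encoding="utf-8") as f:
            for chunk in chunks:
                f.write("\n" + chunk.rstrip("\n") + "\n")
        written[filename] = len(chunks)
    return written
```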

Zburatorul commented 3 weeks ago

Hi @DarrenRiedlinger, I would welcome such a feature at least because I too have many deterministic transactions. I am not familiar with the part of the logic you have questions on. I am happy to review PRs and otherwise test.

DarrenRiedlinger commented 3 weeks ago

Thanks! I'll take a stab at it and submit a PR when I have a chance.

Zburatorul commented 2 weeks ago

Can you share your branch in the meantime? I'd love to take a look.

DarrenRiedlinger commented 2 weeks ago

Sure. I pushed what I currently have here: https://github.com/DarrenRiedlinger/beancount-import/tree/issue-241. I still need to finish the tests, but would appreciate any feedback on the approach. I'm currently applying the user transaction preprocessing right as each transaction is imported, since that's where some of the existing filtering was already occurring. That works fine as long as the preprocessing can be done solely on a single imported transaction.

However, as I've been thinking about it more, it could ultimately be more flexible to wait and apply the custom preprocessing after all transactions have been imported, as that would allow more advanced custom logic based on all cleared + pending transactions. I.e., the basic transaction processing could still work the same, but you could include a reference to the reconciler when calling each custom transaction preprocess. That custom preprocess could then implement its own more complicated matching among all the existing transactions + those being imported. It's not something I currently have a need for, but it would make the custom preprocessing interface more flexible down the line.
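To make the "pass a reference to the reconciler" idea concrete, here is a rough sketch of a hook that looks across all cleared and pending transactions. The `Context` class is a hypothetical stand-in for whatever view of the reconciler would actually be passed in, and `Txn` and `find_transfer_counterpart` are illustrative names only, not anything in the branch.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional, Sequence


@dataclass(frozen=True)
class Txn:
    """Hypothetical single-account view of a transaction."""
    date: date
    account: str
    amount: Decimal


@dataclass
class Context:
    """Hypothetical view of reconciler state handed to each hook."""
    cleared: Sequence[Txn]
    pending: Sequence[Txn]


def find_transfer_counterpart(
    txn: Txn, ctx: Context, window_days: int = 3
) -> Optional[Txn]:
    """Find the other side of a transfer among cleared + pending entries.

    A counterpart is in a different account, has the exact opposite
    amount, and is dated within `window_days` of `txn`. Cleared entries
    are searched first; the first match wins.
    """
    for other in list(ctx.cleared) + list(ctx.pending):
        if other is txn or other.account == txn.account:
            continue
        same_size = other.amount == -txn.amount
        close_in_time = abs((other.date - txn.date).days) <= window_days
        if same_size and close_in_time:
            return other
    return None
```

A per-transaction hook that only sees one imported entry cannot do this kind of matching, which is the flexibility the post-import approach would buy.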