mbafford / plaid-sync

Command-line interface to the Plaid API that synchronizes your bank/credit card transactions with a local SQLite database. Written in Python 3.
MIT License

convenience table or view with queryable transaction data #3

Open ak2k opened 2 years ago

ak2k commented 2 years ago

Since the items and transactions tables currently store much of the useful data in a JSON object, would you be open to additionally exposing those data elements more directly as columns, for query convenience?

It seems this could be done without data duplication, either with views over the current tables or by adding 'GENERATED ALWAYS' columns (https://www.sqlite.org/gencol.html) to the source tables. In either case it would introduce a dependency on the JSON1 extension's json_extract function, to the extent there isn't one already.
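
For example, roughly like the sketch below (the table name, the `json` column, and the database path are assumptions for illustration, not necessarily plaid-sync's actual schema):

```python
import sqlite3

# Sketch only: assumes the raw Plaid payload sits in a column named `json`
# on a `transactions` table; the actual plaid-sync schema may differ.
conn = sqlite3.connect("plaid.db")  # hypothetical database path
conn.executescript("""
    CREATE VIEW IF NOT EXISTS transactions_flat AS
    SELECT
        json_extract(json, '$.transaction_id') AS transaction_id,
        json_extract(json, '$.account_id')     AS account_id,
        json_extract(json, '$.date')           AS txn_date,
        json_extract(json, '$.name')           AS name,
        json_extract(json, '$.amount')         AS amount
    FROM transactions;

    -- Generated-column alternative (also no data duplication); note that
    -- ALTER TABLE can only add VIRTUAL (not STORED) generated columns:
    -- ALTER TABLE transactions ADD COLUMN amount REAL
    --     GENERATED ALWAYS AS (json_extract(json, '$.amount')) VIRTUAL;
""")
conn.close()
```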

Although you may already have the SQL statements (for your own downstream use), I would be happy to write and send them if you are open to the above.

Thanks!

mbafford commented 2 years ago

As long as the JSON is still available (since my workflow[1] depends on it), I'm open to changes like that - sounds like a useful improvement. I'd much rather have a single widely useful project than people needing to wade through a bunch of forks.

I'm not in a position to make changes to this project right now, but I'm happy to welcome any PRs that don't break existing (reasonable) workflows and seem generally useful.

The generated columns support is really cool. I don't know how I never encountered that in SQLite - I'm a huge proponent of SQLite in places other people would turn to a "real database" for querying purposes, but usually generate them from something else (like an Athena query) - and I can think of a lot of cases where this would have been useful. Thanks for the tip!

[1] I personally just use the data as a straight select * query in an importer for https://github.com/jbms/beancount-import and parse the JSON in Python, so I've never really cared that much about accessing the data from SQL queries - and when I do, I just use the json_extract method.
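
In rough form, that pattern looks something like the sketch below (simplified; the `transactions` table, `json` column, and database path are assumptions here rather than the actual plaid-sync schema):

```python
import json
import sqlite3

# Simplified sketch: select everything, then parse the stored Plaid
# transaction JSON in Python instead of querying fields in SQL.
conn = sqlite3.connect("plaid.db")  # hypothetical database path
conn.row_factory = sqlite3.Row

for row in conn.execute("SELECT * FROM transactions"):
    txn = json.loads(row["json"])  # raw Plaid payload assumed in a `json` column
    print(txn.get("date"), txn.get("name"), txn.get("amount"))

conn.close()
```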

Xadoy commented 10 months ago

Sorry for the side topic, but do you mind explaining how your workflow actually works? I am using beancount and beancount-import as well. Are you implementing the Source interface by providing the JSON? Thanks in advance!

mbafford commented 9 months ago

@Xadoy Here's my custom source for beancount-import that reads the sqlite database this utility creates:

https://gist.github.com/mbafford/ff2ece412ce3188f9eab692b3823e917

Hacked together from other beancount-import sources as the first source I wrote. There are likely plenty of improvements to make in it, but I've basically left it untouched for years with no issues.