isaacharrisholt / quiffen

Quiffen is a Python package for parsing QIF (Quicken Interchange Format) files.
MIT License
34 stars 30 forks source link

Splits validator not robust to many splits with rounding error #43

Closed quenette closed 1 year ago

quenette commented 1 year ago

When transactions have a sufficient number of splits, the two-decimal point rounding error will be big enough to exceed the 0.01 tolerance set in the validator (line 172 / transaction.py / current head). My hack is to change the tolerance to 0.5 (taking the view that 100.5 rounds up to 101%, which is not 100%, and it would take 100 splits with a rounding error to get there). A more robust fix would be to treat the percentage as a float. But before changing any code and submitting, and noting you may fix this yourself, I thought I'd ask you all your thoughts - what / which approach would you prefer?

isaacharrisholt commented 1 year ago

I've been thinking about this, and the solution doesn't come to mind immediately. I'm hesitant to change to a float for compatibility reasons.

Perhaps a global flag either changing the tolerance or enabling/disabling that check could work, but that might be adding unnecessary complexity.

Raising the tolerance is probably my preferred solution for now, though it would be worth pulling the tolerance out into a constant since it's used in a couple of places. If you're happy to make this change, that would be much appreciated!

quenette commented 1 year ago

Thanks for the prompt consideration Isaac.

I’m noticing a few other little things that might be of a similar nature. I might spend time over the holidays exploring more and report back.

Steve

On 21 Dec 2022 at 8:31:17 pm, Isaac Harris-Holt @.***> wrote:

I've been thinking about this, and the solution doesn't come to mind immediately. I'm hesitant to change to a float for compatibility reasons.

Perhaps a global flag either changing the tolerance or enabling/disabling that check could work, but that might be adding unnecessary complexity.

Raising the tolerance is probably my preferred solution for now, though it would be worth pulling the tolerance out into a constant since it's used in a couple of places. If you're happy to make this change, that would be much appreciated!

— Reply to this email directly, view it on GitHub https://github.com/isaacharrisholt/quiffen/issues/43#issuecomment-1361069115, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKA22C2DOVA7GGJDNLB2KTWOLE6LANCNFSM6AAAAAATFI3RNU . You are receiving this because you authored the thread.Message ID: @.***>

isaacharrisholt commented 1 year ago

Thanks, that's awesome!

It's always great to have people look over my crappy code 😂

isaacharrisholt commented 1 year ago

Hey, did you manage to find any more issues, or think of a potential solution to the problem?

quenette commented 1 year ago

Although I I haven’t tried it, another quick way to fix the prior problem is to change the Decimal in this case alone to have several digits more accuracy (e.g. 4 or 5 instead of 2)

The only other place I have an issue is in the category parsing code. Banktivity uses the field to host both the categories and the tags. And the meaning of ‘/‘ hence changes. My clone of the repos comments out the “if '/' in category_string:” statement. I haven’t settled on any advice here yet. The simplest would be to say “quiffen” just pass me the categories un parsed. However a better answer will be to know the field bundles both categories and tags (both are structurally the same).

Superannuation accounts don’t load either (unknown type somewhere), but haven’t yet looked into it.

Other than those three, I am able to parse a vast number of transactions, and do the ML that is actually my interests here.

As its only a part-time thing for me, it will take a bit of time to get around to changing quiffen itself.

S

On 4 Feb 2023 at 8:12:12 pm, Isaac Harris-Holt @.***> wrote:

Hey, did you manage to find any more issues, or think of a potential solution to the problem?

— Reply to this email directly, view it on GitHub https://github.com/isaacharrisholt/quiffen/issues/43#issuecomment-1416704642, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKA22CO3TKYL4JNENUF3ELWVYMOZANCNFSM6AAAAAATFI3RNU . You are receiving this because you authored the thread.Message ID: @.***>

isaacharrisholt commented 1 year ago

Awesome, well I'm glad to know it's mostly working okay for you. It's unlikely that I'll put in the work to support categories and tags in the same line in that way, as the / character represents classes according to the specification.

I'll take another look at fixing the rounding error, but it might be helpful if you could provide an example transaction or two for me to test against.

isaacharrisholt commented 1 year ago

Closing due to inactivity.