akopulko / ffiiitc

FireFly III Transactions Classifier
MIT License
37 stars 9 forks

Limit the number of initial transactions for training #5

Open akopulko opened 1 year ago

akopulko commented 1 year ago

Right now ffiiitc takes all transactions and trains on them. This is not efficient and may be overkill, as most transactions repeat from month to month for the average person. The idea is to limit the number of transactions to 2-3 months of data when performing initial training.

Jack-XHP commented 11 months ago

Actually, I don't think training the classifier is computationally expensive. My Firefly has about 1.5 years of transactions from all my banks, investments, etc. (about 10000 transactions), and training the classifier takes less than 5 seconds on an i5 8400 CPU.

akopulko commented 11 months ago

Yeah, even on my M1 MacBook it seems OK with lots of transactions, but we need a bigger user base to understand whether limits are required or not. This is more of a long-term enhancement.

Jack-XHP commented 10 months ago

OK, I think letting users set a time range would also be a good way. Both a date range and a limit number are supported by the Firefly III API.

d-tork commented 1 month ago

My firefly instance has 14 years of data. Training the model goes through ~220 pages and takes some time, enough that it's something I start and then go do something else for a half hour. But I don't think my computer specs matter in the discussion.

This is a good suggestion simply because data spanning very long periods of time (like mine) is going to change over time: I may have switched banks and credit cards, merchants may have changed the way their transactions are labeled, or (in my case) many years of data had human-entered descriptions and now they are auto-generated by the credit card company.

Merely a command-line flag or an option in the config to "retrain based on last N months" would be sensible and maybe even easy, though that's not for me to say since I'm not volunteering to make the PR right now. Though this would be as good a time as any for me to learn Go and familiarize myself with the Firefly API!