beancount / smart_importer

Augment Beancount importers with machine learning functionality.
MIT License
248 stars 29 forks source link

Empty narration but payee leads to wrong posting #39

Closed tbm closed 5 years ago

tbm commented 6 years ago

I editedtests/predict_postings_test.py to add some more test cases. I added the following to test an empty narration:

                2017-01-12 * "Uncle Boons" ""
                  Assets:US:BofA:Checking  -27.00 USD

It predicted Expenses:Food:Groceries even though looking at the training data it seems clear to me that the result should be Expenses:Food:Restaurant.

tbm commented 6 years ago

I've never used machine learning but this post suggests that you should not assign weights manually: https://stackoverflow.com/questions/38034702/how-to-put-more-weight-on-certain-features-in-machine-learning

Removing the weight assignment makes my test case work.

tbm commented 6 years ago

that is, predict_postings works. predict_payees fails after removing weights.

tbm commented 6 years ago

With this change, removing all weights also works for payee:

-                        ('payee', Pipeline([  # any existing payee, if one exists
-                            ('getPayee', ml.GetPayee()),
-                            ('vect', CountVectorizer(ngram_range=(1, 3))),
-                        ])),

I'm not sure if this has any negative effects though. It seems a bit weird that removing this makes it work. Then again how can the payee be used to match the payee? If the payee is set in test_data, it's taken as is anyway (which seems wrong to me).

Anyway, I've no idea what I'm doing. I'm just tapping around in the dark ;)

tbm commented 6 years ago

https://gist.github.com/tbm/ff5fa6f2f845457afbf86f0ed6f2ea55

johannesjh commented 6 years ago

Removing weights: I think we set weights based on a subjective evaluation of the relative importance of input data. We certainly did no formal evaluation. I remember seeing examples online for how to tune the weights. I googled for it and found these:

...that is probably an approach we could take, without too much effort. If the grid search takes a long time, we could persistently cache optimized weights for a given importer configuration. I think this would certainly be an improvement.

Note: I have not considered optimized weights to be particularly important though. I previously tried reckon and found it worked well despite the much simpler Bayes'ian classifier used in its implementation.

Payee as input for classifying payees: I think this does make sense in situations where the intended output is similar but not identical to training data. For example, the output of an importer may include payee fields with proper payee information in a messy format. A classifier can be trained based on this information to suggest and/or predict cleanly formatted and consistent payees.

Failing test case: Can this be due to the very small training data set?

johannesjh commented 5 years ago

The recent refactoring by @yagebu now made it very easy to tune weights as you wish. I guess this mostly solves this issue.

@tbm: Could you try the current version please and see if it works for you.

tbm commented 5 years ago

Yeah, adding a test case with an empty narration works now.