srendle / libfm

Library for factorization machines
GNU General Public License v3.0

Option to Save Predictive Model #1

Closed comadan closed 9 years ago

comadan commented 10 years ago

It would be great to have an option to save the predictive model after training. This way a trained model could be applied to a number of test sets without having to retrain.

thierry-silbermann commented 10 years ago

It is a hidden feature. You can save the model weights with the method saveToBinaryFile(filename) and load them with loadFromBinaryFile(filename). Both are in src/util/matrix.h.

Shouldn't be too hard.

Tell me if you need any more help.
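To make the idea concrete, here is a minimal sketch of a binary save/load round trip for a weight vector. It is not libFM's code: save_weights and load_weights are hypothetical helpers written for illustration, and the raw file layout (values only, no header) is an assumption of this sketch. A real factorization machine also has a bias term and a factor matrix, which would need the same treatment.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helpers, NOT libFM's API: they only illustrate dumping a
// weight vector to a binary file and reading it back.
void save_weights(const std::vector<double>& w, const std::string& filename) {
    std::ofstream out(filename, std::ios::binary);
    if (!out) throw std::runtime_error("cannot open " + filename + " for writing");
    out.write(reinterpret_cast<const char*>(w.data()),
              static_cast<std::streamsize>(w.size() * sizeof(double)));
}

// The caller must already know how many weights to expect, because this
// sketch stores raw values only (no header with the dimension).
std::vector<double> load_weights(const std::string& filename, std::size_t n) {
    std::ifstream in(filename, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + filename + " for reading");
    std::vector<double> w(n);
    in.read(reinterpret_cast<char*>(w.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    if (static_cast<std::size_t>(in.gcount()) != n * sizeof(double))
        throw std::runtime_error("file is shorter than expected");
    return w;
}
```

Because the dimension is not stored in the file, the loader has to be told the number of weights, which is one reason the number of features must match between training and later prediction.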

comadan commented 10 years ago

Thank you!

This is an incredible piece of machine learning kit by the way. Thank you for your work on this!!!

Dan

(Sent by email in reply to thierry-silbermann's comment of Oct 1, 2014, 12:27 PM, shown above.)

comadan commented 10 years ago

Is there a way to access this method from the command line interface?

Thanks!

Dan


thierry-silbermann commented 10 years ago

I didn't write any of the code in this repository, so you should thank Steffen Rendle for that. I'm just one of his PhD students using it.

For now there is no way to access it from the command line interface, and I don't think that is planned to change, as it may affect a lot of things.

After thinking about it more, making new predictions from a saved model might not be that easy. For your purpose, you have to be careful. The issues that come to mind are:

1) Importing a new test set might not work that easily, because all the parameters are created the first time, during training (using the train and test files). If you save, reload, and import new test data, you risk creating new parameters that were not initialized the first time, and you will run into problems. (Just be sure that you have the same number of features each time.)

2) If you are using the MCMC method, it's completely different, because then you need to save the parameters at each iteration (each chain) and not just the last one...
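The check suggested in point 1 could look roughly like the sketch below. It is only an illustration under assumptions: SparseRow and test_set_fits_model are hypothetical names, not part of libFM, and the test set is assumed to be already parsed into lists of zero-based feature indices.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sparse row: indices of the non-zero features of one test case.
using SparseRow = std::vector<std::size_t>;

// Returns true if every feature index in the new test set already has a
// parameter in the saved model, i.e. index < num_model_features.
bool test_set_fits_model(const std::vector<SparseRow>& test_rows,
                         std::size_t num_model_features) {
    for (const SparseRow& row : test_rows)
        for (std::size_t idx : row)
            if (idx >= num_model_features) return false;
    return true;
}
```

Running such a check before predicting would catch a test file that refers to features the reloaded model has no parameters for.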

I think with some work you can do it, but it's not going to be that simple in fact... For now, training a model each time is the simpler (but slower) way.

One more thing that can help is to use a seed, so that you can replicate your experiment and not get different results each time you train a model. (I'll open a pull request for that soon.)
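As a rough illustration of why a seed helps, the sketch below draws initial parameter values from a normal distribution with a fixed seed. The function init_factors and its parameters are hypothetical and are not how libFM initializes its model; the point is only that the same seed gives the same starting values on the same platform and standard library.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical initializer: n values drawn from N(0, stdev^2) using a
// generator seeded explicitly, so repeated calls with the same seed agree.
std::vector<double> init_factors(std::size_t n, double stdev, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, stdev);
    std::vector<double> v(n);
    for (double& x : v) x = dist(gen);
    return v;
}
```

Two calls such as init_factors(8, 0.1, 42) return identical vectors, whereas seeding from the clock would give a different start, and therefore a slightly different trained model, on every run.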

twowheeler commented 10 years ago

For classification, how are the scores converted into probabilities? Using the parameter outputs, I have a score of -0.401806. LibFM converts this into a probability of 0.345043, which is close to, but not exactly, 0.343913, the value of the CDF of the normal distribution with mean 0 and standard deviation 1.

In the fm_learn_mcmc_simultaneous.h file, the code appears to do the conversion simply by taking p = cdf_gaussian(p);

Many thanks.
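For reference, the standard normal CDF value quoted above can be reproduced with a few lines. This is a sketch using the standard library's erfc; normal_cdf is a name chosen here and is not libFM's cdf_gaussian.

```cpp
#include <cmath>

// Standard normal CDF: Phi(x) = 0.5 * erfc(-x / sqrt(2)).
double normal_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}
```

Evaluating normal_cdf(-0.401806) gives approximately 0.343913, matching the figure in the question. It does not by itself explain the small gap to the 0.345043 that LibFM reports, which this thread leaves open.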

thierry-silbermann commented 10 years ago

Can you give me more details? (For example, which options are you using, and what do you mean by parameter outputs?)

And for questions, please try to post here instead: https://groups.google.com/forum/#!forum/libfm

twowheeler commented 10 years ago

Thanks for responding so quickly. I've posted the question, with more details, in the Google Group.

Cheers