srendle / libfm

Library for factorization machines
GNU General Public License v3.0

some confusion #36

Open TK-blost opened 5 years ago

TK-blost commented 5 years ago

Hello, I am new to libFM, and it is a great tool. I used MCMC to train a CTR model and ran into two problems:

1) The data has 160 million features. When init_V is small, such as 0.001 or 0.005, training seems normal and the AUC is 0.6-0.7. But when I set init_V to 0.1 or 0.5, the result is just like 0,0,3333,1... I hope you can give me some advice.

2) I saw "if mcmc save model it will have to save param every iter". Why is it not OK to save only the parameters of the last iteration? And why is the final y_predict the average over every iteration?

I hope to receive a reply. Thanks!

srendle commented 5 years ago

About 2: SGD and ALS are point estimators that find the "best" model parameters, so it is reasonable to use the parameters of the last iteration. MCMC is a sampling method that finds many probable model parameters. Just as it is not a good idea to use a single decision tree out of a random forest, taking one of the MCMC models won't give a good prediction. Instead, the MCMC models produced by each "iteration" should be used collectively.

About 1: Can you give some more details on what you mean by "the result just like 0,0,3333,1..."? Is this the AUC or several predictions of the CTR model?
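To illustrate the point about using the MCMC models collectively, here is a minimal sketch. It is not libFM code: the toy linear classifier, the names `true_w`, `draws`, and `predict`, and the Gaussian noise standing in for sampler output are all assumptions made for illustration. It compares averaging the per-draw predictions against relying on individual draws.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Toy linear classifier: predicted click probability for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

random.seed(0)
true_w = [0.8, -0.5, 0.3]   # hypothetical "true" parameters
x = [1.0, 2.0, -1.0]        # one hypothetical test case
target = predict(true_w, x)

# Stand-in for MCMC output: each "iteration" yields one plausible parameter
# vector scattered around true_w (a real sampler would produce these draws).
draws = [[wi + random.gauss(0.0, 0.5) for wi in true_w] for _ in range(200)]

per_draw = [predict(w, x) for w in draws]
averaged = sum(per_draw) / len(per_draw)   # prediction averaged over all draws

# Squared error of the averaged prediction vs. the average squared error
# you would get by picking a single draw.
sq_err_averaged = (averaged - target) ** 2
mean_sq_err_single = sum((p - target) ** 2 for p in per_draw) / len(per_draw)

print(f"averaged prediction error:   {sq_err_averaged:.6f}")
print(f"typical single-draw error:   {mean_sq_err_single:.6f}")
```

Because squared error is convex, the error of the averaged prediction can never exceed the mean error of the individual draws, which is why keeping only the last iteration's parameters throws away most of the benefit.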

TK-blost commented 5 years ago

Oh, thank you so much for your reply. About 1: it is the output file of the predictions for TASK_CLASSIFICATION. I may have found the reason: I have a lot of features whose values are too large, such as several thousand, which makes cache_e for pre_y too large if init_stdev is large. Thank you again for your reply.
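The suspected cause can be checked with a small sketch. This is not libFM's implementation: it only evaluates the standard pairwise term of the factorization machine model equation on made-up feature values (`x_raw`, `init_factors`, and `fm_pairwise` are hypothetical names), assuming the factors are drawn from a zero-mean Gaussian with the given standard deviation.

```python
import random

def fm_pairwise(x, V):
    """Pairwise FM term: 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)."""
    total = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += 0.5 * (s * s - s_sq)
    return total

def init_factors(n, k, stdev, seed=0):
    """Draw an n x k factor matrix from N(0, stdev^2); same seed, same pattern."""
    rng = random.Random(seed)
    return [[stdev * rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]

x_raw = [3000.0, 4500.0, 1.0, 2500.0]      # hypothetical unscaled feature values
x_scaled = [xi / 4500.0 for xi in x_raw]   # divided by the max, so values lie in [0, 1]

small = fm_pairwise(x_raw, init_factors(4, 8, 0.001))
large = fm_pairwise(x_raw, init_factors(4, 8, 0.5))
large_scaled = fm_pairwise(x_scaled, init_factors(4, 8, 0.5))

print(f"raw features,    stdev 0.001: {small:.4g}")
print(f"raw features,    stdev 0.5:   {large:.4g}")
print(f"scaled features, stdev 0.5:   {large_scaled:.4g}")
```

The pairwise term grows with the square of both the initial standard deviation and the feature magnitude, so raw values in the thousands combined with a standard deviation of 0.5 give an initial score that is many orders of magnitude larger than with 0.001. Scaling the features before training (for example, dividing each by its maximum absolute value) keeps the initial predictions in a reasonable range.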