ibayer / fastFM

fastFM: A Library for Factorization Machines
http://ibayer.github.io/fastFM

OverflowError: n_iter too high in bpr.FMRecommender #147

Open rhjohnstone opened 5 years ago

rhjohnstone commented 5 years ago

When I try to set a massive number of iterations, e.g.

fm = bpr.FMRecommender(n_iter=10000000000000, init_stdev=0.1, step_size=0.01, rank=10, random_state=123)
fm.fit(X, compares)

I get the following error (raised when the fit method is called):

OverflowError: value too large to convert to int

I also tried calling the fit method twice with n_iter=100 to see if I could train in batches, but found the output was the same after the first 100 as after the second 100, so I suppose fit also resets everything?

Is the maximum possible n_iter related to the size of the training data?

And is there any way to continue training the bpr.FMRecommender after n_iter iterations have been performed?

ibayer commented 5 years ago

> Is the maximum possible n_iter related to the size of the training data?

No, n_iter is the number of individual SGD updates (not epochs). This is why even a fairly large n_iter can result in very few passes over the training data if you have a lot of samples.
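To make the update-vs-epoch distinction concrete, here is a minimal sketch of how one might pick n_iter for a target number of passes. It assumes that one SGD update consumes one row of compares, and that the OverflowError comes from n_iter being converted to a signed 32-bit C int; neither is confirmed in this thread. The helper names n_iter_for_epochs and approx_epochs are hypothetical and not part of fastFM.

```python
# Assumption: n_iter ends up in a signed 32-bit C int. The thread does not
# state the actual limit, so treat this bound as illustrative only.
C_INT_MAX = 2**31 - 1


def n_iter_for_epochs(n_pairs, n_epochs):
    """Number of single SGD updates that roughly equals n_epochs passes.

    Assumes one update uses one comparison pair (one row of `compares`).
    """
    n_iter = n_pairs * n_epochs
    if n_iter > C_INT_MAX:
        raise ValueError(
            "n_iter=%d exceeds the assumed C int limit %d" % (n_iter, C_INT_MAX)
        )
    return n_iter


def approx_epochs(n_iter, n_pairs):
    """Rough number of passes over the pairs that n_iter updates amount to."""
    return n_iter / n_pairs


if __name__ == "__main__":
    # With 50,000 pairs, n_iter=100 covers only a tiny fraction of one pass.
    print(approx_epochs(100, 50_000))
    # Twenty passes over the same data need one million updates.
    print(n_iter_for_epochs(50_000, 20))
```

Under these assumptions, n_iter=10000000000000 from the original report is far above the bound, while a value such as n_iter_for_epochs(compares.shape[0], 20) stays well within it for moderately sized data.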

> so I suppose fit also resets everything?

This is indeed true. Some solvers support fit(X, y, warm_start=True), but the SGD solver doesn't.

> And is there any way to continue training the bpr.FMRecommender after n_iter iterations have been performed?

No, see above.

A new BPR / SGD implementation is on its way to solve these issues, but it might take a while until we have it ready for release.