ExaScience / smurff

Bayesian Factorization with Side Information in C++ with Python wrapper
MIT License

could not find a solution #128

Closed jordantkohn closed 4 years ago

jordantkohn commented 4 years ago

I've successfully trained a predictor with the univariate sampler using MacauSession on my binarized matrix (values -1 and 1). But when I set univariate to False to use the multivariate sampler, the session fails with this error: warning: blcok_cg: could not find a solution in 1000 iterations: residual: [-nan(ind) -nan(ind) ... ].all() > 1e

I also tried a slightly different approach with TrainSession, and I get the same error. Do you know what could be causing this?

Thank you.

Here is my code:

## first attempt
multi_train_session = smurff.MacauSession(
                       Ytrain     = my_sparse_train,
                       Ytest      = my_sparse_test,
                       side_info  = [traintest_sideinfo_sparse, None],
                       univariate = False,
                       num_latent = 16,
                       burnin     = 5000,
                       nsamples   = 500,
                       threshold  = 0.0,
                       verbose    = 1)
# run the session that was just created (not the earlier univariate one)
multi_train_predictions = multi_train_session.run()

## second attempt
multi_train_session = smurff.TrainSession(
                       num_latent = 8, # might need to try less, more
                       burnin     = 500,
                       nsamples   = 100,
                       threshold  = 0.,
                       verbose    = 1)

multi_train_session.addTrainAndTest(my_sparse_train, my_sparse_test, smurff.ProbitNoise(0.))
multi_train_session.addSideInfo(0, traintest_sideinfo_sparse)
multi_train_predictions = multi_train_session.run()

tvandera commented 4 years ago

Give it a try with direct = True, as documented here: https://smurff.readthedocs.io/en/release-0.16/api/training.html#macausession
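For reference, a minimal sketch of the first attempt with that one change applied. The only addition is the `direct=True` keyword; as far as I understand the linked documentation, it switches the side-information solve from the iterative conjugate-gradient solver (the `block_cg` in the warning) to a direct Cholesky solve, but treat that reading as an assumption and check the docs for your release. The helper names `build_macau_kwargs` and `run_multivariate` are hypothetical, and every other value is copied from the code in the question. Whether this removes the NaN residuals will depend on the data.

```python
def build_macau_kwargs(train, test, side_info):
    """Collect the MacauSession arguments from the first attempt, plus direct=True."""
    return dict(
        Ytrain=train,
        Ytest=test,
        side_info=[side_info, None],  # side information on the first mode only
        univariate=False,             # multivariate sampler, as in the question
        direct=True,                  # the suggested change, passed explicitly
        num_latent=16,
        burnin=5000,
        nsamples=500,
        threshold=0.0,
        verbose=1,
    )


def run_multivariate(train, test, side_info):
    """Run the session; requires smurff and real sparse matrices."""
    import smurff  # imported lazily so the helper above works without smurff

    session = smurff.MacauSession(**build_macau_kwargs(train, test, side_info))
    return session.run()
```

With the variables from the question, the call would be `multi_train_predictions = run_multivariate(my_sparse_train, my_sparse_test, traintest_sideinfo_sparse)`.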

jordantkohn commented 4 years ago

thanks!