lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0
4.66k stars, 679 forks

Model prediction scores are zero for all user-item pairs #662

Open zhenliu2012 opened 1 year ago

zhenliu2012 commented 1 year ago

Hi,

When I use the lightfm.predict method to predict scores for a fairly large number of users (~500k users, ~5k items), the scores it returns are consistently zero for all user-item pairs, i.e. np.array([0.0, 0.0, ...]). However, when I predict for only a small number of users (~500 selected from the total), the scores become nonzero, i.e. np.array([-54.321, -53.298, ...]). This is the code I used to calculate the scores:

scores = model.predict(
    user_ids=np.repeat(users, n_items),
    item_ids=np.tile(items, n_users),
    user_features=user_features_mat,
    item_features=item_features_mat,
)

Here, users is an np.array containing user_ids [0, 2, 3, 4, 6 ..], and items is an np.array containing item_ids [11, 12, 34, 66, ..]. I use np.repeat and np.tile to build aligned arrays covering every user-item pair for prediction. n_users and n_items are the number of users and items, respectively.
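To make the pairing concrete, here is a minimal sketch with toy arrays (the values are made up for illustration) showing how np.repeat and np.tile line up, plus the size of the full cross product at the scale described above. The final comparison only notes that the pair count exceeds the 32-bit signed integer range; whether that is related to the zero scores is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Toy ids, chosen only for illustration.
users = np.array([0, 2, 3])
items = np.array([11, 12])

# Each user id is repeated once per item; the item list is tiled once per user.
user_ids = np.repeat(users, len(items))   # [0 0 2 2 3 3]
item_ids = np.tile(items, len(users))     # [11 12 11 12 11 12]
pairs = list(zip(user_ids.tolist(), item_ids.tolist()))

# Size of the full cross product at the scale described in the post.
n_pairs_full = 500_000 * 5_000            # 2,500,000,000 pairs
exceeds_int32 = n_pairs_full > np.iinfo(np.int32).max
```

At that scale each id array alone holds 2.5 billion elements, which is about 20 GB at 8 bytes per element, so memory pressure is worth ruling out as well.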

The reason I'm predicting scores for a large number of users is that I want to get the rank of several particular items against all other items for each selected user. I'm aware of the predict_rank method but it's very slow, so I'm trying to replicate that with the predict method, which I hope would be much faster.
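As a rough sketch of the ranking step described above, assuming the scores have already been reshaped into a (n_users, n_items) matrix: the rank of a target item for a user is the number of items that score strictly higher for that user. The function name item_ranks and its arguments are hypothetical, and this is not LightFM's own predict_rank implementation.

```python
import numpy as np

def item_ranks(score_matrix, target_cols):
    """Return 0-based ranks (0 = highest score) of the target columns per row.

    score_matrix: array of shape (n_users, n_items)
    target_cols:  column indices of the items whose rank is wanted
    """
    score_matrix = np.asarray(score_matrix)
    target_scores = score_matrix[:, target_cols]      # (n_users, n_targets)
    # For each (user, target), count items with a strictly higher score.
    higher = score_matrix[:, None, :] > target_scores[:, :, None]
    return higher.sum(axis=2)                         # (n_users, n_targets)

# Toy example: 2 users, 3 items, ranks of items in columns 0 and 2.
ranks = item_ranks([[0.1, 0.9, 0.5],
                    [3.0, 1.0, 2.0]], [0, 2])
```

The intermediate boolean array has shape (n_users, n_targets, n_items), so this is best applied to one batch of users at a time rather than to all users at once.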

Has anyone seen this kind of behavior before? Any help is much appreciated! Thanks in advance.

vkurichenko commented 1 year ago

Hi!

I just ran into the same problem. I also used np.repeat and np.tile. Have you found a solution?

sypan commented 7 months ago

Hi,

I ran into the same problem as well. Any luck finding a solution?

vkurichenko commented 7 months ago

> Hi,
>
> I ran into the same problem as well. Any luck finding a solution?

Hi! In my case, I just ended up using multiprocessing to predict the scores for each user separately.
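A minimal sketch of the same idea without multiprocessing: score the users in small batches so that no single predict call sees the full cross product. This is a sequential variant, not the commenter's actual code. The helper name predict_in_batches and the batch_size default are hypothetical; model is assumed to be any object exposing a predict(user_ids=..., item_ids=..., ...) method as used earlier in this thread.

```python
import numpy as np

def predict_in_batches(model, users, items, batch_size=1000, **predict_kwargs):
    """Yield (user_batch, scores) with scores shaped (len(user_batch), len(items)).

    Extra keyword arguments (e.g. user_features, item_features) are passed
    through to model.predict unchanged.
    """
    users = np.asarray(users, dtype=np.int32)
    items = np.asarray(items, dtype=np.int32)
    for start in range(0, len(users), batch_size):
        batch = users[start:start + batch_size]
        scores = model.predict(
            user_ids=np.repeat(batch, len(items)),   # each user once per item
            item_ids=np.tile(items, len(batch)),     # item list once per user
            **predict_kwargs,
        )
        # Rows follow the order of `batch`, columns the order of `items`.
        yield batch, np.asarray(scores).reshape(len(batch), len(items))
```

Each batch can be processed (for example, ranked) and discarded before the next one is scored, which keeps peak memory proportional to batch_size * len(items).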

sypan commented 7 months ago

Hi vkurichenko,

Thanks for replying. I was trying joblib for prediction, and it gives me ValueError: buffer source array is read-only. I may try multiprocessing now :)
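One possible reading of that error, offered as an assumption rather than a confirmed diagnosis: the worker received an array whose buffer is marked read-only, and compiled code that requests a writable buffer refuses it. The sketch below only demonstrates the NumPy side of this. The helper ensure_writable is hypothetical and returns a writable copy when the input is read-only.

```python
import numpy as np

def ensure_writable(arr):
    """Return arr itself if it is writable, otherwise a writable copy."""
    arr = np.asarray(arr)
    return arr if arr.flags.writeable else arr.copy()

# Simulate an array that arrives in a worker as read-only.
read_only = np.arange(3.0)
read_only.setflags(write=False)

fixed = ensure_writable(read_only)
fixed[0] = 5.0   # works on the copy; the read-only original is unchanged
```

joblib's Parallel also accepts max_nbytes=None, which disables its automatic memory-mapping of large arrays; whether that removes the error in this particular setup has not been verified here.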