Interpreting negative scores from lightfm.LightFM.predict()

lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.

Apache License 2.0

Interpreting negative scores from lightfm.LightFM.predict() #530

Closed: freytheviking closed this issue 4 years ago

freytheviking commented 4 years ago

I saw an explanation here about interpreting negative scores from the model.predict() method, but I wanted to clarify a few points with the experts so that everyone can see the answers as well.

I understand that the predicted scores have no meaning on their own and are only used as a means to rank. Let's say I have called lightfm.LightFM.predict() like this:

predictions = model.predict(
    user_ids=np.array([0, 0, 0, 0, 0]),
    item_ids=np.array([0, 1, 2, 5, 100]),
    item_features=feature_matrix
)

This means that I am predicting the score for user 0 on 5 items, namely 0, 1, 2, 5, 100. Let's say that my result is:

array([ 2.79359961,  2.76859665, -6.60331917, -0.56102526,  1.27920794])

Does this mean that for user 0, we would predict item 3 (the third entry, -6.60331917) to be the top recommendation, followed by item 4 (the fourth entry, -0.56102526)... and item 1 (the first entry, 2.79359961) to be the lowest recommendation?

Thanks in advance!

EthanRosenthal commented 4 years ago

Almost. Everything you said is right, but the conclusion at the end is actually the reverse -- the top recommendation is the item with the highest predicted score, not the lowest. In your case, item 1 (the first entry) would be the top recommendation, and item 3 (the third entry) would be the lowest.
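
For readers who want to turn such scores into a ranked list, here is a minimal sketch using only NumPy. It reuses the item IDs and scores quoted in the question; the variable names (`order`, `ranked_item_ids`, `ranked_scores`) are illustrative and not part of LightFM.

```python
import numpy as np

# Item IDs passed to predict() and the scores quoted in the question.
item_ids = np.array([0, 1, 2, 5, 100])
scores = np.array([2.79359961, 2.76859665, -6.60331917, -0.56102526, 1.27920794])

# np.argsort sorts ascending, so negate the scores to go from highest to lowest.
order = np.argsort(-scores)
ranked_item_ids = item_ids[order]
ranked_scores = scores[order]

for rank, (item, score) in enumerate(zip(ranked_item_ids, ranked_scores), start=1):
    print(f"rank {rank}: item ID {item} (score {score:.4f})")
```

The sign of a score plays no special role here: a negative score simply sorts below a positive one.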

freytheviking commented 4 years ago

Thanks @EthanRosenthal!

freytheviking commented 4 years ago

As a follow-up question, @EthanRosenthal, are these scores "local" to an individual user or are they "global"? For the above example, are these scores just used to rank items for user 0, or can they be compared across users?

Say I do this, and additionally rank items 3 and 4 for user 1:

predictions = model.predict(
    user_ids=np.array([0, 0, 0, 0, 0, 1, 1]),
    item_ids=np.array([0, 1, 2, 5, 100, 3, 4]),
    item_features=feature_matrix
)

and the result is this:

array([ 2.79359961,  2.76859665, -6.60331917, -0.56102526,  1.27920794,
       -0.49498647, -2.51921558])

Can I say that the recommendation made for user 0 on item 0 (2.79359961) is "stronger" than the recommendation made for user 1 on item 3 (-0.49498647)?

EthanRosenthal commented 4 years ago

The scores are local to the individual user, so unfortunately you can't compare scores between users. You're correct in what you said -- those scores are only used for ranking items for user 0.
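
Following that advice, a batched call that mixes users can be split into one ranking per user. The sketch below is an illustration with plain NumPy on the arrays quoted above; the `rankings` dictionary is a hypothetical helper structure, not something LightFM returns.

```python
import numpy as np

# Inputs and outputs of the batched predict() call quoted in the question.
user_ids = np.array([0, 0, 0, 0, 0, 1, 1])
item_ids = np.array([0, 1, 2, 5, 100, 3, 4])
scores = np.array([2.79359961, 2.76859665, -6.60331917, -0.56102526,
                   1.27920794, -0.49498647, -2.51921558])

# Rank items separately for each user, never across users.
rankings = {}
for user in np.unique(user_ids):
    mask = user_ids == user
    order = np.argsort(-scores[mask])  # highest score first
    rankings[int(user)] = item_ids[mask][order].tolist()

print(rankings)
```

Each user's list is ordered only by that user's own scores, so no score from user 0 is ever compared with a score from user 1.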

SantiDu commented 4 years ago

@EthanRosenthal And they can be used to rank users for item 0 (see https://github.com/lyst/lightfm/issues/524#issue-585492540), but not globally?

dimitry12 commented 1 year ago

Was curious about this as well. My thought process:

- The LightFM algorithm is symmetric with respect to items and users;
- correspondingly, if it provides an ordering of items for a user, it also provides an ordering of users for an item;
- given items $X$, $Y$ and users $A$, $B$ we can:
  - rank items for user $A$ into $r_X^A, r_Y^A$
  - and rank users for item $X$ into $r_A^X, r_B^X$
- $r_X^A$ must be equal to $r_A^X$ because they are just matmuls of the same user/item embeddings

With that, the ranking-coefficients must provide ordering over all user-item pairs (aka interactions). Why not?
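
To make this argument concrete, here is a rough sketch on synthetic data. It assumes the score for a pair is the dot product of the user and item embeddings plus a user bias and an item bias; the random arrays stand in for a trained model, and all names are made up for illustration. It shows only that the three orderings are mechanically well-defined, not whether the ordering over all pairs is meaningful, which is the open question here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 3, 4, 5

# Synthetic stand-ins for learned parameters (assumption: not a trained model).
user_emb = rng.normal(size=(n_users, dim))
item_emb = rng.normal(size=(n_items, dim))
user_bias = rng.normal(size=n_users)
item_bias = rng.normal(size=n_items)

# Assumed scoring form: scores[u, i] = user_emb[u] . item_emb[i] + b_u + b_i
scores = user_emb @ item_emb.T + user_bias[:, None] + item_bias[None, :]

# The same matrix gives both views: a row ranks items for a user,
# a column ranks users for an item.
items_for_user_0 = np.argsort(-scores[0, :])
users_for_item_0 = np.argsort(-scores[:, 0])

# Flattening the matrix gives one ordering over all (user, item) pairs.
flat_order = np.argsort(-scores, axis=None)
all_pairs = [tuple(int(k) for k in np.unravel_index(idx, scores.shape))
             for idx in flat_order]

print(items_for_user_0, users_for_item_0, all_pairs[:3])
```

Under this assumed form, a user's bias shifts all of that user's scores by the same constant, so it cannot change the order of items within a row, but it does enter any comparison between rows.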