lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0

How to handle very popular items? #395

Open g-eorge opened 5 years ago

g-eorge commented 5 years ago

Hello! Thanks for making this great framework available.

In my scenario, I have user-item interaction data which I use to create implicit ratings, currently without user or item features. The user-item interaction data is very sparse and there are also some items that are extremely popular relative to others, so the predictions generally look like a list of most popular items.

So I am wondering what approach you would suggest to create more diversity and surface long-tail items in the predictions?

It seems to me a naive approach could be to weight the implicit ratings relative to the global popularity of the item. But maybe including user and item features circumvents this whole popularity bias problem?

Thanks in advance!

maciejkula commented 5 years ago

This is a common characteristic of recommender systems.

I think your initial intuition of weighting your observations is on the right track: you can think of it as applying inverse propensity weights.

(Slight nitpick with terminology: this is implicit feedback data, but not really implicit ratings. I think it's best not to think in terms of ratings at all.)
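A minimal sketch of the inverse propensity weighting idea, using toy data and only numpy/scipy (the matrix values and shapes here are made up for illustration):

```python
import numpy as np
from scipy.sparse import coo_matrix

# toy implicit-feedback matrix: rows = users, columns = items
interactions = coo_matrix(np.array([
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
], dtype=np.float32))

# propensity of an item ~ fraction of users exposed to it
n_users = interactions.shape[0]
item_propensity = np.asarray(interactions.sum(axis=0)).ravel() / n_users

# weight each observed interaction by the inverse of its item's propensity
weights = interactions.copy()
weights.data = 1.0 / item_propensity[weights.col]
```

A weight matrix like this can then be passed to `LightFM.fit(interactions, sample_weight=weights)`, so each interaction with a popular item contributes less to the loss.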

g-eorge commented 5 years ago

Thanks for the insight!

SimonCW commented 5 years ago

@g-eorge may I ask whether you have experimented with the inverse propensity weighting and what your experiences are?

andodet commented 5 years ago

@g-eorge I am in a very similar situation (sparse matrix with some extremely popular items), this is leading the model to provide recommendations heavily centered around those few popular items.

I was wondering how you approached weighting your observations and whether it actually improved the variety of your recommendations.

Thanks in advance.

g-eorge commented 5 years ago

@SimonCW @andodet unfortunately I haven't had time to experiment with any other ideas for dealing with the popular items yet. However, in my situation I have observed that WARP loss tends towards more popular items compared to BPR. This seems to make sense given that BPR optimises AUC whereas WARP optimises precision@k. For now we have decided that we are willing to accept more popular items in the results, as long as they are relevant for the user. Initial qualitative analysis suggests the recommendations are mostly relevant, so we are deferring the problem for now. Sorry I don't have any more concrete advice to share; I will let you know if I find anything.

SantiDu commented 4 years ago

I implemented the inverse popularity weight for positive interactions by "counting the portion of the users who have been exposed to item ... the score is fixed across users" (Liang 2016). I'm fairly confident this is the correct approach: after the implementation the recommendations are more diverse, although precision@k seems to drop when k is small.

If the model still predicts similar (if not almost identical) items for every user, possible remedies may lie in the user/item feature matrices (https://github.com/lyst/lightfm/issues/497#issuecomment-601268061, https://github.com/lyst/lightfm/issues/320#issuecomment-401329449), in not using the logistic loss, in dropping the item bias, etc.

rohit-u2 commented 4 years ago

@SantiDu - Can you please share an example of how you implemented the Inverse popularity weight?

SantiDu commented 4 years ago
@robonidos Sure. Say the interaction matrix is:

          item1  item2  item3
    user1     1      0      1
    user2     0      0      1
    user3     0      0      0

The portion of the users who have been exposed to item3 is 2/3, whose inverse is 3/2; for item1 it is inv(1/3) = 3. The weights for the observed interactions are then:

          item1  item2  item3
    user1     3      0    3/2
    user2     0      0    3/2
    user3     0      0      0

The weights can't be too large so I scale them.

rohit-u2 commented 4 years ago

@SantiDu Thank you for the explanation. I have a binary interaction data between users and items. In this case, would you recommend scaling the weights between 0 to 1?

SantiDu commented 4 years ago

@robonidos I'm not sure, but I'd guess a model with smaller weights takes more epochs to train, and zero weights don't make sense.

rohit-u2 commented 4 years ago

Inverse propensity weight calculation for ratings/binary interactions, to handle very popular items:

# count distinct users before aggregating away the user_id column
n_users = df_user_item_interaction['user_id'].nunique()

# popularity of each item = summed interaction value
item_totals = df_user_item_interaction.groupby('Item')['User_Rating'].sum().reset_index()

# deriving the inverse propensity weight value
item_totals['rating_ipw'] = n_users / item_totals['User_Rating']
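A self-contained toy run of the snippet above (column names taken from it, the data is made up):

```python
import pandas as pd

# hypothetical interaction log: three users bought i1, one user bought i3
df_user_item_interaction = pd.DataFrame({
    'user_id':     ['u1', 'u2', 'u3', 'u1'],
    'Item':        ['i1', 'i1', 'i1', 'i3'],
    'User_Rating': [1, 1, 1, 1],
})

# distinct users, counted before the groupby drops the user_id column
n_users = df_user_item_interaction['user_id'].nunique()

# popularity of each item = summed interaction value
item_totals = df_user_item_interaction.groupby('Item')['User_Rating'].sum().reset_index()

# inverse propensity weight: popular items get smaller weights
item_totals['rating_ipw'] = n_users / item_totals['User_Rating']
```

Here the ubiquitous item i1 ends up with a weight of 1.0, while the rarely bought i3 gets 3.0.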

mike-chesnokov commented 4 years ago

I faced the problem of "extremely popular items" in my predictions too and made a workaround as suggested in this comment - simply not using the item_biases from the fitted model in predictions.

My case:

* WARP loss
* sparse binary user-item interactions matrix (0 or 1 in it)
* no user, no item features
* predicted recommendations with the dot product of user and item embeddings and biases (concatenated biases and embeddings, as in the code below)

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors with biases for dot product
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# combine user_factors with biases for dot product  
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

scores = user_factors.dot(item_factors.T)

This gave me mostly popular items among the top-scoring items for my users.

Simply not using item_biases in the dot product helped to get more diverse recommendations among the top-scoring items:

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors only with user biases for dot product
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)

# combine user_factors with biases for dot product  
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)

scores = user_factors.dot(item_factors.T)

Maybe this trick will be helpful for others =)
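To illustrate why dropping item_biases changes the ranking, here is a toy sketch with made-up numbers (not from a real fitted model):

```python
import numpy as np

# hypothetical fitted representations: 1 user, 3 items, 2 latent dims
item_factors = np.array([[0.1, 0.2],
                         [0.4, 0.1],
                         [0.2, 0.3]])
item_biases  = np.array([2.0, 0.0, 0.1])  # item 0 is the "popular" one
user_factors = np.array([[0.3, 0.9]])

scores_with_bias = user_factors @ item_factors.T + item_biases
scores_no_bias   = user_factors @ item_factors.T

# item 0's large bias dominates the biased ranking,
# while the bias-free ranking follows the latent factors alone
```

With the bias term, item 0 wins regardless of the user's latent taste; without it, item 2 (the best factor match) comes out on top.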

andodet commented 4 years ago

@mike-chesnokov That sounds like a good approach. Do you think that simply multiplying the biases by some 'dampening factor' would work as well? I am thinking of producing recommendations at varying bias levels and pick the ones that look more sensible.

mike-chesnokov commented 4 years ago

@andodet I tried "discount" factor for item_biases for my case and got different result from top recommendations.

So to my mind "discount" factor could work well, but a small minus - using it you get another model parameter to tune.

andodet commented 4 years ago

@mike-chesnokov It would be indeed another parameter to tune. On the other hand I'd probably use it in the exploration phase (e.g in a notebook) as the final call has to be done visually inspecting recommendations. I don't think the approach lends itself to automation too well but maybe I am missing something.

amrakm commented 3 years ago

@mike-chesnokov can I achieve the same results by setting item_biases to zeros like this:

model.item_biases = np.zeros_like(model.item_biases)

then use the LightFM predict method?

model.predict()

mike-chesnokov commented 3 years ago

@amrakm I didn't use this method, but you can calculate predictions with the 2 different methods (model.predict() and the dot product) and compare - both should give you the same results.

dpalbrecht commented 2 years ago

@mike-chesnokov I realize I'm a bit late here, but I have to ask why you're incorporating user biases in the proposed prediction routine. If I'm trying to rank items per user, the user biases don't matter: they shift every item's score for that user by the same amount. This does change the absolute value of the scores, but as I've read elsewhere the scores aren't so meaningful anyway.

I propose the following instead, as a more efficient method if you're just looking for ranked items per user:

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors with biases for dot product
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# add ones to user_factors for item bias
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

scores = user_factors.dot(item_factors.T)
item_inds = np.argsort(-scores)
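A small usage sketch of that ranking step with hypothetical scores, keeping only the top k items per user:

```python
import numpy as np

# hypothetical user x item score matrix (2 users, 3 items)
scores = np.array([[0.2, 0.9, 0.5],
                   [0.7, 0.1, 0.3]])

k = 2
# argsort of the negated scores orders items from best to worst per row
topk = np.argsort(-scores, axis=1)[:, :k]
# topk[0] is [1, 2]: items 1 and 2 score highest for user 0
```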

Phylake1337 commented 1 year ago

Since this thread gave me great options for addressing popularity bias, I will share my experience trying out one of the solutions mentioned here: the inverse popularity weight elaborated above.

In my use case the interactions are simply 1 or 0 (user bought the item or not), and some items are far more popular than others.

After I applied the weighting, I found that some weights were too large (for the items which are not popular), so it's essential to scale the values back to smaller numbers. But scaling the weights between 0 and 1 has 2 drawbacks:

  1. most of the weights are too small, so the training takes forever (300 epochs in my case was not enough to fit the data);
  2. some of the weights were 0, which is not sensible.

So I scaled the values between (0.001, 1), and it worked out very well for me. You can also apply maximum normalization (divide by the maximum value); this way zero weights are avoided too.
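The two scalings described above can be sketched as follows (the raw weights are made-up numbers):

```python
import numpy as np

raw = np.array([30.0, 3.0, 1.5, 1.2])  # hypothetical inverse-popularity weights

# min-max scale into [0.001, 1] so no weight collapses to zero
lo, hi = 0.001, 1.0
scaled = lo + (raw - raw.min()) * (hi - lo) / (raw.max() - raw.min())

# alternative: divide by the maximum; preserves ratios and avoids zeros
max_norm = raw / raw.max()
```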

dataversenomad commented 1 year ago

> I faced the problem of "extremely popular items" in my predictions too and made a workaround as suggested in this comment - simply not using the item_biases from the fitted model in predictions.
>
> My case:
>
> * WARP loss
> * sparse binary user-item interactions matrix (0 or 1 in it)
> * no user, no item features
> * predicted recommendations with the dot product of user and item embeddings and biases (as in the code above)

hi @Phylake1337 that worked for me 100%. However I am facing an issue: what about users that are new to the model? Did you face the same situation using this approach? Any help would be much appreciated (y)

dataversenomad commented 1 year ago

Hi @mike-chesnokov, this approach worked 100% for me, thank you. Currently I am trying to use the same approach for users who have not yet provided a rating. Have you faced the same situation using this approach, and were you able to make recommendations for new clients?