Open g-eorge opened 5 years ago
This is a common characteristic of recommender systems.
I think your initial intuition of weighting your observations is on the right track: you can think of it as applying inverse propensity weights.
(Slight nitpick with terminology: this is implicit feedback data, but not really implicit ratings. I think it's best not to think in terms of ratings at all.)
Thanks for the insight!
@g-eorge may I ask whether you have experimented with the inverse propensity weighting and what your experiences are?
@g-eorge I am in a very similar situation (sparse matrix with some extremely popular items), this is leading the model to provide recommendations heavily centered around those few popular items.
I was wondering how did you approach weighting your observations and whether it actually improved the variety of your recommendations.
Thanks in advance.
@SimonCW @andodet unfortunately I haven't had time to experiment with any other ideas for dealing with the popular items yet. However, in my situation I have observed that WARP loss tends towards more popular items compared to BPR. This seems to make sense given that BPR is optimising AUC whereas WARP is optimising precision@k. For now we have decided that we are willing to accept more popular items in the results, so long as they are relevant for the user. Initial qualitative analysis suggests the recommendations are mostly relevant, so we are deferring the problem for now. Sorry I don't have any more concrete advice to share; I will let you know if I find anything.
I implemented the inverse popularity weight for positive interactions by "counting the portion of the users who have been exposed to item ... the score is fixed across users" (Liang 2016). I'm fairly confident this is the correct approach: after the implementation the recommendations are more diverse, although precision@k dropped when k is small.
If the model still predicts similar (if not almost identical) items for every user, possible remedies may lie in the user/item feature matrices (https://github.com/lyst/lightfm/issues/497#issuecomment-601268061, https://github.com/lyst/lightfm/issues/320#issuecomment-401329449), not using logistic loss and item bias, etc.
@SantiDu - Can you please share an example of how you implemented the Inverse popularity weight?
@robonidos Sure:

|       | item1 | item2 | item3 |
| ----- | ----- | ----- | ----- |
| user1 | 1     | 0     | 1     |
| user2 | 0     | 0     | 1     |
| user3 | 0     | 0     | 0     |
The portion of the users who have been exposed to item3 is 2/3, whose inverse is 3/2; for item1 it is inv(1/3) = 3. The weights would be:

|       | item1 | item2 | item3 |
| ----- | ----- | ----- | ----- |
| user1 | 3     | 0     | 3/2   |
| user2 | 0     | 0     | 3/2   |
| user3 | 0     | 0     | 0     |
The weights can't be too large, so I scale them.
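For concreteness, here is a minimal numpy sketch of that weighting on the toy matrix above (variable names are illustrative, not from LightFM): the exposure fraction is computed per item, inverted where nonzero, and applied only to positive interactions.

```python
import numpy as np

# toy binary interaction matrix (users x items), matching the example above
interactions = np.array([
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
])

n_users = interactions.shape[0]
# fraction of users exposed to each item: [1/3, 0, 2/3]
exposure = interactions.sum(axis=0) / n_users
# inverse popularity, leaving never-seen items at weight 0
inv_pop = np.divide(1.0, exposure, out=np.zeros_like(exposure), where=exposure > 0)
# weight only the positive interactions
sample_weight = interactions * inv_pop
```

A matrix like this can then be passed to the model's `fit` via its `sample_weight` argument (after scaling, as discussed above).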
@SantiDu Thank you for the explanation. I have a binary interaction data between users and items. In this case, would you recommend scaling the weights between 0 to 1?
@robonidos I'm not sure, but I guess a model with smaller weights takes more epochs to train, and zero weights don't make sense.
```python
# total users must be counted before aggregating interactions per item
n_users = df_user_item_interaction['user_id'].nunique()

df_item_pop = df_user_item_interaction.groupby('Item')['User_Rating'].sum().reset_index()
df_item_pop['rating_ipw'] = n_users / df_item_pop['User_Rating']
```
I faced the problem of "extremely popular items" in my predictions too and made a workaround as suggested in this comment - simply not using the `item_biases` from the fitted model in predictions.
My case:

* WARP loss
* sparse binary user-item interactions matrix (0 or 1 in it)
* no user, no item features
* predicted recommendations with dot product of user and item embeddings and biases (like in the code below - concatenated biases and embeddings)
```python
import numpy as np

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors with biases for dot product
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# combine user_factors with biases for dot product
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

scores = user_factors.dot(item_factors.T)
```
This gave me the most popular items among the top-score items for my users. Simply not using `item_biases` in the dot product helped to get more diverse recommendations among the top-score items:
```python
import numpy as np

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors only with a ones column (item biases are dropped)
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)

# combine user_factors with user biases for dot product
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)

scores = user_factors.dot(item_factors.T)
```
Maybe this trick will be helpful for others =)
@mike-chesnokov That sounds like a good approach. Do you think that simply multiplying the biases by some 'dampening factor' would work as well? I am thinking of producing recommendations at varying bias levels and pick the ones that look more sensible.
@andodet I tried a "discount" factor for `item_biases` in my case and got different results in the top recommendations. So to my mind a "discount" factor could work well, but a small minus is that you get another model parameter to tune.
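A sketch of that discount idea, with random arrays standing in for the fitted representations (`damp` is the hypothetical new hyperparameter; `damp=1.0` reproduces the usual scores, `damp=0.0` drops item biases entirely):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 4, 6, 8

# stand-ins for model.get_user_representations() / model.get_item_representations()
user_factors = rng.normal(size=(n_users, dim))
item_factors = rng.normal(size=(n_items, dim))
item_biases = rng.normal(size=n_items)

damp = 0.5  # dampening factor to tune
scores = user_factors @ item_factors.T + damp * item_biases
```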
@mike-chesnokov It would indeed be another parameter to tune. On the other hand, I'd probably use it in the exploration phase (e.g. in a notebook), as the final call has to be made by visually inspecting recommendations. I don't think the approach lends itself to automation too well, but maybe I am missing something.
@mike-chesnokov can I achieve the same results by setting `item_biases` to zeros like this:

```python
model.item_biases = np.zeros_like(model.item_biases)
```

and then use the LightFM `model.predict()` method?
@amrakm I didn't use this method, but you can calculate predictions with the 2 different methods (`model.predict()` and the dot product) and compare - both should give you equal results.
@mike-chesnokov I realize I'm a bit late here, but I have to ask why you're incorporating user biases in the proposed prediction routine? If I'm trying to predict items per user, then the user biases don't matter in creating a new ranking of items. This does change the absolute value of the scores but as I've read elsewhere the scores aren't so meaningful.
I propose the following instead, as a more efficient method if you're just looking for ranked items per user:
```python
import numpy as np

# load latent representations
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# append item biases as an extra column of item_factors
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# append a column of ones to user_factors so the dot product picks up the item bias
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

scores = user_factors.dot(item_factors.T)
item_inds = np.argsort(-scores)
```
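The claim that user biases don't change each user's ranking can be checked numerically with stand-in arrays (random data here, not model output): adding a per-user constant shifts all of that user's scores equally, so the `argsort` order is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, dim = 3, 5, 4

# stand-ins for fitted representations
user_factors = rng.normal(size=(n_users, dim))
item_factors = rng.normal(size=(n_items, dim))
user_biases = rng.normal(size=n_users)
item_biases = rng.normal(size=n_items)

base = user_factors @ item_factors.T + item_biases  # item biases only
full = base + user_biases[:, None]                  # user biases added per row

# per-user item rankings are identical with or without user biases
assert np.array_equal(np.argsort(-base), np.argsort(-full))
```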
Since this thread gave me great options for addressing popularity bias, I will share my experience trying out one of the solutions mentioned here: the inverse popularity weight elaborated above.
In my use case the interactions are simply 1 or 0 (the user bought the item or not), and some items are far more popular than others.
After I applied the weighting, I found that some weights were too large (for the items which are not popular), so it is essential to scale the values back to smaller numbers. But scaling the weights between 0 and 1 has 2 drawbacks: the smallest weight (for the most popular items) becomes exactly 0, which erases those interactions, and very small weights slow training down.
So I scaled the values between (0.001, 1), and it worked out very well for me. You can also apply maximum normalization (divide by the maximum value); that way zero weights are avoided too.
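A sketch of the two scalings mentioned, on illustrative weight values (`lo` is the chosen floor):

```python
import numpy as np

weights = np.array([30.0, 3.0, 1.5, 1.0])  # raw inverse-popularity weights (example values)

# min-max scale into [lo, 1] so the smallest weight never collapses to zero
lo = 0.001
scaled = lo + (1.0 - lo) * (weights - weights.min()) / (weights.max() - weights.min())

# alternative: maximum normalization, which also keeps nonzero weights nonzero
max_norm = weights / weights.max()
```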
hi @Phylake1337, that worked for me 100%. However, I am facing an issue: what about users that are new to the model? Did you face the same situation using this approach? Any help would be appreciated (y)
Hi @mike-chesnokov, this approach worked 100% for me, thank you. Currently I am trying to use the same approach for users who have not yet provided a rating. Have you faced the same situation, and were you able to make recommendations for new clients?
Hello! Thanks for making this great framework available.
In my scenario, I have user-item interaction data which I use to create implicit ratings, currently without user or item features. The user-item interaction data is very sparse and there are also some items that are extremely popular relative to others, so the predictions generally look like a list of most popular items.
So I am wondering: what approach would you suggest to get more diversity and long-tail items into the predictions?
It seems to me a naive approach could be to weight the implicit ratings relative to the global popularity of the item. But maybe including user and item features circumvents this whole popularity bias problem?
Thanks in advance!