Closed filipwastberg closed 5 years ago
So I figured this out. It is pretty straightforward. Since we use the encoded item variable iid from the item_encoding data.frame when modelling, the items_exclude vector needs to contain iid values.
For example, I wanted only these artists to be recommended:
on_sale <- c("the killers", "red hot chili peppers", "bloc party", "nofx", "the smiths", "dean martin", "madonna", "belle and sebastian", "britney spears", "spiritualized", "coldplay", "u2", "in flames", "the smashing pumpkins",
"radiohead", "morrissey", "rush", "kylie minogue", "kate bush", "lady gaga")
I create the items_exclude vector from item_encoding:
items_exclude <- item_encoding[!(artist_name %in% on_sale)]
items_exclude <- items_exclude$iid
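The same idea, building the exclusion vector as the complement of the on-sale set, can be checked on a toy table before running it on the real data. This is only a sketch in base R: the item_encoding values below are made up for illustration, and the toy iid values are chosen to equal the row numbers so they can be used directly as indices.

```r
# Toy stand-in for item_encoding (hypothetical values, not the lastfm data)
item_encoding <- data.frame(
  iid = 1:6,
  artist_name = c("radiohead", "u2", "abba", "rush", "metallica", "madonna"),
  stringsAsFactors = FALSE
)
on_sale <- c("radiohead", "u2", "rush", "madonna")

# Everything that is NOT on sale goes into the exclusion vector, as encoded ids
items_exclude <- item_encoding$iid[!(item_encoding$artist_name %in% on_sale)]

# Sanity check 1: the ids that remain map back to exactly the on-sale artists
allowed <- setdiff(item_encoding$iid, items_exclude)
stopifnot(setequal(item_encoding$artist_name[allowed], on_sale))

# Sanity check 2: every on-sale name exists in the encoding table
# (a misspelled name would otherwise silently never be recommended)
missing_names <- setdiff(on_sale, item_encoding$artist_name)
stopifnot(length(missing_names) == 0)
```

The second check is worth keeping in real code, since a name in on_sale that does not match the encoding table simply never appears in the results.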
Then we can predict artists for a user and save it to a data.frame:
new_user_predictions <- model$predict(X_cv_future[2:2, , drop = FALSE],
not_recommend = NULL,
items_exclude = items_exclude,
k = 20)
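# The first element of dimnames (the row names) holds the user ids
user_id <- attr(new_user_predictions, "dimnames")
# Scores are stored as an attribute of the prediction matrix
scores <- as.data.frame(attr(new_user_predictions, "scores"))
scores$user_id <- user_id[[1]]
scores <- as.data.table(melt(scores))[, .(user_id, score = value)]
# The recommended item ids (here, artist names) are stored in a matching attribute
artist <- as.data.table(melt(attr(new_user_predictions, "ids")))[
  , .(user_id = Var1, artist_name = value)]
artist <- artist[, artist_name := as.character(artist_name)]
# Combine both tables and sort by descending score within each user
export <- cbind(scores, artist)[
  order(user_id, -score), .(user_id, artist_name, score)]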
export
user_id artist_name score
1: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc bloc party 1.0156563
2: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc radiohead 0.9866832
3: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc belle and sebastian 0.9815670
4: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc the smiths 0.9721361
5: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc the killers 0.9537831
6: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc coldplay 0.9488346
7: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc the smashing pumpkins 0.9475210
8: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc morrissey 0.9432581
9: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc red hot chili peppers 0.8897018
10: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc u2 0.8870862
11: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc madonna 0.7219695
12: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc nofx 0.7113644
13: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc britney spears 0.6474762
14: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc spiritualized 0.6349998
15: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc kate bush 0.6251492
16: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc lady gaga 0.5709979
17: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc kylie minogue 0.5597064
18: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc rush 0.4292514
19: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc in flames 0.4156308
20: 4b5ffa7d5485294b81d3c965efaa613f6925c6cc dean martin 0.2178573
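For more than one user, the reshaping can also be done without melt. The sketch below uses base R only and a mock object that imitates the layout used above (a "scores" attribute and an "ids" attribute whose row names are the user ids). The mock values and the names preds, ids_mat, and scores_mat are hypothetical; this is not actual model output.

```r
# Mock prediction object for two users and k = 3 (hypothetical values)
ids_mat <- matrix(
  c("radiohead", "u2",
    "coldplay",  "rush",
    "madonna",   "nofx"),
  nrow = 2, dimnames = list(c("user_a", "user_b"), NULL)
)
scores_mat <- matrix(
  c(0.98, 0.91,
    0.95, 0.60,
    0.72, 0.41),
  nrow = 2
)
preds <- matrix(0L, nrow = 2, ncol = 3)
attr(preds, "ids") <- ids_mat
attr(preds, "scores") <- scores_mat

ids <- attr(preds, "ids")
sc  <- attr(preds, "scores")

# as.vector() unrolls a matrix column by column, so users vary fastest
export <- data.frame(
  user_id     = rep(rownames(ids), times = ncol(ids)),
  artist_name = as.vector(ids),
  score       = as.vector(sc),
  stringsAsFactors = FALSE
)
export <- export[order(export$user_id, -export$score), ]
rownames(export) <- NULL
export
```

Because the user id is repeated for every column of the matrices, the same code handles one user or many without change.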
Hi!
Thanks for a really impressive package.
In my world there are two common scenarios when building recommendation systems. You either want to recommend products that a customer has never liked (or bought) from your whole catalogue, or you want to recommend products from a subset of the catalogue, e.g. products that are discounted. Most implementations of collaborative filtering focus on the first scenario. My question is how to use the items_exclude argument to tackle the second scenario. This is somewhat related to a previous issue.
For instance, say that we have 60 artists in the lastfm dataset whose albums are on sale and that we want to recommend.
Example code from: http://dsnotes.com/post/2017-06-28-matrix-factorization-for-recommender-systems-part-2/
Here I'll sample 60 artists "on sale" and create a table of items to exclude from the predictions.
Below is some data manipulation to put the data in a sparse matrix.
Here we fit the model.
Now, I want to recommend only the artists that are on sale, so I pass the excluded artists to the items_exclude argument.
However, these recommendations are not the ones on sale?
I suppose this would be clearer to me with a vignette, which I can see is on its way. In the meantime, how should one use the items_exclude argument?
Furthermore, say we want to maximize the number of recommendations here, i.e. set k = 60, would that work for multiple users?
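To make the subset scenario concrete, here is a generic base R illustration of what "exclude items, then take the top k per user" means on a plain score matrix. It is a sketch of the masking idea only, not the package's internals, and the score values and item names are made up. In this sketch k can be at most the number of items left after exclusion, so with 60 items on sale, k = 60 would return all of them for each user, ordered by score.

```r
set.seed(1)
# Hypothetical user x item score matrix for two users and six items
scores <- matrix(
  runif(2 * 6), nrow = 2,
  dimnames = list(c("user_a", "user_b"), paste0("item", 1:6))
)
items_exclude <- c(2L, 5L)  # column indices of items that must not be recommended
k <- 3

# Mask excluded items so they can never rank in the top k
masked <- scores
masked[, items_exclude] <- -Inf

n_allowed <- ncol(scores) - length(items_exclude)
stopifnot(k <= n_allowed)   # cannot return more items than remain

# Top-k item names per user, one row per user
top_k <- do.call(rbind, lapply(rownames(masked), function(u) {
  colnames(masked)[order(masked[u, ], decreasing = TRUE)[seq_len(k)]]
}))
rownames(top_k) <- rownames(masked)
top_k
```

Each row of top_k is computed independently, so the same mask applies to every user in the batch.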