Closed (shuhcl closed 1 year ago)
Hi, it looks like your code creates some users with 0 ratings. I will add a better error message to recommenderlab. To fix the problem, you need to remove users without any ratings:
> # simulate matrix with 2000 users and 100 movies
> m <- matrix(nrow = 2000, ncol = 100)
>
> # simulated ratings (5% of the data)
> m[sample.int(100 * 2000, 10000)] <- ceiling(runif(10000, 0, 5))
>
> # remove users with no ratings
> (noratings <- which(rowSums(!is.na(m)) == 0))
[1] 88 233 381 603 641 747 1063 1165 1433 1437 1688
> m <- m[-noratings, ]
>
> # convert into a realRatingMatrix
> r <- as(m, "realRatingMatrix")
>
> # UBCF recommender
> UB.Rec <- Recommender(r, method = "UBCF")
>
> pred <- predict(UB.Rec, r, type = "ratings")
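One caveat with the `m[-noratings, ]` idiom above: if no user happens to be without ratings, `which()` returns an empty vector, and negative indexing with an empty vector selects no rows at all instead of keeping every row. The following base R sketch (toy data, not from the thread) shows the effect and a logical-index alternative that behaves the same in both cases:

```r
# Toy matrix: 3 users x 2 items; every user has at least one rating
m <- matrix(c(1, NA, 3, NA, 2, NA), nrow = 3)

# which() finds no users without ratings, so this is integer(0)
noratings <- which(rowSums(!is.na(m)) == 0)

# -integer(0) is still integer(0): this selects NO rows
bad <- m[-noratings, , drop = FALSE]

# Logical indexing keeps exactly the users that have ratings
keep <- rowSums(!is.na(m)) > 0
good <- m[keep, , drop = FALSE]

cat("rows after negative indexing:", nrow(bad), "\n")
cat("rows after logical indexing: ", nrow(good), "\n")
```

With random data as sparse as in the simulation, the empty case is unlikely, but the logical form avoids having to think about it.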
Hi Michael, Thanks for your suggestion. In a large sparse matrix, most users rate only a few products. When using cross-validation to evaluate models, it is very likely that the training data will include some users who do not have any ratings. Is there a way to automatically exclude those users in evaluationScheme()?
library(recommenderlab)
# simulate matrix with 2000 users and 100 movies
m <- matrix(nrow = 2000, ncol = 100)
# simulated ratings (5% of the data)
m[sample.int(100 * 2000, 10000)] <- ceiling(runif(10000, 0, 5))
# remove users with no ratings (logical index, safe even if there are none)
noratings <- rowSums(!is.na(m)) == 0
m <- m[!noratings, ]
# convert into a realRatingMatrix
r <- as(m, "realRatingMatrix")
# UBCF recommender
UB.Rec <- Recommender(r, method = "UBCF")
pred <- predict(UB.Rec, r, type = "ratings")
# evaluation
eval_sets <- evaluationScheme(r,
                              method = "cross-validation",
                              k = 4,
                              given = 1)
eval_result <- evaluate(eval_sets, method = "UBCF")
# users with no ratings in the known data
known <- getData(eval_sets, "known")
which(rowCounts(known) == 0)
Thank you! This can be a problem, so I am reopening this issue. I think in my examples I make sure that I only use users with more than given items. Since you use given = 1, you would need to make sure that all users have at least 2 ratings. Here is the changed code that should work:
library(recommenderlab)
# simulate matrix with 2000 users and 100 movies
m <- matrix(nrow = 2000, ncol = 100)
# simulated ratings (5% of the data)
m[sample.int(100 * 2000, 10000)] <- ceiling(runif(10000, 0, 5))
### FIXME: needs dimnames
dimnames(m) <- list(seq(nrow(m)), seq(ncol(m)))
### FIXME: check number of ratings
# remove users with fewer than 2 ratings (each user needs more than given = 1)
not_enough_ratings <- rowSums(!is.na(m)) < 2
m <- m[!not_enough_ratings, ]
# convert into a realRatingMatrix
r <- as(m, "realRatingMatrix")
# UBCF recommender
UB.Rec <- Recommender(r, method = "UBCF")
pred <- predict(UB.Rec, r, type = "ratings")
# evaluation
eval_sets <- evaluationScheme(r,
                              method = "cross-validation",
                              k = 4,
                              given = 1,
                              goodRating = 3)
eval_result <- evaluate(eval_sets, method = "UBCF")
# users with no ratings in the known data
known <- getData(eval_sets, "known")
which(rowCounts(known) == 0)
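To illustrate why each user needs more than given ratings, here is a small base R sketch on a plain matrix. It is not recommenderlab's implementation: `split_user()` is a hypothetical helper that mimics a given-x split for a single user, and the toy data and seed are assumptions for the example. A user with exactly given ratings would end up with nothing in the held-out part, and a user with none would have nothing in the known part either.

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible
given <- 1

# Toy data: 4 users x 5 items; u3 has no ratings, u2 has exactly one
m <- matrix(NA_real_, nrow = 4, ncol = 5,
            dimnames = list(paste0("u", 1:4), paste0("i", 1:5)))
m["u1", c("i1", "i2", "i3")] <- c(5, 3, 4)
m["u2", "i4"] <- 2
m["u4", c("i2", "i5")] <- c(1, 4)

# Keep only users with more than `given` ratings
n_ratings <- rowSums(!is.na(m))
m_ok <- m[n_ratings > given, , drop = FALSE]
print(rownames(m_ok))

# Hypothetical helper: keep `given` randomly chosen ratings as "known",
# hold the remaining ratings out as "unknown"
split_user <- function(ratings, given) {
  rated <- which(!is.na(ratings))
  known_idx <- rated[sample.int(length(rated), given)]
  known <- unknown <- ratings
  known[setdiff(seq_along(ratings), known_idx)] <- NA
  unknown[known_idx] <- NA
  list(known = known, unknown = unknown)
}

s <- split_user(m_ok["u4", ], given)
cat("known ratings:", sum(!is.na(s$known)),
    " held-out ratings:", sum(!is.na(s$unknown)), "\n")
```

On a realRatingMatrix, `rowCounts()` gives the same per-user counts as `rowSums(!is.na(m))` does on the plain matrix here.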
I will work on the code to check for this and either fix the issue by dropping users with not enough ratings or producing a better error message.
I have updated the code and it should now automatically remove users with not enough ratings.
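For readers on an older version, or who prefer an explicit failure over silent dropping, a pre-flight check along these lines could be run on the plain matrix before converting it. This is only a sketch: `check_given()` is a hypothetical helper, not a recommenderlab function, and it does not reflect how the package itself performs the check.

```r
# Hypothetical helper (not part of recommenderlab): stop with a readable
# message if any user has `given` or fewer ratings
check_given <- function(m, given) {
  n_ratings <- rowSums(!is.na(m))
  too_few <- which(n_ratings <= given)
  if (length(too_few) > 0) {
    stop(sprintf(
      "%d user(s) have %d or fewer ratings (rows: %s); each user needs more than given = %d ratings.",
      length(too_few), given, paste(too_few, collapse = ", "), given
    ))
  }
  invisible(TRUE)
}

# Toy matrix: user 1 has three ratings, user 2 has none
m <- matrix(c(5, NA, 3, NA, 4, NA), nrow = 2)

# Capture the error message instead of aborting the script
msg <- tryCatch(check_given(m, given = 1),
                error = function(e) conditionMessage(e))
print(msg)

# Passes once the offending user is removed
ok <- check_given(m[1, , drop = FALSE], given = 1)
```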
Hello Michael, Thank you for this amazing package! There seems to be a bug in the sparse matrix multiplication within the predict() function. I'm working on the code below, which keeps reporting an error in the predict() function. Any ideas about how to deal with it?
sessioninfo