mhahsler / recommenderlab

recommenderlab - Lab for Developing and Testing Recommender Algorithms - R package

Predicting known ratings when manually testing performance/evaluating model #27

Closed: MounirHader closed this issue 5 years ago

MounirHader commented 5 years ago

Hi @mhahsler, I want to use my trained Recommender to make predictions on a test set (a realRatingMatrix instance). The problem is that I only get useful predictions for the unknown ratings. I want to predict the known ratings in this test set so that I can test performance manually, but when I set type="ratingMatrix" to return the full matrix (instead of only the predictions for the unknown ratings), the values at the known positions don't make sense. See below:

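library(recommenderlab)
# movielense is the ratings data; fold holds the row indices of the training users (neither is defined in this snippet)
ratingmatrix <- as(movielense, "realRatingMatrix")
train <- ratingmatrix[fold, ]
test <- ratingmatrix[-fold, ]
# train SVD model
svd_model <- Recommender(data = train, method = "SVD")
# predict the full rating matrix (known and unknown entries) for the test users
pred_test <- as(predict(object = svd_model, newdata = test, type = "ratingMatrix"), "matrix")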

When my test dataset looks for example like this:

NA  NA  NA   3  NA  NA  4  NA  NA  NA 

My predictions become

3.952  3.951  3.948  -0.948  3.948  3.947  0.051  3.948  3.948  3.948

Note how all predictions seem fine except the two known entries (one even becomes negative...). Is there a way to predict these correctly so I can manually evaluate my model on the known ratings? The only alternative I found was setting type="ratings", which excludes the known ratings completely from the predictions.

Thank you!
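
A common way to score a model on ratings whose true values are known, without relying on type="ratingMatrix", is to hide part of each test user's ratings and predict only those. The following is a minimal sketch using recommenderlab's evaluationScheme(), getData() and calcPredictionAccuracy(). The MovieLense example data, the 90/10 split, given = 10 and goodRating = 4 are assumptions chosen for illustration, not the setup from this report:

```r
library(recommenderlab)

data(MovieLense)  # example data shipped with recommenderlab (assumed here)

set.seed(1)
# Split users 90/10. For each test user, 10 ratings are given to the model
# and the remaining ratings are withheld for scoring.
# goodRating is the threshold for a "good" item in top-N evaluation.
scheme <- evaluationScheme(MovieLense, method = "split", train = 0.9,
                           given = 10, goodRating = 4)

svd_model <- Recommender(getData(scheme, "train"), method = "SVD")

# Predict ratings for the test users from their given ratings only.
pred <- predict(svd_model, getData(scheme, "known"), type = "ratings")

# Compare the predictions with the withheld ratings (RMSE, MSE, MAE).
acc <- calcPredictionAccuracy(pred, getData(scheme, "unknown"))
print(acc)
```

Because the withheld ratings are never shown to predict(), they count as unknown entries and are returned by type="ratings".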

mhahsler commented 5 years ago

Thanks for this bug report. It looks like the known values are replaced with the normalized values instead of the denormalized ones. I will instead use the approximated values returned by the recommendation algorithm. This will be fixed in the development version on GitHub today and be rolled out in the next release.
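
The numbers in the report appear consistent with this diagnosis. Assuming the normalization is row centering (subtracting one offset per user, which is an assumption here and not stated in the thread), the two odd entries should equal the known ratings minus that offset. A small base-R check using only the values quoted in the report:

```r
# Example test row and the returned predictions, copied from the report.
test_row <- c(NA, NA, NA, 3, NA, NA, 4, NA, NA, NA)
pred_row <- c(3.952, 3.951, 3.948, -0.948, 3.948, 3.947, 0.051, 3.948, 3.948, 3.948)

known <- !is.na(test_row)

# Gap between the true ratings and the values returned at the known positions.
# Under the centering assumption this is the offset that was subtracted.
offset <- test_row[known] - pred_row[known]
print(offset)

# Average prediction for the unknown entries, for comparison.
unknown_level <- mean(pred_row[!known])
print(unknown_level)
```

Both gaps come out at roughly 3.95, the same level as the predictions for the unknown entries, which is what one would expect if centered ratings were returned without adding the offset back.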