mhahsler / recommenderlab

recommenderlab - Lab for Developing and Testing Recommender Algorithms - R package

Evaluation Scheme doesn't work for Large Real Rating Matrix #59

Closed DanielRauser closed 2 years ago

DanielRauser commented 2 years ago

Hey there,

It's me again :). Now I'm having a problem with the evaluation scheme. I have several matrices, and the evaluation works fine for the binary matrix (size 158 MB). I also wanted to use it for my real rating matrices (size ~196 MB), but the code runs endlessly and nothing happens. I also tried it on a more powerful computer; it ran for two hours without any result (not even an error message). I tried both split and cross-validation, and neither worked. For reference: the binary matrix, which is identical to the real rating matrices except for the numeric rating values, took only 10 minutes.

This is my code:


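library(recommenderlab)

# The results of these two calls are not assigned, so rrm itself is unchanged
getRatings(rrm)
normalize(rrm)

evaluation_scheme_rrm <- evaluationScheme(rrm,
                                          method = "cross-validation",
                                          train = 0.8,
                                          k = 5,
                                          given = -1,
                                          goodRating = 1)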

My specs:

Core i7-1165G7, 64 GB RAM, 512 GB SSD, Iris Xe Graphics

I really would appreciate help here. You can find the matrix attached.

Best regards,

Daniel

Attachment: RealRatingMatrix.zip

mhahsler commented 2 years ago

It looks like splitting the data is what takes so long. I need to look at the code to see whether the splitting step can be made faster.
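One way to check whether the splitting step is the bottleneck is to time `evaluationScheme()` on user subsets of increasing size and see how the elapsed time grows. This is only a rough sketch: it uses the `Jester5k` data set that ships with recommenderlab as a stand-in for the attached matrix, and the subset sizes are arbitrary choices rather than anything taken from this report.

```r
library(recommenderlab)

# Jester5k ships with recommenderlab; it stands in for the reporter's matrix here.
data(Jester5k)

# Arbitrary subset sizes (number of users), chosen only for illustration.
sizes <- c(250, 500, 1000)

# Build the evaluation scheme on the first n users and record the elapsed seconds.
timings <- sapply(sizes, function(n) {
  system.time(
    evaluationScheme(Jester5k[1:n, ],
                     method = "cross-validation",
                     k = 5,
                     given = -1,
                     goodRating = 1)
  )[["elapsed"]]
})

data.frame(users = sizes, elapsed_sec = timings)
```

If the elapsed time grows much faster than the number of users, that points at the splitting code rather than at the machine.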

mhahsler commented 2 years ago

I have looked into the code and improved the runtime. The evaluation scheme now builds in less than a minute on my computer:

> library(recommenderlab)
> load("~/Downloads/rrm.RData")
> rrm <- td_rrm_aa_1_matrix
> object.size(rrm)
195634032 bytes
> system.time(evaluation_scheme_rrm <- evaluationScheme(rrm,
+   method = "cross-validation",
+   k = 4,
+   given  = -1,
+   goodRating = 1)
+ )
   user  system elapsed 
 41.652   0.492  42.121 
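Once the scheme builds, it can be used in the usual way. The following is a minimal sketch, assuming the built-in `Jester5k` data in place of the attached matrix and the `POPULAR` method as an arbitrary example recommender; neither choice comes from this thread.

```r
library(recommenderlab)

# Built-in data, used here in place of the reporter's matrix.
data(Jester5k)
set.seed(1)

# Same scheme settings as in the timing above, on a small user subset.
scheme <- evaluationScheme(Jester5k[1:500, ],
                           method = "cross-validation",
                           k = 4,
                           given = -1,
                           goodRating = 1)

# Train on the training portion of the first fold (run = 1 is the default).
rec <- Recommender(getData(scheme, "train"), method = "POPULAR")

# Predict ratings from the known part of the test users,
# then score the predictions against the withheld ratings.
pred <- predict(rec, getData(scheme, "known"), type = "ratings")
acc <- calcPredictionAccuracy(pred, getData(scheme, "unknown"))
acc
```

With `given = -1`, one rating per test user is withheld, so the accuracy is computed on that single held-out rating for each user.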

DanielRauser commented 2 years ago

Hi,

Wow, thank you very much. This really helps me and speeds up my evaluation process! Have a great start to the new week.