dselivanov / rsparse

Fast and accurate machine learning on sparse matrices - matrix factorizations, regression, classification, top-N recommendations.
https://www.slideshare.net/DmitriySelivanov/matrix-factorizations-for-recommender-systems
170 stars 31 forks

Huge performance degradation for WRMF #72

Open david-cortes opened 2 years ago

david-cortes commented 2 years ago

In version 0.5.0 from CRAN (installed with a modified Makevars.in to force OpenMP linkage), there is a huge slowdown in WRMF with implicit feedback compared to earlier versions.

For example, if I try running it on the LastFM-360K dataset with this configuration, plus 15 iterations and no early stopping:

WRMF$new(feedback="implicit", rank=50, lambda=5,
         solver="conjugate_gradient",
         with_global_bias=FALSE, with_user_item_bias=FALSE)

And then compare different libraries with these same settings, I get the following times:

In earlier versions, by contrast, the rsparse time was somewhere between those of implicit and cmfrec. The Cholesky solver is also affected by this slowdown.
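For anyone who wants to reproduce this kind of measurement without downloading LastFM-360K, here is a minimal timing sketch. It rests on several assumptions: the matrix is a small synthetic stand-in, so absolute times will not be comparable to the real dataset. The sizes, density and seed are arbitrary. Passing a negative `convergence_tol` to `fit_transform` is assumed to disable early stopping.

```r
library(Matrix)
library(rsparse)

set.seed(1)

# Synthetic user-item matrix with positive counts.
# Hypothetical stand-in for LastFM-360K; sizes and density are arbitrary.
X <- rsparsematrix(nrow = 2000, ncol = 500, density = 0.01,
                   rand.x = function(n) rpois(n, 3) + 1)

# Same model settings as in the configuration above.
model <- WRMF$new(feedback = "implicit", rank = 50, lambda = 5,
                  solver = "conjugate_gradient",
                  with_global_bias = FALSE, with_user_item_bias = FALSE)

# 15 iterations; the negative tolerance is assumed to prevent early stopping.
timing <- system.time(
  user_factors <- model$fit_transform(X, n_iter = 15L, convergence_tol = -1)
)

# Wall-clock seconds for the whole fit.
print(timing[["elapsed"]])
```

Running the same script under different rsparse, RcppArmadillo or compiler setups should give a rough like-for-like comparison, as long as the thread count is held constant.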

I haven't been able to pinpoint what is causing the slowdown. I tried adding extra Armadillo defines such as -DARMA_DONT_USE_WRAPPER, -DARMA_USE_BLAS, -DARMA_USE_LAPACK, and -DARMA_USE_OPENMP, but they didn't make a difference.

david-cortes commented 2 years ago

Actually, it's not related to the version. I tried downgrading to 0.4.0 and got the same timings. Perhaps it has something to do with newer Armadillo versions?

EDIT: it actually isn't. I tried versions of RcppArmadillo and OpenBLAS from 2020 and still experienced the problem. Perhaps it has something to do with newer GCC versions? This is, BTW, on an AMD Ryzen 7 2700 (3.2 GHz, 8c/16t), GCC 11.2.0 (flags -O3 -march=native -fno-math-errno -fno-trapping-math and using link-time optimization), and OpenBLAS 0.3.19 (OpenMP variant).