NicolasHug / Surprise

A Python scikit for building and analyzing recommender systems
http://surpriselib.com
BSD 3-Clause "New" or "Revised" License

Significantly improve performance of SVD and SVDpp #400

Closed: ProfHercules closed this pull request 2 years ago

ProfHercules commented 2 years ago

I made some small changes, mostly to the matrix_factorization.pyx file, that drastically improve performance of the SVD and SVDpp algorithms. Most of the speed-up is evident in the SVDpp algorithm.

Most of the improvement comes from rewriting some loop constructs to avoid Python-level code as much as possible. This mostly involved changing `for _ in` loops to `while` constructs; please see matrix_factorization.pyx for more context.
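For readers unfamiliar with the file, here is a pure-Python sketch of the kind of SGD training loop involved, using the `while`-style epoch loop described above. All names here are hypothetical; the actual implementation in matrix_factorization.pyx is typed Cython code, where avoiding Python iteration is what produces the speed-up.

```python
import numpy as np

def sgd_svd(ratings, n_users, n_items, n_factors=10, n_epochs=20,
            lr=0.005, reg=0.02, seed=0):
    """Sketch of a biased-SVD SGD loop (illustrative only).

    ratings: list of (user, item, rating) triples with 0-based ids.
    """
    rng = np.random.default_rng(seed)
    mu = float(np.mean([r for _, _, r in ratings]))  # global mean
    bu = np.zeros(n_users)                           # user biases
    bi = np.zeros(n_items)                           # item biases
    P = rng.normal(0, 0.1, (n_users, n_factors))     # user factors
    Q = rng.normal(0, 0.1, (n_items, n_factors))     # item factors

    epoch = 0
    while epoch < n_epochs:  # while-style loop, as in the PR
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu = P[u].copy()  # keep the pre-update user factors
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
        epoch += 1
    return mu, bu, bi, P, Q

def predict(model, u, i):
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + P[u] @ Q[i]
```

In plain Python both loop styles cost the same; the point in Cython is that a `while` over a typed C integer compiles to a bare C loop with no Python iterator protocol involved.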

On my machine (M1 MacBook Air) the benchmarks produce the following results:

**Movielens 100k** (original vs. new times)

| Algorithm | RMSE | MAE | Time (original) | Time (new) | Speed-up |
| --- | --- | --- | --- | --- | --- |
| SVD | 0.936 | 0.738 | 0:00:11 | 0:00:02 | 5.5x |
| SVD++ | 0.922 | 0.723 | 0:06:54 | 0:00:31 | 13.35x |
| NMF | 0.964 | 0.758 | 0:00:11 | 0:00:04 | 2.75x |
| Slope One | 0.946 | 0.743 | 0:00:06 | 0:00:06 | |
| k-NN | 0.98 | 0.774 | 0:00:06 | 0:00:06 | |
| Centered k-NN | 0.951 | 0.749 | 0:00:06 | 0:00:06 | |
| k-NN Baseline | 0.931 | 0.733 | 0:00:07 | 0:00:07 | |
| Co-Clustering | 0.965 | 0.755 | 0:00:02 | 0:00:02 | |
| Baseline | 0.944 | 0.748 | 0:00:00 | 0:00:00 | |
| Random | 1.523 | 1.222 | 0:00:00 | 0:00:00 | |
I haven't spent the time to run the 1M benchmark for the original code, but for the updated code it produces:

**Movielens 1M** (new code only)

| Algorithm | RMSE | MAE | Time | 1M / 100k |
| --- | --- | --- | --- | --- |
| SVD | 0.874 | 0.686 | 0:00:24 | 12x |
| SVD++ | 0.862 | 0.672 | 0:10:39 | 20.61x |
| NMF | 0.917 | 0.725 | 0:00:44 | 11x |
| Slope One | 0.907 | 0.715 | 0:02:15 | |
| k-NN | 0.923 | 0.727 | 0:04:44 | |
| Centered k-NN | 0.929 | 0.738 | 0:04:48 | |
| k-NN Baseline | 0.895 | 0.706 | 0:04:54 | |
| Co-Clustering | 0.916 | 0.718 | 0:00:32 | |
| Baseline | 0.909 | 0.719 | 0:00:10 | |
| Random | 1.506 | 1.207 | 0:00:10 | |
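The 1M / 100k column is just the wall-clock time on 1M divided by the time on 100k (both with the new code). A quick check for the SVD++ ratio, assuming the h:mm:ss times from the tables:

```python
def seconds(t: str) -> int:
    """Convert an h:mm:ss benchmark time to whole seconds."""
    h, m, s = (int(part) for part in t.split(":"))
    return 3600 * h + 60 * m + s

# SVD++ with the new code: 1M dataset vs. 100k dataset
ratio = seconds("0:10:39") / seconds("0:00:31")
print(round(ratio, 2))  # → 20.61
```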

Using the above table, along with the original benchmarks, we can see how each algorithm's runtime scales when switching from the 100k dataset to the 1M dataset (the 1M / 100k column).

Honestly, I'm not exactly sure why NMF is faster... Both the original code and the new code were compiled from source before running the benchmarks.

The scaling of SVDpp might need to be investigated, since it grows worse with dataset size than the other factorization algorithms (20.61x vs. 12x for SVD and 11x for NMF). But considering a base speed-up of 13.35x, the reduced scaling might be a worthwhile tradeoff.

Note: I haven't written any new tests, since the algorithms are fundamentally unchanged. All tests that passed before my changes still pass after them (and run roughly 2x as fast).

ProfHercules commented 2 years ago

Closing for now, want to improve the commit history to make changes easier to follow. Will open a new pull request when done.