recommenders-team / recommenders

Best Practices on Recommendation Systems
https://recommenders-team.github.io/recommenders/intro.html
MIT License

[FEATURE] Implement dot product Matrix Factorization #1713

Open miguelgfierro opened 2 years ago

miguelgfierro commented 2 years ago

Description

MF with CPU: https://arxiv.org/abs/2005.09683 https://github.com/google-research/google-research/tree/master/dot_vs_learned_similarity

MF with PyTorch: https://www.ethanrosenthal.com/2017/06/20/matrix-factorization-in-pytorch/

MF with PySpark: ????

Expected behavior with the suggested feature

Other Comments

@fazamani

The code I found for CPU doesn't seem very efficient: it computes the dot product of one user and one item embedding at a time. If we can score several pairs, or a whole batch, at once, we can use numexpr instead of numpy (see this benchmark for details).
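As a reference, here is a minimal sketch of batched scoring, assuming the user and item embeddings for the pairs to score are aligned row by row in two numpy arrays (the function name `score_batch` is just a placeholder); the numexpr call is the multithreaded alternative to the plain numpy reduction:

```python
import numpy as np
import numexpr as ne

def score_batch(user_emb, item_emb):
    """Score a whole batch of (user, item) pairs in one vectorized call.

    user_emb, item_emb: float arrays of shape (batch, n_factors),
    where row i of each array belongs to the same pair.
    """
    # Pure numpy: elementwise product followed by a row-wise sum
    # return np.einsum("ij,ij->i", user_emb, item_emb)
    # numexpr evaluates the same expression with its own multithreaded kernel
    return ne.evaluate("sum(u * v, axis=1)", {"u": user_emb, "v": item_emb})
```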

For the GPU version, there is an implementation with PyTorch, but we could also implement it with Numba (see the benchmark above).
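A minimal PyTorch sketch along the lines of the blog post linked above (a biased dot-product MF model; the class name and hyperparameters are just placeholders):

```python
import torch

class DotProductMF(torch.nn.Module):
    """Matrix factorization where score = dot(user_emb, item_emb) + biases."""

    def __init__(self, n_users, n_items, n_factors=32):
        super().__init__()
        self.user_emb = torch.nn.Embedding(n_users, n_factors)
        self.item_emb = torch.nn.Embedding(n_items, n_factors)
        self.user_bias = torch.nn.Embedding(n_users, 1)
        self.item_bias = torch.nn.Embedding(n_items, 1)

    def forward(self, users, items):
        # Batched dot product between the selected user and item embeddings
        dot = (self.user_emb(users) * self.item_emb(items)).sum(dim=1)
        return dot + self.user_bias(users).squeeze(1) + self.item_bias(items).squeeze(1)
```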

I haven't had time to find a PySpark version, but there should be a way.
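One option (not necessarily the one we'd pick) is Spark MLlib's ALS, which already learns user and item factor matrices and scores candidates with their dot product. A rough sketch, assuming a ratings DataFrame `ratings_df` with `userId`, `itemId` and `rating` columns:

```python
from pyspark.ml.recommendation import ALS

als = ALS(
    rank=32,
    userCol="userId",
    itemCol="itemId",
    ratingCol="rating",
    coldStartStrategy="drop",
)
model = als.fit(ratings_df)             # ratings_df: assumed Spark DataFrame of interactions
top_k = model.recommendForAllUsers(10)  # scores are dot products of the learned factors
```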

A note on this method: I think it can become a fundamental method (like SAR) that provides a lot of value. The key is in the embeddings that we generate. Many newer deep learning methods put a lot of emphasis on the network structure and use simple user and item embeddings, but it may be more effective to spend the time creating rich user and item embeddings and then simply take the dot product.
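To illustrate that last point: once rich embeddings exist, whatever model produced them, serving recommendations is just a dot product against the item matrix. A hypothetical retrieval helper:

```python
import numpy as np

def top_k_items(user_vec, item_embs, k=10):
    """Rank items for one user by dot product over precomputed item embeddings."""
    scores = item_embs @ user_vec            # shape: (n_items,)
    top = np.argpartition(-scores, k)[:k]    # unordered top-k candidates
    return top[np.argsort(-scores[top])]     # sorted by descending score
```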

miguelgfierro commented 2 years ago

FYI @anargyri @simonzhaoms @pradnyeshjoshi