recommenders-team / recommenders

Best Practices on Recommendation Systems
https://recommenders-team.github.io/recommenders/intro.html
MIT License

[FEATURE] Approximate Nearest Neighbours for Recommender Systems #822

Open miguelgfierro opened 5 years ago

miguelgfierro commented 5 years ago

Description

Add KNN for top-k recommendation.

Here is a benchmark of KNN libraries: https://www.benfrederickson.com/approximate-nearest-neighbours-for-recommender-systems/

Expected behavior with the suggested feature

Other Comments

loomlike commented 4 years ago

@gramhagen I can add item and user KNNs if you're not working on this.

gramhagen commented 4 years ago

Yeah, go for it. I haven't had much time to work on anything, honestly. Were you thinking of doing approximate or exact KNN?

miguelgfierro commented 4 years ago

The link I added compares different KNN libraries and cites a paper by Noam. Maybe it would be good to ask him for some advice on which library to use.

loomlike commented 4 years ago

@miguelgfierro I checked a couple of options, and faiss is probably the best fit in terms of performance (GPU support), ease of install (conda installable), and license (MIT). Will check with Noam as well.

loomlike commented 4 years ago

@gramhagen Approximate. RAPIDS also has a faiss-based KNN that works on multiple GPUs. That could be another option. Btw, faiss doesn't support sparse vectors. For large datasets, we should consider other options...

gramhagen commented 4 years ago

RAPIDS sounds good. Also, @mdekstrand mentioned that the KNN implementation in Python LensKit is significantly optimized, which lets it scale to fairly large datasets, so that's probably a good place to start.
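
For reference while comparing libraries, here is a minimal sketch of the exact (brute-force) item KNN that an approximate index would stand in for, working directly on sparse vectors. It is not taken from faiss, RAPIDS, or LensKit: the function name `item_knn`, the `(user, item, rating)` input format, and the choice of cosine similarity are all assumptions made for illustration, and it uses only the Python standard library.

```python
import heapq
import math
from collections import defaultdict


def item_knn(interactions, k=10):
    """Exact top-k cosine neighbours for every item (hypothetical helper).

    interactions: iterable of (user, item, rating) triples (assumed format).
    Returns {item: [(neighbour, similarity), ...]}, best neighbour first.
    """
    # Sparse item vectors: item -> {user: rating}
    vectors = defaultdict(dict)
    for user, item, rating in interactions:
        vectors[item][user] = float(rating)

    # L2 norm of each item vector, needed for cosine similarity
    norms = {
        item: math.sqrt(sum(r * r for r in vec.values()))
        for item, vec in vectors.items()
    }

    # Inverted index user -> items, so only items that share
    # at least one user are ever compared
    items_by_user = defaultdict(list)
    for item, vec in vectors.items():
        for user in vec:
            items_by_user[user].append(item)

    neighbours = {}
    for item, vec in vectors.items():
        # Accumulate dot products with every co-rated item
        dots = defaultdict(float)
        for user, rating in vec.items():
            for other in items_by_user[user]:
                if other != item:
                    dots[other] += rating * vectors[other][user]

        # Skip pairs with a zero norm to avoid dividing by zero
        sims = [
            (other, dot / (norms[item] * norms[other]))
            for other, dot in dots.items()
            if norms[item] > 0 and norms[other] > 0
        ]
        neighbours[item] = heapq.nlargest(k, sims, key=lambda pair: pair[1])
    return neighbours
```

The cost grows with the number of co-rated item pairs, which is the part an approximate method would trade accuracy to avoid; a sketch like this is mainly useful as a ground truth for measuring the recall of an approximate index on a small sample.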