use RAFT for select_k on the gpu

benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets

https://benfred.github.io/implicit/

MIT License


use RAFT for select_k on the gpu #656

Closed (benfred closed this 1 year ago)

benfred commented 1 year ago

This changes the GPU top-k code to use RAFT (https://github.com/rapidsai/raft) instead of faiss. The RAFT version is quite a bit faster, and it doesn't have the performance issues with small batch sizes that faiss has (meaning we can delete a bunch of code that was trying to work around them). RAFT also has no limit on the size of k, whereas faiss is limited to k less than 2048.
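For readers unfamiliar with the operation, here is a minimal CPU sketch of what a batched top-k selection computes: for each query row of a score matrix, the indices and values of the k highest-scoring items. This is a NumPy illustration only. The `select_k` function below is a hypothetical helper written for this example; it is neither RAFT's API nor the code in this PR, and the matrix sizes are made up.

```python
import numpy as np


def select_k(scores, k):
    """Return (indices, values) of the k largest entries per row, sorted descending.

    Hypothetical CPU reference for illustration; not the RAFT or implicit API.
    """
    scores = np.asarray(scores)
    n_cols = scores.shape[1]
    if not 1 <= k <= n_cols:
        raise ValueError("k must be between 1 and the number of columns")
    # argpartition moves the k largest entries of each row into the last k slots (unordered)
    part = np.argpartition(scores, n_cols - k, axis=1)[:, n_cols - k:]
    part_vals = np.take_along_axis(scores, part, axis=1)
    # sort only those k candidates, highest score first
    order = np.argsort(-part_vals, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    vals = np.take_along_axis(scores, idx, axis=1)
    return idx, vals


rng = np.random.default_rng(0)
queries = rng.standard_normal((4, 96)).astype(np.float32)   # batch of 4 query embeddings
items = rng.standard_normal((1000, 96)).astype(np.float32)  # 1000 item embeddings
scores = queries @ items.T                                  # inner-product scores
ids, top_scores = select_k(scores, k=10)
```

Nothing in this sketch caps k other than the number of columns, which is the property the description attributes to the RAFT version.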

benfred commented 1 year ago

Benchmarking this change on a dataset of GitHub stars (9M items with 96-dimensional embeddings) shows a good improvement in queries per second (QPS):

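| dataset | batch_size | Previous QPS | QPS with RAFT | % improvement |
|---------|------------|--------------|---------------|---------------|
| github  | 1          | 161.73       | 185.26        | 14.5%         |
| github  | 1000       | 2299.46      | 2774.98       | 21%           |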
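The improvement column can be reproduced from the two QPS columns as `(after - before) / before`. A small sketch of that arithmetic, using only the numbers in the table (the `pct_improvement` name is just for this example):

```python
# (dataset, batch_size, previous QPS, QPS with RAFT), copied from the table
rows = [
    ("github", 1, 161.73, 185.26),
    ("github", 1000, 2299.46, 2774.98),
]


def pct_improvement(before, after):
    """Relative change from before to after, as a percentage."""
    return (after - before) / before * 100.0


for dataset, batch_size, before, after in rows:
    print(f"{dataset} batch_size={batch_size}: {pct_improvement(before, after):.1f}%")
# prints:
# github batch_size=1: 14.5%
# github batch_size=1000: 20.7%
```

The second row comes out at 20.7%, which the table reports rounded to 21%.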