lucidrains / RETRO-pytorch

Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
Apache License 2.0

Scann vs faiss #28

Open afcruzs opened 2 years ago

afcruzs commented 2 years ago

Could you elaborate on the decision to use faiss instead of scann? In theory scann is open source too, but I'm wondering if you found it easier to get the performance you needed from faiss instead.

rom1504 commented 2 years ago

ScaNN is open source, but it's not packaged very well, and it doesn't quantize.

Either way it's not really the blocker in the current state of this repo; the blocker is doing more experiments with the LM and the LM+knn integration.
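(For context, a minimal sketch of what quantization looks like on the FAISS side, assuming a generic IVF-PQ setup rather than anything specific to this repo; the dimension, index string, and random data below are illustrative.)

```python
import numpy as np
import faiss

d = 768                                              # embedding dimension (illustrative)
xb = np.random.rand(100_000, d).astype("float32")    # stand-in for chunk embeddings

# IVF coarse quantizer + product quantization: 64 sub-vectors, 8 bits each,
# so each vector is stored as 64 bytes of PQ codes instead of 768 * 4 bytes of floats.
index = faiss.index_factory(d, "IVF1024,PQ64", faiss.METRIC_INNER_PRODUCT)
index.train(xb)        # learn coarse centroids and PQ codebooks
index.add(xb)          # only the compressed codes are kept in the index

index.nprobe = 16      # number of inverted lists scanned per query
scores, ids = index.search(xb[:5], 10)
```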

afcruzs commented 2 years ago

I agree it's a bit cumbersome to use, but it should have (rather efficient) quantization, no? See the last section of https://medium.com/@kumon/similarity-search-scann-and-4-bit-pq-ab98766b32bd

rom1504 commented 2 years ago

ScaNN is fast, but no, it doesn't optimize for memory use: since it uses PQ4, it has to keep the embeddings at full precision for reranking to avoid a loss of recall.
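(A rough back-of-envelope to make the memory point concrete; the vector count, dimension, and code size below are illustrative assumptions, not numbers from either library.)

```python
# Rough memory estimate: compact PQ codes vs. the full-precision vectors
# that have to stay around for reranking.
n, d = 1_000_000_000, 768              # e.g. one billion chunk embeddings (assumed)

full_precision = n * d * 4             # float32 embeddings kept for reranking
pq_codes       = n * 64                # compact PQ codes at, say, 64 bytes per vector

print(f"full-precision embeddings: {full_precision / 1e12:.2f} TB")   # ~3.07 TB
print(f"PQ codes only:             {pq_codes / 1e9:.0f} GB")          # ~64 GB
```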

marcobellagente93 commented 2 years ago

@rom1504 thanks for the answer! So is it correct to say that the open-source version of ScaNN does use quantization to compute faster inner products? I mean, there are implemented options for brute force and two different quantizers (LUT16 and LUT256), but I see your point about storing the embeddings. It's also odd, though, since reorder (which I assume is what you mean by reranking) is optional.
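(For reference, the options being discussed look roughly like this, sketched from ScaNN's published builder example; the tree, leaf, and dataset parameters are placeholders. `score_ah` is the quantized asymmetric-hashing scorer, `score_brute_force` is the unquantized path, and `reorder`, the full-precision reranking step, can indeed be left out.)

```python
import numpy as np
import scann

dataset = np.random.rand(100_000, 768).astype(np.float32)   # stand-in embeddings

searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=1000, num_leaves_to_search=100, training_sample_size=50_000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)  # quantized (LUT16-style) scoring
    .reorder(100)   # optional: rescore the top 100 with the full-precision vectors
    .build()
)

neighbors, distances = searcher.search(dataset[0], final_num_neighbors=10)
```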

rom1504 commented 2 years ago

Yes, it uses quantization to compute faster inner products. Do they report good results with PQ4 and no reordering?

vinnik-dmitry07 commented 1 year ago

https://github.com/erikbern/ann-benchmarks/ https://ann-benchmarks.com/glove-100-angular_10_angular.html