criteo / autofaiss

Automatically create Faiss knn indices with the most optimal similarity search parameters.
https://criteo.github.io/autofaiss/
Apache License 2.0

Exact distances #171

Open ahafftka opened 1 year ago

ahafftka commented 1 year ago

If I want to compute exact distances, which index keys would work? I tried OPQ256_768,IVF256_HNSW32,PQ256x8,RFlat with the inner product metric, but it appears to produce non-exact distances. When I use the same index_key without autofaiss (i.e., with faiss alone), it does appear to produce exact scores. I am using Spark distributed mode. Thanks.

rom1504 commented 1 year ago

You need a flat index, for example IndexFlatIP.

No need to use autofaiss in this case.
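
As a rough illustration of why a flat index yields exact scores: it compares the query against every stored vector, with no quantization or compression in between. The plain-Python sketch below mimics that exhaustive inner-product search. The function name `flat_ip_search` and the toy data are made up for illustration; in practice faiss's `IndexFlatIP` does the same job far more efficiently.

```python
import heapq

def flat_ip_search(vectors, query, k):
    """Exhaustive inner-product search: score every vector, keep the top k."""
    scored = (
        (sum(a * b for a, b in zip(vec, query)), idx)
        for idx, vec in enumerate(vectors)
    )
    # nlargest returns (score, id) pairs ordered by descending score.
    return heapq.nlargest(k, scored)

# Toy data (hypothetical): three 2-d vectors and one query.
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 3.0]]
top = flat_ip_search(vectors, query=[2.0, 1.0], k=2)
print(top)
```

Every score here is computed from the uncompressed vectors, so there is no reconstruction error. The cost is a full scan per query.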


rom1504 commented 1 year ago

I think HNSW may allow that; otherwise, faiss has some options for reranking.

But anyway, functionally what you need is to store a full copy of the embeddings and to recompute the distances after the search. You can do it with faiss or manually.
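
A minimal sketch of the manual route, assuming the approximate index has already returned a shortlist of candidate ids and that a full-precision copy of the embeddings is kept in memory. The helper `rerank_exact`, the toy embeddings, and the candidate list are all hypothetical; with real data one would typically do this with NumPy arrays rather than Python lists.

```python
def rerank_exact(embeddings, query, candidate_ids):
    """Recompute exact inner products for ANN candidates and re-sort them."""
    rescored = [
        (sum(a * b for a, b in zip(embeddings[i], query)), i)
        for i in candidate_ids
    ]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored

# Full-precision copy of the embeddings, kept alongside the compressed index.
embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 3.0], [2.0, 2.0]]
# Hypothetical ids returned by an approximate search (id 2 was missed).
candidate_ids = [0, 3, 1]
exact = rerank_exact(embeddings, [2.0, 1.0], candidate_ids)
print(exact)
```

The scores for the returned candidates are exact dot products, even though the shortlist itself may miss some true neighbours (id 2 in this toy case).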

On Sat, Jun 10, 2023, 23:02 Ariel Hafftka wrote:

Thanks. To clarify, I am referring to exact dot-product computations (zero reconstruction error in the computed similarity scores); however, it is still okay for ANN to have imperfect recall.
