GpuIndexFlatL2 should support add_with_ids()

facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.

https://faiss.ai

MIT License


GpuIndexFlatL2 should support add_with_ids() #703

Open cjnolet opened 5 years ago

cjnolet commented 5 years ago

I find the IndexShards class very useful in the C++ API. However, it seems to require that the sub-indices implement add_with_ids(). As a workaround, I am adding my data to my own vector of GpuIndexFlatL2 instances and calling the merge_tables reduction function myself.

Supporting add_with_ids() in GpuIndexFlatL2 is a small change that would let me use IndexShards directly.
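For readers unfamiliar with the manual reduction described above, here is a minimal CPU-only sketch of merging per-shard top-k results into one global top-k list. It is a toy illustration and not the faiss merge_tables code: `ShardResult`, `idOffset`, and `mergeTopK` are hypothetical names, and it assumes each shard reports shard-local labels (with -1 meaning "no result") that can be made global by adding the number of vectors held by earlier shards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container for one shard's answer to a single query.
struct ShardResult {
    std::vector<float> distances;      // L2 distances of the shard's top-k
    std::vector<std::int64_t> labels;  // shard-local labels, -1 = no result
    std::int64_t idOffset;             // vectors stored in earlier shards
};

// Merge the per-shard candidate lists into one global top-k list,
// sorted by ascending distance. Labels are shifted to be globally unique.
std::vector<std::pair<float, std::int64_t>> mergeTopK(
        const std::vector<ShardResult>& shards, std::size_t k) {
    std::vector<std::pair<float, std::int64_t>> all;
    for (const ShardResult& s : shards) {
        assert(s.distances.size() == s.labels.size());
        for (std::size_t i = 0; i < s.distances.size(); ++i) {
            if (s.labels[i] < 0) {
                continue;  // skip padding entries
            }
            all.emplace_back(s.distances[i], s.labels[i] + s.idOffset);
        }
    }
    // Only the k smallest candidates need to be ordered.
    const std::size_t keep = std::min(k, all.size());
    std::partial_sort(all.begin(), all.begin() + keep, all.end());
    all.resize(keep);
    return all;
}
```

Calling `mergeTopK(shards, k)` once per query yields the same kind of (distance, label) list a single index over all the data would return, under the assumptions stated above.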

mdouze commented 5 years ago

Two possible workarounds:

- encapsulate the GpuIndexFlatL2 in an IDMap
- use the successive_ids option of IndexShards, then add will work (the comment in the code is incorrect).
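The idea behind both workarounds can be sketched without faiss or a GPU. The code below is a toy illustration under stated assumptions, not the library's implementation: `ToyFlatL2`, `ToyIdMap`, and `successiveOffsets` are hypothetical names. `ToyFlatL2` stands in for an index that only supports add() and labels vectors 0, 1, 2, ...; `ToyIdMap` shows how a wrapper can provide add-with-ids on top of it by remembering the caller's ids; `successiveOffsets` shows the assumed successive-ids scheme, where a shard's local labels are shifted by the number of vectors in earlier shards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Toy flat L2 index: supports only add(), labels are 0, 1, 2, ...
class ToyFlatL2 {
public:
    explicit ToyFlatL2(std::size_t dim) : dim_(dim) { assert(dim > 0); }

    void add(std::size_t n, const float* x) {
        data_.insert(data_.end(), x, x + n * dim_);
    }

    std::size_t size() const { return data_.size() / dim_; }

    // Sequential label of the nearest stored vector, or -1 if empty.
    std::int64_t searchNearest(const float* q) const {
        std::int64_t best = -1;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t i = 0; i < size(); ++i) {
            float d = 0.0f;
            for (std::size_t j = 0; j < dim_; ++j) {
                const float diff = data_[i * dim_ + j] - q[j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = static_cast<std::int64_t>(i);
            }
        }
        return best;
    }

private:
    std::size_t dim_;
    std::vector<float> data_;
};

// Workaround 1 (sketch): wrap the index and translate labels to user ids.
class ToyIdMap {
public:
    explicit ToyIdMap(ToyFlatL2* index) : index_(index) {
        assert(index != nullptr && index->size() == 0);  // must start empty
    }

    void addWithIds(std::size_t n, const float* x, const std::int64_t* ids) {
        index_->add(n, x);
        idMap_.insert(idMap_.end(), ids, ids + n);
    }

    std::int64_t searchNearest(const float* q) const {
        const std::int64_t local = index_->searchNearest(q);
        return local < 0 ? -1 : idMap_[static_cast<std::size_t>(local)];
    }

private:
    ToyFlatL2* index_;
    std::vector<std::int64_t> idMap_;
};

// Workaround 2 (sketch): offsets that turn shard-local labels into
// global ones when ids are assigned successively across shards.
std::vector<std::int64_t> successiveOffsets(
        const std::vector<std::size_t>& shardSizes) {
    std::vector<std::int64_t> offsets(shardSizes.size(), 0);
    for (std::size_t i = 1; i < shardSizes.size(); ++i) {
        offsets[i] = offsets[i - 1] +
                     static_cast<std::int64_t>(shardSizes[i - 1]);
    }
    return offsets;
}
```

The wrapper keeps the id translation on the host side, so the wrapped index never needs to know about user-supplied ids. That is also why it does not address the case where labels should stay on the GPU.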

wickedfoo commented 5 years ago

@cjnolet I think the case you're looking at here is to keep everything on the GPU and avoid unnecessary GPU/CPU synchronization?

I can look into adding support for labels for GpuIndexFlat.

Also, for the cuML use case, brute-force search functionality is now exposed as a function call, so you don't have to wrap data in an index (with the additional copy). I could add the ability to pass an optional index array pointer to this as well.

https://github.com/facebookresearch/faiss/blob/master/gpu/GpuDistance.h#L30
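To make the function-call style concrete, here is a CPU-only toy sketch of a brute-force nearest-neighbor search that takes raw pointers instead of an index object, plus an optional id array of the kind proposed above. It is not the function declared in GpuDistance.h: the name `toyBruteForceNearestL2` and its parameter list are hypothetical, it returns only the single nearest neighbor per query, and it reports squared L2 distances.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical function-style brute-force 1-NN search (squared L2).
// vectors: row-major, numVectors x dim. queries: row-major, numQueries x dim.
// ids: optional, one id per vector; if nullptr, row numbers are the labels.
// outDistances / outLabels: one entry per query; label is -1 if no vectors.
void toyBruteForceNearestL2(const float* vectors, std::size_t numVectors,
                            const float* queries, std::size_t numQueries,
                            std::size_t dim, const std::int64_t* ids,
                            float* outDistances, std::int64_t* outLabels) {
    assert(dim > 0);
    for (std::size_t q = 0; q < numQueries; ++q) {
        bool found = false;
        std::size_t bestRow = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t v = 0; v < numVectors; ++v) {
            float d = 0.0f;
            for (std::size_t j = 0; j < dim; ++j) {
                const float diff = vectors[v * dim + j] - queries[q * dim + j];
                d += diff * diff;
            }
            if (!found || d < bestDist) {
                found = true;
                bestDist = d;
                bestRow = v;
            }
        }
        outDistances[q] = bestDist;
        if (!found) {
            outLabels[q] = -1;
        } else if (ids != nullptr) {
            outLabels[q] = ids[bestRow];  // caller-supplied label
        } else {
            outLabels[q] = static_cast<std::int64_t>(bestRow);
        }
    }
}
```

Because the data is passed by pointer, the caller keeps ownership of the vectors and no copy into an index is made. Passing `nullptr` for `ids` falls back to row numbers as labels.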