facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

pairwise_distances on GPU? #3008

Open SuperbTUM opened 1 year ago

SuperbTUM commented 1 year ago

Hi. I am using the pairwise_distances Python API to compute Jaccard distances and wonder how I can run this on the GPU. As far as I can tell, faiss.pairwise_distances(embeddings, embeddings, faiss.METRIC_Jaccard) only runs on the CPU. Please correct me if I am wrong. Thanks.
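
For checking results on a small sample, the following is a minimal NumPy sketch of the quantity that METRIC_Jaccard is assumed to compute here: the sum of element-wise minima divided by the sum of element-wise maxima, for non-negative inputs. Both that definition and the helper name jaccard_pairwise are assumptions for illustration and are not taken from this thread; verify them against the Faiss version in use.

```python
import numpy as np

def jaccard_pairwise(xq, xb):
    """Hypothetical reference helper (not part of Faiss).

    Assumes out[i, j] = sum_k min(xq[i, k], xb[j, k]) / sum_k max(xq[i, k], xb[j, k])
    with non-negative inputs and no all-zero pairs.
    """
    xq = np.asarray(xq, dtype=np.float32)
    xb = np.asarray(xb, dtype=np.float32)
    out = np.empty((xq.shape[0], xb.shape[0]), dtype=np.float32)
    # Process one query row at a time to keep peak memory at O(nb * d).
    for i, row in enumerate(xq):
        num = np.minimum(row, xb).sum(axis=1)
        den = np.maximum(row, xb).sum(axis=1)
        out[i] = num / den
    return out

x = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]], dtype=np.float32)
ref = jaccard_pairwise(x, x)
```

Comparing ref against the CPU or GPU output on a few hundred vectors is a cheap way to confirm that both code paths agree before scaling up.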

SuperbTUM commented 1 year ago

Update: I tried the GPU-level API: faiss.pairwise_distance_gpu(faiss.StandardGpuResources(), merged_embeddings.cpu().numpy(), merged_embeddings.cpu().numpy(), metric=faiss.METRIC_Jaccard, device=0). However, I see no GPU usage when I monitor with watch -n 0.5 nvidia-smi, and the execution is rather slow and ultimately fails. How can I address this?

alexriedel1 commented 1 year ago

I'm facing the same issue: there is no load on the GPU.

Also, moving everything to the CPU, then to the GPU, then back to the CPU and again to the GPU (which will inevitably happen if you compute distances on vectors extracted by a deep learning model and use the results in further calculations) doesn't seem right to me.

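import faiss
import torch
embeddings = embeddings.cpu().numpy()  # GPU tensor -> CPU NumPy array
distances = faiss.pairwise_distance_gpu(faiss.StandardGpuResources(), embeddings, embeddings)  # CPU -> GPU -> CPU
distances = torch.from_numpy(distances).cuda()  # CPU -> GPU again

alexriedel1 commented 1 year ago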

One more comment on this: is faiss supposed to be faster than torch.cdist on the GPU for something like 20,000 x 1 vectors?