Closed by tanmoyio 5 months ago
You are searching vectors one at a time. This is inefficient, because the overhead of transferring each query to the GPU is larger than the search time itself, and it also makes parallelization impossible, because multi-GPU search works by splitting each batch of queries across the GPUs. Also note that setting faiss_index.nprobe on an IndexProxy will not set the nprobe, see
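To illustrate why a batch of one query can only keep one GPU busy, here is a rough sketch of splitting a query batch across replicas. It is a conceptual model only, not Faiss's actual implementation: `split_batch` and its even, contiguous slicing are assumptions made for illustration.

```python
def split_batch(num_queries, num_replicas):
    """Return one (start, stop) query slice per replica that receives work.

    Hypothetical helper: models an even, contiguous split of a query batch.
    Replicas that would get an empty slice are left out of the result.
    """
    per_replica = -(-num_queries // num_replicas)  # ceiling division
    slices = []
    for replica in range(num_replicas):
        start = replica * per_replica
        stop = min(start + per_replica, num_queries)
        if start < stop:
            slices.append((start, stop))
    return slices


# One query per call: only a single replica ever gets work.
print(split_batch(1, 4))

# One call with the whole batch: every replica gets a slice.
print(split_batch(1000, 4))
```

Under this model, the fix is to stack all query vectors into one matrix and issue a single search call, so that each GPU receives its own slice of the batch.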
Summary
While running multi-GPU search, only one GPU's compute is being utilized, although I can see VRAM usage on the other GPUs.
Running on:
Interface:
Reproduction instructions