facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

faiss hnsw *memset* high cpu usage #3220

Open shaozhixue opened 7 months ago

shaozhixue commented 7 months ago

Summary

index type: hnsw32

faiss version: 1.7.4

data set: 5 million vectors, 768 dimensions

Running on: CPU

search parameters: topk=100, ef left at default; QPS: 800

my questions:

  1. Why is the memset CPU usage so high? See the attached profile image.
  2. The average response time is 56 ms. Shouldn't HNSW perform better than this?

Looking forward to your reply. Thank you very much.

mdouze commented 7 months ago

Would you mind checking if this occurs with an official compile of Faiss?

shaozhixue commented 7 months ago

Sorry, how do I get an official build of Faiss? With conda install?

My compile environment:

system: CentOS 7, kernel 3.10.0-957.21.3.el7.x86_64
gcc: 4.8.5
cmake: 3.23.3

The detailed information is shown in the attached image. The workload is multithreaded, with each thread issuing single-query searches, not batched queries.

shaozhixue commented 7 months ago

@mdouze I used gperftools to profile the CPU usage; see the attached image.

alexanderguzhva commented 6 months ago

@shaozhixue, this is related to the use of the table of visited points. Basically, HNSW tracks the points of the graph that were already visited. It stores this info in a std::vector<uint8_t>, which is supposed to be cleared from time to time. Check https://github.com/facebookresearch/faiss/blob/c1822a81b22a70a6a38848be2c18b44244db57b5/faiss/impl/AuxIndexStructures.h#L165 and the memset right there.

Here is where the search uses it: https://github.com/facebookresearch/faiss/blob/c1822a81b22a70a6a38848be2c18b44244db57b5/faiss/impl/HNSW.cpp#L626

idk why it is implemented this way instead of std::vector<bool>, @mdouze do you know by chance?
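
To make the description above concrete, here is a minimal sketch of a visited table backed by a std::vector<uint8_t> with a generation counter. It is not the Faiss source: the struct name, its members, and the wrap-around threshold of 250 are assumptions chosen for illustration, so check the linked header for the real code. One thing the sketch does show is a possible reason to prefer one byte per node over std::vector<bool>: bumping the counter invalidates every entry in O(1), and the O(n) memset only runs when the counter wraps.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative visited set with a generation counter.
// All names here are hypothetical, not the Faiss API.
struct VisitedSketch {
    std::vector<uint8_t> visited;  // one byte per graph node, zero-initialized
    uint8_t generation = 1;        // value meaning "visited in the current search"
    std::size_t full_clears = 0;   // number of O(n) memset passes so far

    explicit VisitedSketch(std::size_t n) : visited(n, 0) {}

    void set(std::size_t i) { visited[i] = generation; }
    bool get(std::size_t i) const { return visited[i] == generation; }

    // Call between searches: O(1) normally, O(n) when the counter wraps.
    void advance() {
        ++generation;
        if (generation == 250) {  // threshold is an assumption for this sketch
            std::memset(visited.data(), 0, visited.size());
            generation = 1;
            ++full_clears;
        }
    }
};
```

Note that constructing such a table also zero-fills all n bytes, so a table that is built once per search call pays an O(n) fill every time, independent of the periodic clear.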

shaozhixue commented 6 months ago

@alexanderguzhva Thank you for your reply. I found that OpenMP causes the problem: when I set OMP_NUM_THREADS=1 and OMP_WAIT_POLICY=PASSIVE, the problem disappeared. I don't know why, though.
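
One possible explanation, not confirmed anywhere in this thread, is that each OpenMP worker thread allocates its own zero-initialized visited table when a search starts, even when the call carries a single query. The sketch below is only a cost model of that hypothesis. The function name and the per-thread allocation pattern are assumptions, and it deliberately uses a plain loop instead of OpenMP so that it runs anywhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: every worker thread builds its own zeroed scratch table
// before the query loop, whether or not it ends up processing a query.
std::size_t bytes_zeroed_per_search(std::size_t ntotal, std::size_t num_threads) {
    std::size_t zeroed = 0;
    for (std::size_t t = 0; t < num_threads; ++t) {
        std::vector<uint8_t> table(ntotal, 0);  // O(ntotal) zero fill per thread
        zeroed += table.size();
    }
    return zeroed;
}
```

Under this model, a 5 million vector index costs about 5 MB of zero-filling per search call with one thread, and about 80 MB with 16 threads (the thread count is a made-up example). That would be consistent with memset shrinking in the profile once OMP_NUM_THREADS=1 is set, but it should be verified against the Faiss 1.7.4 search code before drawing conclusions.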