yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0
1.24k stars 114 forks source link

Is NGT::Index::search thread-safe? #117

Closed wbthomason closed 2 years ago

wbthomason commented 2 years ago

First off, thank you for NGT! It's quite impressive.

I'm trying to parallelize a large number of queries with OpenMP. My parallelized loop calls search() on a read-only index, then further processes the results. However, I seem to get nondeterministic results from this, and I'm wondering if the problem is that this method is not thread-safe.

I should also note that I've tried both with and without the shared memory allocator - I did not observe a difference in behavior.

masajiro commented 2 years ago

NGT::Index::search is thread-safe. There are cases where search returns different results for the same query, because search uses randomly select nodes from the leaf of the tree as seeds to explore the graph. However, the possibility is quite low. Also the possibility depends on the indexed dataset.

I would like to confirm that you don't find the issue when you don't use OpenMP. If so, could you provide your simplified search source code and your NGT index? I will check them.

wbthomason commented 2 years ago

Thanks! I have not been able to reproduce the issue without OpenMP, but it's possible that I'm missing something.

My simplified search code looks like:

struct Edge {
  std::size_t id;
  std::size_t neighborId;
  float distance;
};

#pragma omp declare reduction (merge: std::vector<Edge> : omp_out.insert(omp_out.end(), std::make_move_iterator(omp_in.begin()), std::make_move_iterator(omp_in.end())))

#pragma omp parallel default(none)                                             \
    shared(edges, points, numPoints, radius, indexPath)
{
  NGT::Index localIndex(indexPath, true);
#pragma omp for nowait reduction(merge : edges)
  for (std::size_t i = 0; i < points.size(); ++i) {
    const auto &point = points[i];
    NGT::SearchQuery q(point);
    NGT::ObjectDistances neighbors;
    q.setResults(&neighbors);
    // To ensure that we get everything within range
    q.setSize(numPoints);
    q.setRadius(radius);
// NOTE: Including this critical section removes the bug behavior in my testing - it just also removes the parallelism
// #pragma omp critical 
// {
    localIndex.search(q);
// }
    for (const auto &neighbor : neighbors) {
      // NGT indices start at 1; ours start at 0
      const auto neighborId = neighbor.id - 1;
      if (idx != neighborId) {
        edges.emplace_back(Edge{idx, neighborId, neighbor.distance});
      }
    }
  }
}

points is the entire set of nodes added to the NGT index; numPoints = points.size(), and radius is any number in [0.0, 1.0) - 0.33 typically works for exhibiting this behavior.

I can also provide generation code to produce points and the index, if this would be helpful.

The index: ngt_index.tar.gz

masajiro commented 2 years ago

Thank you for providing your code and index. However, I could not reproduce differences between parallelism and non-parallelism even by using points extracted from your index. It might depend on the environment. In addition, I am wondering if emplace_back is thread-safe, even though there is no problem on my environment. Apart from this issue, the opened NGT index placed outside of the parallel section can be shared.

wbthomason commented 2 years ago

Interesting! Thank you for trying to reproduce the issue. The emplace_back should be thread-safe since edges is a reduction variable (meaning that each thread should have a private copy), but perhaps something strange is happening there.