apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.46k stars 979 forks source link

Explore moving HNSW's NeighborQueue to a radix heap [LUCENE-10383] #11419

Open asfimport opened 2 years ago

asfimport commented 2 years ago

Now that @jtibshirani improved merging via #11411, nightly benchmarks report the following two top CPU consumers:

18.45%        548638        org.apache.lucene.util.VectorUtil#dotProduct()
                              at org.apache.lucene.index.VectorSimilarityFunction$2#compare()
                              at org.apache.lucene.util.hnsw.HnswGraph#search()
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode()
13.23%        393609        org.apache.lucene.util.LongHeap#upHeap()
                              at org.apache.lucene.util.LongHeap#push()
                              at org.apache.lucene.util.hnsw.NeighborQueue#add()
                              at org.apache.lucene.util.hnsw.HnswGraph#search()

Exploration at #9028 suggested that radix heaps perform better than binary heaps if the heap is large: queries with 16 clauses or more got faster while queries with fewer clauses would get slower. With a default beam width of 100, I'd be interested in seeing how HNSW indexing performs if we replace the current binary LongHeap with a radix heap.


Migrated from LUCENE-10383 by Adrien Grand (@jpountz), updated Jan 22 2022

asfimport commented 2 years ago

Adrien Grand (@jpountz) (migrated from JIRA)

I mentioned a potential indexing speedup in the issue description because indexing is what I was looking at, but this could also help search a bit since search uses this NeighborQueue too, especially with large values of k.

benwtrent commented 1 year ago

Doing some minor digging into this. From what I can tell there are two things that use the heap logic:

A radixheap COULD work for the nearest neighbor tracking as it is monotonic. But some initial testing shows that our candidate tracking isn't monotonic. We will pop candidates and add candidates and their changes will not be monotonic. Once we start removing items from the min_heap, it is only for smaller items, or when we are removing to get the top_k.

angadp commented 5 months ago

Given the comment by @benwtrent is this issue still relevant?

benwtrent commented 5 months ago

@angadp I think it's more relevant given recent refactors. A radix heap may be possible now.

angadp commented 5 months ago

Thanks! I was looking for an open issue and will check this out.

angadp commented 5 months ago

I see what you were saying earlier now, Ben! So when we add the candidate we add them using outOfOrder method which is non-montonic and then try to find the least diverse candidate. Hence we don't use a priority queue for maintaining the candidates.

The nearest neighbor tracking is monotonic.

Beginner questions here:

What was the decision behind adding these candidates outOfOrder? Can we not try to modify the method to do it inOrder and then try to find the least diverse candidate to remove?

benwtrent commented 5 months ago

What was the decision behind adding these candidates outOfOrder?

Speed, once we know things are sorted, we know they have been checked for diversity. But with any optimization, measurement is important.

If you can get the same graph results but faster by adjusting how we track diverse candidates & using a radix heap, I say go forth and benchmark :)

angadp commented 3 months ago

Sorry for the late reply! I think my initial thinking was wrong.

I'm currently implementing this and will post the code along with the benchmarks by next week.