rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU
https://rapids.ai
Apache License 2.0

[FEA] Strongly filtered CAGRA #480

Open achirkin opened 3 days ago

achirkin commented 3 days ago

CAGRA has been observed to yield low recall when filtering is enabled, especially when a large fraction of the values is filtered out. This may be related in part to #208 and #472, but there may also be fundamental reasons for the lower recall.

This feature request tracks progress and collects suggestions for enabling high-recall, strongly filtered CAGRA.
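For tracking recall under a strong filter, here is a minimal sketch of how the measurement could be set up, using only the Python standard library. It does not call cuVS or CAGRA. The helper names (`filtered_bruteforce_knn`, `postfiltered_knn`, `recall_at_k`), the synthetic data, and the 98% filter ratio are all assumptions made for illustration. The post-filtering baseline is a stand-in for "search without the filter, then drop disallowed results"; it is not how CAGRA applies filters.

```python
import random

def filtered_bruteforce_knn(dataset, query, allowed, k):
    """Exact k nearest neighbours (squared L2) among the indices in `allowed`."""
    def sqdist(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))
    # Ties are broken by index so results are deterministic.
    ranked = sorted(allowed, key=lambda i: (sqdist(dataset[i]), i))
    return ranked[:k]

def postfiltered_knn(dataset, query, allowed, k, n_candidates):
    """Hypothetical baseline: unfiltered top-n_candidates, then drop disallowed."""
    top = filtered_bruteforce_knn(dataset, query, range(len(dataset)), n_candidates)
    return [i for i in top if i in allowed][:k]

def recall_at_k(found, truth):
    """Fraction of ground-truth neighbours that appear in `found`."""
    if not truth:
        return 1.0
    return len(set(found) & set(truth)) / len(truth)

# Synthetic data (assumption): 2000 random 8-d vectors, 98% filtered out.
rng = random.Random(0)
dataset = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]
allowed = set(rng.sample(range(len(dataset)), 40))

truth = filtered_bruteforce_knn(dataset, query, allowed, k=10)
approx = postfiltered_knn(dataset, query, allowed, k=10, n_candidates=64)
recall = recall_at_k(approx, truth)
print(f"recall@10 with 98% filtered out: {recall:.2f}")
```

With a fixed candidate budget, only a handful of the unfiltered candidates are expected to pass such a filter, so the baseline typically returns fewer than `k` valid neighbours. The same `recall_at_k` check can be pointed at any filtered search result to compare tweaks against the exact filtered ground truth.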

As an experiment, I suggest trying the following tweaks, enabled by a boolean search parameter:

achirkin commented 3 days ago

Related: BFKNN as a strongly-filtered CAGRA replacement https://github.com/rapidsai/cuvs/issues/252