rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU
https://rapids.ai
Apache License 2.0

[QST] Is there a two_pass_precision mode for brute force knn? #333

Open phact opened 2 months ago

phact commented 2 months ago

My project uses RAFT brute force knn, and I noticed a drop in precision when I upgraded to the latest RAFT. I moved to cuVS but still see cosine similarity that's off by more than 1e-04 when compared to the dot product computed on the CPU. Is this a bad use case for cuVS?

I had this issue before when I was using cuML, and two_pass_precision fixed it. Unfortunately, it also suffered from a different correctness bug: https://github.com/rapidsai/cuml/issues/5569. I would appreciate any suggestions.

cjnolet commented 2 months ago

Thanks for creating an issue about this, @phact. Can you share a little more info about how you are using this? Are you using the Python API? What precision is your data (float or double)? If it's not too hard to provide a trivial reproducible example, that would be helpful.

Sometimes this can be caused by multiple sources of small precision errors throughout the computational stack: the inner product can accumulate small errors, and follow-on arithmetic can then make things slightly worse. We will work to get this fixed if you can help us understand more.
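
To make the accumulation effect concrete, here is a minimal CPU-only sketch that computes the same cosine similarity twice, once with every intermediate held in float32 and once in float64, and reports the gap. It uses only NumPy. The sequential loop, the helper name `cosine_similarity`, the vector length, and the random data are all assumptions for illustration; a GPU kernel reduces in a different order, so the size of the error will not match what cuVS produces.

```python
import numpy as np

def cosine_similarity(a, b, dtype):
    """Cosine similarity with every intermediate kept in `dtype` (illustrative helper)."""
    a = a.astype(dtype)
    b = b.astype(dtype)
    dot = dtype(0)
    norm_a = dtype(0)
    norm_b = dtype(0)
    # Naive sequential accumulation: each add rounds to `dtype`,
    # so rounding errors build up across the inner product.
    for x, y in zip(a, b):
        dot += x * y
        norm_a += x * x
        norm_b += y * y
    # Follow-on arithmetic (sqrt, multiply, divide) rounds again.
    return dot / (np.sqrt(norm_a) * np.sqrt(norm_b))

rng = np.random.default_rng(0)
dim = 1536  # assumed embedding size, not taken from the issue
a = rng.standard_normal(dim).astype(np.float32)
b = rng.standard_normal(dim).astype(np.float32)

sim32 = cosine_similarity(a, b, np.float32)
sim64 = cosine_similarity(a, b, np.float64)
abs_err = abs(float(sim32) - float(sim64))
print(f"float32: {float(sim32):.9f}  float64: {float(sim64):.9f}  abs error: {abs_err:.3e}")
```

Running the same comparison against the distances returned by the GPU search, with the float64 value as the reference, is one way to tell ordinary float32 rounding apart from a larger discrepancy.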

phact commented 2 months ago

Thanks for the reply @cjnolet

Yes, it's the Python API using float32 (I checked, and it looks like double isn't supported).

I'll take a stab at a minimal reproducible example. Is there a plan to have two modes (one fast/lossy and one for high precision)?
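
For reference, one way a fast/lossy pass and a high-precision pass can be combined is to select an oversampled candidate set in float32 and then recompute only those candidates in float64. The sketch below does this on the CPU with NumPy. It is not the cuML `two_pass_precision` implementation and not a cuVS API; the function name `knn_cosine_two_pass` and the `oversample` parameter are hypothetical, and it assumes no input row is all zeros.

```python
import numpy as np

def knn_cosine_two_pass(dataset, queries, k, oversample=4):
    """Hypothetical helper: float32 candidate search, then float64 re-ranking."""
    n_candidates = min(dataset.shape[0], k * oversample)

    # Pass 1 (fast, lossy): float32 cosine similarity for every query/row pair.
    d32 = dataset.astype(np.float32)
    q32 = queries.astype(np.float32)
    d32 = d32 / np.linalg.norm(d32, axis=1, keepdims=True)
    q32 = q32 / np.linalg.norm(q32, axis=1, keepdims=True)
    sims32 = q32 @ d32.T
    # Unordered indices of the n_candidates most similar rows per query.
    cand = np.argpartition(-sims32, n_candidates - 1, axis=1)[:, :n_candidates]

    # Pass 2 (slow, precise): recompute only the candidates in float64.
    d64 = dataset.astype(np.float64)
    q64 = queries.astype(np.float64)
    d64 = d64 / np.linalg.norm(d64, axis=1, keepdims=True)
    q64 = q64 / np.linalg.norm(q64, axis=1, keepdims=True)
    sims64 = np.einsum("qd,qcd->qc", q64, d64[cand])

    # Keep the k best candidates, ordered by the float64 similarity.
    order = np.argsort(-sims64, axis=1)[:, :k]
    indices = np.take_along_axis(cand, order, axis=1)
    similarities = np.take_along_axis(sims64, order, axis=1)
    return indices, similarities

rng = np.random.default_rng(0)
dataset = rng.standard_normal((200, 32)).astype(np.float32)
queries = rng.standard_normal((5, 32)).astype(np.float32)
indices, similarities = knn_cosine_two_pass(dataset, queries, k=3)
print(indices.shape, similarities.dtype)
```

The second pass touches only `k * oversample` rows per query, so its cost stays small compared with the full search, while the returned similarities carry float64 rounding error instead of float32.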