mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0

Add binary (single bit) KNN quantization option to `knnPerfTest.py` #317

Open mikemccand opened 2 weeks ago

mikemccand commented 2 weeks ago

Currently `knnPerfTest.py` can test int4 and int7 KNN quantization (and float32, i.e. no quantization). Lucene also supports "simple binary quantization", though it's not so turnkey because the user must pre-quantize their vectors to a single bit per dimension, in byte[] vector form.
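For illustration, here is a minimal sketch of what "pre-quantize to a single bit per dimension" could look like on the Python side. Everything in it is an assumption rather than what Lucene, OpenSearch, or `knnPerfTest.py` actually does: the helper names `binarize` and `hamming_distance` are hypothetical, the threshold of 0.0 (sign-based quantization) is just one common choice, and the most-significant-bit-first packing order is arbitrary.

```python
def binarize(vector, threshold=0.0):
    """Pack a float vector into bytes, one bit per dimension.

    Hypothetical helper: a dimension becomes 1 when its value is strictly
    greater than `threshold`, else 0. Bits are packed most-significant-bit
    first; the last byte is zero-padded if the dimension count is not a
    multiple of 8.
    """
    packed = bytearray((len(vector) + 7) // 8)
    for i, value in enumerate(vector):
        if value > threshold:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def hamming_distance(a, b):
    """Count differing bits between two equal-length packed vectors."""
    if len(a) != len(b):
        raise ValueError("packed vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    doc = binarize([0.5, -0.1, 0.0, 2.0, -3.0, 1.0, 1.0, -1.0, 0.3])
    query = binarize([0.4, -0.2, 0.1, 1.5, -2.0, 0.9, -0.5, -1.0, 0.2])
    print(doc.hex(), query.hex(), hamming_distance(doc, query))
```

A 768-dimension float32 vector (3072 bytes) shrinks to 96 bytes this way, a 32x reduction. How much recall survives depends heavily on the embedding model, which is exactly what the benchmark would need to measure.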

OpenSearch has such a binary quantizer to run outside of Lucene -- maybe we can poach that here? Or maybe improve Lucene so it can do this quantizing under the hood?

Lucene now also (soon?) has a RaBitQ-inspired "better binary quantizer", which we should also add to `knnPerfTest.py`! So hard to keep up...