We have this awesome tool to measure Lucene's KNN performance (indexing time, index size, searching time, recall vs brute force, etc.).
It's been really helpful in uncovering issues (e.g. broken int8 quantization) and measuring the tradeoffs of different quantization approaches.
It has limitations (e.g. it cannot yet test simple 1-bit quantization or better binary quantization (BBQ), and it doesn't report "effective hot searching RAM"), but it's already very useful. So let's just run it in nightly benchy (lay the spider web) and make pretty charts, and hey, maybe we catch a fly at some point (e.g. if recall surprisingly drops).