harsha-simhadri / big-ann-benchmarks

Framework for evaluating ANNS algorithms on billion scale datasets.
https://big-ann-benchmarks.com
MIT License

Zilliz sparse solution #269

Closed by hhy3 5 months ago

hhy3 commented 5 months ago

Expected result: ~8k QPS at 90% recall on the private queries (sparse-full_private2).

Our solution is based on an HNSW graph with some optimizations for sparse vectors. First, the data values and indices are quantized to 8 bits and 16 bits respectively. Second, some low-importance elements are pruned to reduce computational cost. Finally, highly efficient SIMD code for the sparse vector inner product is used to accelerate distance computation.
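
For readers unfamiliar with these three steps, here is a minimal Python sketch of the idea. It is not the code from this PR: the function names, the magnitude-based pruning rule, the `scale` parameter, and the assumption of non-negative weights with indices below 65536 are all illustrative choices, and the scalar merge loop only stands in for the SIMD kernel.

```python
from array import array


def prune(indices, values, drop_ratio):
    """Drop the smallest-magnitude entries (assumed pruning rule), keeping index order."""
    keep = len(values) - int(len(values) * drop_ratio)
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:keep]
    order.sort(key=lambda i: indices[i])
    return [indices[i] for i in order], [values[i] for i in order]


def quantize(indices, values, scale):
    """Store indices as uint16 and values as uint8 (non-negative weights assumed)."""
    q_idx = array("H", indices)
    q_val = array("B", (min(255, round(v / scale)) for v in values))
    return q_idx, q_val


def sparse_dot(a_idx, a_val, b_idx, b_val):
    """Integer inner product of two index-sorted sparse vectors via a two-pointer merge."""
    i = j = 0
    acc = 0
    while i < len(a_idx) and j < len(b_idx):
        if a_idx[i] == b_idx[j]:
            acc += a_val[i] * b_val[j]
            i += 1
            j += 1
        elif a_idx[i] < b_idx[j]:
            i += 1
        else:
            j += 1
    return acc


# Hypothetical document and query vectors, given as (sorted indices, weights).
doc_idx, doc_val = prune([1, 5, 9, 20], [0.5, 0.1, 2.0, 1.0], drop_ratio=0.25)
d_idx, d_val = quantize(doc_idx, doc_val, scale=0.01)
q_idx, q_val = quantize([5, 9, 30], [1.0, 0.5, 0.25], scale=0.01)

# Multiply by both scales to get back an approximate float score.
score = sparse_dot(d_idx, d_val, q_idx, q_val) * 0.01 * 0.01
print(score)
```

In this toy example the pruned entry (index 5, weight 0.1) no longer contributes to the score, which is the accuracy-for-speed trade-off that pruning makes.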

ingberam commented 5 months ago

The code builds and is running now. Note that we will not evaluate the hidden queries, only the public ones. I will update with results.

hhy3 commented 5 months ago

@ingberam Thanks!

ingberam commented 5 months ago

Results on the public query set:

zilliz,zilliz_qdrop0.12_ef45,sparse-full,10,10224.100656060828,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.8991547277936963
zilliz,zilliz_qdrop0.11_ef45,sparse-full,10,9933.14687744289,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9015902578796562
zilliz,zilliz_qdrop0.12_ef50,sparse-full,10,9420.170672415596,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9065186246418339
zilliz,zilliz_qdrop0.12_ef55,sparse-full,10,8880.665577270343,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9127793696275072
zilliz,zilliz_qdrop0.11_ef55,sparse-full,10,8717.7258822545,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9146848137535816
zilliz,zilliz_qdrop0.12_ef60,sparse-full,10,8306.030821940156,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9173495702005731
zilliz,zilliz_qdrop0.12_ef65,sparse-full,10,7955.247596747603,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9215616045845272
zilliz,zilliz_qdrop0.11_ef65,sparse-full,10,7816.38584582656,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9232091690544413
zilliz,zilliz_qdrop0.12_ef70,sparse-full,10,7562.167138630923,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9253008595988538
zilliz,zilliz_qdrop0.11_ef70,sparse-full,10,7432.666935949251,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9268194842406876
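
To read off the fastest configuration that still reaches 90% recall, a small script along these lines could be used. It is only a sketch: the rows have no header, so treating the fifth field as QPS and the last field as recall is an assumption based on the values, and `best_config` is a hypothetical helper, not part of the benchmark framework.

```python
import csv
import io

# Rows copied from the results above.
RESULTS = """\
zilliz,zilliz_qdrop0.12_ef45,sparse-full,10,10224.100656060828,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.8991547277936963
zilliz,zilliz_qdrop0.11_ef45,sparse-full,10,9933.14687744289,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9015902578796562
zilliz,zilliz_qdrop0.12_ef50,sparse-full,10,9420.170672415596,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9065186246418339
zilliz,zilliz_qdrop0.12_ef55,sparse-full,10,8880.665577270343,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9127793696275072
zilliz,zilliz_qdrop0.11_ef55,sparse-full,10,8717.7258822545,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9146848137535816
zilliz,zilliz_qdrop0.12_ef60,sparse-full,10,8306.030821940156,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9173495702005731
zilliz,zilliz_qdrop0.12_ef65,sparse-full,10,7955.247596747603,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9215616045845272
zilliz,zilliz_qdrop0.11_ef65,sparse-full,10,7816.38584582656,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9232091690544413
zilliz,zilliz_qdrop0.12_ef70,sparse-full,10,7562.167138630923,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9253008595988538
zilliz,zilliz_qdrop0.11_ef70,sparse-full,10,7432.666935949251,0.0,23708.367114067078,13843116.0,0.0,0.0,sparse,0.9268194842406876
"""


def best_config(text, min_recall):
    """Return (qps, config, recall) for the highest-QPS row meeting min_recall, or None."""
    rows = csv.reader(io.StringIO(text))
    # Assumed layout: field 2 = config name, field 5 = QPS, last field = recall.
    ok = [(float(r[4]), r[1], float(r[-1])) for r in rows if r and float(r[-1]) >= min_recall]
    return max(ok) if ok else None


qps, name, recall = best_config(RESULTS, 0.9)
print(name, round(qps), round(recall, 4))
```

Under that column assumption, the `ef45` row with `qdrop0.12` falls just short of 90% recall, so the selected row is `zilliz_qdrop0.11_ef45` at roughly 9.9k QPS.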

hhy3 commented 5 months ago

Looks good. Thanks!

ingberam commented 5 months ago

@hhy3 Please fix the conflicts in .github/workflows/neurips23.yml so that we can merge.

hhy3 commented 5 months ago

Fixed