castorini / anserini

Anserini is a Lucene toolkit for reproducible information retrieval research
http://anserini.io/
Apache License 2.0

Initial implementation of flat vector search #2510

Closed. lintool closed this 3 months ago

lintool commented 3 months ago

The following works:

python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.flat
python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat
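The commands above differ only in the BEIR dataset name, so the list can be regenerated rather than maintained by hand. The following is a minimal sketch, not part of the PR: the helper names (`regression_name`, `build_command`, `build_commands`) are hypothetical, and the script only prints the command lines shown above without running them.

```python
# Sketch only: rebuilds the command lines listed above; it does not execute them.
# The helper names below are illustrative and are not part of Anserini.

# Dataset names taken from the commands listed above.
DATASETS = [
    "arguana", "bioasq", "climate-fever",
    "cqadupstack-android", "cqadupstack-english", "cqadupstack-gaming",
    "cqadupstack-gis", "cqadupstack-mathematica", "cqadupstack-physics",
    "cqadupstack-programmers", "cqadupstack-stats", "cqadupstack-tex",
    "cqadupstack-unix", "cqadupstack-webmasters", "cqadupstack-wordpress",
    "dbpedia-entity", "fever", "fiqa", "hotpotqa", "nfcorpus", "nq",
    "quora", "robust04", "scidocs", "scifact", "signal1m",
    "trec-covid", "trec-news", "webis-touche2020",
]


def regression_name(dataset: str) -> str:
    """Return the regression identifier used in the commands above."""
    return f"beir-v1.0.0-{dataset}.bge-base-en-v1.5.flat"


def build_command(dataset: str) -> str:
    """Return one command line, including the log redirection."""
    name = regression_name(dataset)
    return (
        "python src/main/python/run_regression.py --index --verify --search "
        f"--regression {name} >& logs/log.{name}"
    )


def build_commands(datasets=DATASETS) -> list:
    """Return the command lines for all datasets, in order."""
    return [build_command(d) for d in datasets]


if __name__ == "__main__":
    for command in build_commands():
        print(command)
```

The printed lines can be pasted into a shell or written to a script file; whether to run them sequentially or in parallel is left to the reader.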

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 86.09023%, with 37 lines in your changes missing coverage. Please review.

Project coverage is 67.07%. Comparing base (d721fc8) to head (d4a8588). Report is 1 commit behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...ain/java/io/anserini/search/FlatDenseSearcher.java | 75.30% | 17 Missing and 3 partials :warning: |
| ...nserini/index/codecs/AnseriniFlatVectorFormat.java | 75.60% | 8 Missing and 2 partials :warning: |
| ...ava/io/anserini/search/SearchFlatDenseVectors.java | 94.18% | 4 Missing and 1 partial :warning: |
| ...c/main/java/io/anserini/index/AbstractIndexer.java | 0.00% | 1 Missing :warning: |
| .../java/io/anserini/index/IndexFlatDenseVectors.java | 98.21% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@             Coverage Diff              @@
##             master    #2510      +/-   ##
============================================
+ Coverage     66.66%   67.07%   +0.41%
- Complexity     1432     1469      +37
============================================
  Files           214      218       +4
  Lines         12319    12585     +266
  Branches       1507     1523      +16
============================================
+ Hits           8213     8442     +229
- Misses         3587     3618      +31
- Partials        519      525       +6
```
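As a sanity check, the headline figures in this report can be reproduced from the numbers it lists. The sketch below is an illustration, not Codecov's documented method. It assumes that the patch accounts for the whole change in lines and hits between base and head, and that the project percentages are truncated rather than rounded to two decimals; both assumptions happen to match the reported values here.

```python
# Figures copied from the Codecov comment; the derivation itself is an assumption.

# Missing + partial lines per file, in the order of the per-file listing.
uncovered_per_file = [17 + 3, 8 + 2, 4 + 1, 1, 1]
uncovered = sum(uncovered_per_file)

base_hits, base_lines = 8213, 12319
head_hits, head_lines = 8442, 12585


def truncated_percent(hits: int, lines: int) -> float:
    """Coverage percentage truncated (not rounded) to two decimals."""
    return (hits * 10000 // lines) / 100


# Assumption: the patch is exactly the delta between base and head.
patch_lines = head_lines - base_lines
patch_hits = head_hits - base_hits
patch_coverage = f"{100 * patch_hits / patch_lines:.5f}"

if __name__ == "__main__":
    print("uncovered lines:", uncovered)
    print("patch coverage :", patch_coverage)
    print("base coverage  :", truncated_percent(base_hits, base_lines))
    print("head coverage  :", truncated_percent(head_hits, head_lines))
```

Under these assumptions the uncovered lines in the patch (lines minus hits) equal the sum of the per-file missing and partial counts, which is the 37 quoted in the summary.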
