mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0

Enable indexing/searching on pre-computed .vec files like those produced by infer_token_vectors_cohere.py #272

Closed (mikemccand closed this 2 months ago)

mikemccand commented 5 months ago

I added command-line options to Indexer and SearchPerfTest that take a .vec file and its dimension, so they can index and search precomputed vectors, e.g. those produced by infer_token_vectors_cohere.py. I've also successfully run that tool to create the .vec files (I'll upload these to home.apache.org soon).
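
For anyone who wants to sanity-check a .vec file before indexing it, here is a minimal sketch of a reader. It assumes the file is a flat sequence of little-endian float32 values with no header, `dim` values per vector; that layout is an assumption here, not something this issue states, so verify it against the script that wrote the file. The function name `read_vec_file` and its parameters are hypothetical and are not part of Indexer or SearchPerfTest.

```python
import struct
from pathlib import Path


def read_vec_file(path, dim, max_vectors=None):
    """Read vectors from a flat binary file.

    ASSUMPTION: the file holds raw little-endian float32 values with no
    header, `dim` values per vector. This helper is hypothetical.
    """
    bytes_per_vector = dim * 4  # 4 bytes per float32
    size = Path(path).stat().st_size
    if size % bytes_per_vector != 0:
        # A size mismatch usually means the wrong dimension was given.
        raise ValueError(
            f"file size {size} is not a multiple of {bytes_per_vector} bytes "
            f"(dim={dim})"
        )

    count = size // bytes_per_vector
    if max_vectors is not None:
        count = min(count, max_vectors)

    fmt = f"<{dim}f"  # little-endian, dim float32 values
    vectors = []
    with open(path, "rb") as f:
        for _ in range(count):
            vectors.append(struct.unpack(fmt, f.read(bytes_per_vector)))
    return vectors
```

The size check is the useful part: since a headerless file cannot record its own dimension, passing the wrong `dim` would otherwise silently produce garbage vectors.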

I'm currently trying to run a simple A/A benchmark that indexes and searches these vectors, to test the end-to-end path of these changes. If that works well, then once we get this merged, I'll turn these on in nightlies ...

I'm not sure what to expect about these Cohere vectors vs the vectors the nightlies now use. How exactly are they different?