mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0
205 stars 115 forks source link

Updating vector benchmarking utilities #214

Closed benwtrent closed 1 year ago

benwtrent commented 1 year ago

These are some updates I needed to make to build the vectors for testing and then run knnPerfTest

benwtrent commented 1 year ago

@msokolov ping, as you originally added these facilities.

msokolov commented 1 year ago

It's sort of annoying that this class is in test - it's really more of a utility, and having it in test means it isn't tested! And I have definitely created bugs in it at various times - it's pretty complex. I wonder if we should move it to the benchmark package? Although everything in there is kind of its own thing - or maybe into luceneutil?

mikemccand commented 1 year ago

having it in test means it isn't tested!

Who tests the tester!

benwtrent commented 1 year ago

maybe into luceneutil?

@msokolov it does seem like KnnGraphTester should be within luceneutil. The only thing that utilizes KnnGraphTester is luceneutil and it is the defacto nightly performance test suite for Lucene.

If we were to put it into benchmarks, we should also add a benchmark that utilizes it.

KnnGraphTester can also give us recall information vs. bruteforce NN, which would catch regression in quality, not just performance.

Maybe I just talked myself out of putting into luceneutil... 🤦

msokolov commented 1 year ago

I think we can probably close this now since KnnGraphTester was moved, although now, looking again, I see there are some changes to enable computing byte vectors that we would want to keep. If you have time, do you mind updating this?

benwtrent commented 1 year ago

If you have time, do you mind updating this?

@msokolov you got it dude! I will see about resolving and updating this week.

benwtrent commented 1 year ago

@msokolov updated