mikemccand / luceneutil

Various utility scripts for running Lucene performance tests
Apache License 2.0
205 stars 115 forks source link

Can we re-enable strict checking for KNN queries? #201

Open mikemccand opened 2 years ago

mikemccand commented 2 years ago

A while back we discovered that KNN was producing non-deterministic results even on a deterministic index, and disabled strict top N hit checking for KNNVectorQuery.

We think/thought this was because the "deterministically random" source was seeded from each segment's GUID, but we've since fixed this query to use a fixed random seed (42 of course).

Can we now re-enable strict checking of the top KNN hits?

jpountz commented 2 years ago

I think there was an alternative proposal of tracking recall over time instead of checking that hits are strictly the same.

zhaih commented 2 years ago

I believe it's better to track recall overtime rather than strictly checking the hit, because KNN is an approximation itself so it is highly possible we'll able to improve recall by tuning the algorithm or implementation over the time.

Also it'd be nice if we can track recall over the time along with the performance.

rmuir commented 2 years ago

everything in search is an approximation: BM25, etc etc. There's absolutely no reason to give KNN some kind of free pass to leniency park. Leniency isn't going to help anything here.

msokolov commented 2 years ago

I think the idea is to track "recall" over time (ie how close the approximation is) so we can get an idea how the algorithm improves/degrades. We could also have strict checking, but sometimes we will change the results, and allow it to go forward anyway, and we still want to track recall. I posted a hinky chart in my ApacheCon talk: http://falutin.net/2022/10/11/apachecon-2022/ see slide #20 - it shows the recall as a blue line with arrows on top of the underlying QPS graph

On Sat, Oct 8, 2022 at 7:13 AM Robert Muir @.***> wrote:

everything in search is an approximation: BM25, etc etc. There's absolutely no reason to give KNN some kind of free pass to leniency park. Leniency isn't going to help anything here.

— Reply to this email directly, view it on GitHub https://github.com/mikemccand/luceneutil/issues/201#issuecomment-1272295116, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHHUQOS76YT6NWI7WUCM7LWCFJOJANCNFSM6AAAAAAQZVXVMM . You are receiving this because you are subscribed to this thread.Message ID: @.***>

rmuir commented 2 years ago

we could track recall/precision/MAP for BM25 scoring too, but we don't. we are strict and it gives some confidence that scoring is working correctly: hasn't changed unless we intended it to. KNN isn't any different/special here.

mikemccand commented 2 years ago

It's OK for the KNN results to change -- that's just needs a nightly regolding as long as we know/accept the source/reason for the change.

rmuir commented 2 years ago

there's also no harm in tracking relevance, if its not difficult to do. I just think we should stay "strict"