Closed navneet1v closed 3 weeks ago
Checking why CIs are failing
CIs were failing because the code was skipping some docs during live docs counting. Now the logic is fixed. Thanks @jmazanec15 for help in debugging.
LGTM
waiting on CIs to complete.
Description
Fix force merge failures with Quantization when a segment has deleted docs in it.
Issue:
When documents are deleted from a segment and the segments are force merged, the FlatVectorValues are merged into a single vectorValues. However, vectorvalue.getLiveDocs() does not provide the correct live docs; it provides the cost, which includes the deleted docs too. Because of this, quantization tries to read vectors that are no longer present, which leads to an NPE.
With this change, we iterate over the vector values to find the correct live docs and then pass that value to Quantization and NativeIndexWriter.
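The difference between the reported cost and the number of docs that actually survive the merge can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: VectorDocIterator, MergedVectors and countLiveDocs are hypothetical stand-ins that only mimic the shape of a doc-id iterator, and the doc ids are made-up sample data.

```java
// Hypothetical sketch, NOT the Lucene or k-NN API. It only illustrates why
// counting by iteration differs from cost() once docs have been deleted.
final class LiveDocsCountSketch {
    // Sentinel returned when the iterator is exhausted (assumed convention).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical iterator over the docs that still have a vector.
    interface VectorDocIterator {
        int nextDoc();  // next live doc id, or NO_MORE_DOCS when exhausted
        long cost();    // upper bound on the doc count; may include deleted docs
    }

    // Toy implementation backed by an explicit list of live doc ids.
    static final class MergedVectors implements VectorDocIterator {
        private final int[] liveDocIds;
        private final long cost;
        private int pos = 0;

        MergedVectors(int[] liveDocIds, long cost) {
            this.liveDocIds = liveDocIds;
            this.cost = cost;
        }

        @Override
        public int nextDoc() {
            return pos < liveDocIds.length ? liveDocIds[pos++] : NO_MORE_DOCS;
        }

        @Override
        public long cost() {
            return cost;
        }
    }

    // Count live docs by walking the iterator instead of trusting cost().
    static long countLiveDocs(VectorDocIterator it) {
        long count = 0;
        while (it.nextDoc() != NO_MORE_DOCS) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample data: 5 docs were indexed, docs 1 and 3 were deleted before the merge.
        VectorDocIterator values = new MergedVectors(new int[] {0, 2, 4}, 5);
        System.out.println("cost = " + values.cost());
        System.out.println("live docs = " + countLiveDocs(values));
    }
}
```

Note that counting this way consumes the iterator, so in this sketch a fresh iterator would be needed before the vectors are read again for quantization or index building.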
Testing
I added an IT that simulates this bug, but due to some race conditions it does not reproduce it 100% of the time. I will follow up on fixing the IT and making it stricter.
Manual Testing
For now, I tested manually with the flow below, which always produces exceptions without this fix.
1. Index data.
2. Index data again to simulate deletes.
3. Do a force merge.
4. Check the logs to see the errors.
5. Do a search; it fails without this fix.
Issue
https://github.com/opensearch-project/k-NN/issues/1949
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.