Open syang1997 opened 1 month ago
Just now I backed up this vector to another cluster, and the exception can still be reproduced. After deleting the collection from the backup cluster, I backed up and restored it again, and it cannot be reproduced.
/assign @zhagnlu please help to take a look /unassign
This is not caused by scalar filtering preventing HNSW from traversing its layers, because another set of data with scalar filtering does return data after multiple queries. No data operation was performed during this period.
This collection configures replicas. Is it caused by index differences between replicas?
What is the difference between the upper and the bottom search?
@zhagnlu There is no difference, multiple requests return completely different results
If you use a plain query instead of hybrid search, do multiple requests still return completely different results?
Vector search and query each return normally.
@zhagnlu Another phenomenon is that some searches return nothing at all once scalar filtering is added, even though the vector search alone returns results and rows matching the scalar filter exist. Hybrid search returns an empty result.
Okay, so the issue here is that if you query with an expr filter on scalar fields, the results are incorrect or inconsistent (not expected), but if you query without filtering, the results are always consistent (expected). Am I right? /assign @cydrain @liliu-z could you please also help to take a look
@yanliang567 Yes, sometimes the returned results are inconsistent, and sometimes they are incorrect. It appears only with the HNSW index plus scalar filtering.
Regenerated debug log milvus-log (2).tar.gz
@yanliang567 @cydrain @liliu-z Can you help us check together?
@syang Could you please tell us the filter_rate and index building parameters? The current open source Milvus may have “less than top-k search results” problems with high filter_rates (70-90%) and low M.
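For reference, the filter rate asked about here can be estimated from two row counts. This is a minimal sketch that assumes filter_rate means the fraction of rows excluded by the scalar expression (the thread does not define it explicitly). The function name and the counts are hypothetical and not part of any Milvus API.

```python
def filter_rate(total_rows: int, matching_rows: int) -> float:
    """Fraction of rows excluded by the scalar filter (assumed definition)."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if not 0 <= matching_rows <= total_rows:
        raise ValueError("matching_rows must be between 0 and total_rows")
    return 1.0 - matching_rows / total_rows


# Hypothetical counts: 20,000 rows in the collection, 4,000 match the expression.
rate = filter_rate(20_000, 4_000)
print(f"filter rate: {rate:.0%}")
```

In practice the two counts would come from the row count of the collection and the number of rows returned by a scalar-only query with the same expression.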
Most searches are normal, and now the M value is not small
@alwayslove2013 This collection has fewer than 20,000 rows, but the M value and efConstruction are large enough (I think). I know about the data island problem that arises when scalar filtering and HNSW work together, and I have previously investigated and adjusted the index construction parameters.
Hi @syang1997,
Can you share your script to reproduce this issue?
I'm coding a demo to replicate this issue
Hi @syang1997,
One more question: I see you're using Milvus v2.3.15. Have you tried Milvus v2.4.x?
@syang1997 I think we need data to reproduce this issue. @cydrain please set up a meeting with syang to see if we can get some data to reproduce it.
We have already communicated with the community once, and the preliminary reason is still the previous data island problem
@xiaofan-luan The phenomenon is that nothing is returned at all, rather than fewer than topK results, so we suspect that the HNSW first-layer nodes are all filtered out.
After discussion, it seems the cause might be that the filter removes 70-80% of the data, which breaks the connectivity of the HNSW graph.
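To make the suspected mechanism concrete, here is a toy sketch of how a high filter rate combined with a low M can cut a graph walk off from matching rows. It is not the Milvus or knowhere implementation: it assumes a single-layer ring graph where each node links to its M nearest ids on each side, and a naive traversal that never expands filtered-out nodes. All names are hypothetical.

```python
from collections import deque


def ring_neighbors(node, n, m):
    """Neighbors of `node` in a toy ring graph: the m nearest ids on each side."""
    return [(node + d) % n for d in range(-m, m + 1) if d != 0]


def reachable_matches(n, m, passing, entry):
    """Breadth-first walk from `entry` that only expands nodes passing the filter.

    The entry node is always expanded, even if it is filtered out itself.
    Returns the set of passing nodes the walk can reach.
    """
    seen = {entry}
    queue = deque([entry])
    found = set()
    while queue:
        node = queue.popleft()
        if node in passing:
            found.add(node)
        elif node != entry:
            continue  # filtered nodes are dead ends in this naive strategy
        for nb in ring_neighbors(node, n, m):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return found


passing = {0, 1, 2, 10, 11}  # 15 of 20 nodes are filtered out (75%)
low_m = reachable_matches(20, 2, passing, entry=0)
high_m = reachable_matches(20, 8, passing, entry=0)
bad_entry = reachable_matches(20, 2, passing, entry=5)
print(sorted(low_m), sorted(high_m), sorted(bad_entry))
```

With these inputs, M=2 reaches only nodes 0, 1 and 2, so the matches at 10 and 11 form an island. M=8 reaches all five matches. Starting from entry 5 with M=2 reaches nothing, which resembles the "no results at all" symptom described in this thread.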
This fix will be released with 2.4.7
/assign @yanliang567
Can it be merged to version 2.3.x? @yanliang567 @liliu-z
I don't think so, as 2.3.20 could be the last release of 2.3.x
Okay, we will choose to upgrade to 2.4.x later
Is there an existing issue for this?
Environment
Current Behavior
Hybrid search returns no data, but a separate scalar query does return data, so rows matching the scalar filter exist.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
milvus-log (1).tar.gz
Anything else?
No response