milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: RangeSearch with distances that hit normal search results but get empty results #32608

Open ThreadDao opened 5 months ago

ThreadDao commented 5 months ago

Is there an existing issue for this?

Environment

- Milvus version: master-20240424-dcc15e3e
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

  1. create collection -> insert 12k-128d data -> flush -> index -> load

  2. normal search using sp, _ := entity.NewIndexSCANNSearchParam(8, 20) with nq=1, limit=10. The distances of the 10 results are:

    2024/04/25 16:24:47 search_test.go:1477: 38.364773
    2024/04/25 16:24:47 search_test.go:1477: 38.210598
    2024/04/25 16:24:47 search_test.go:1477: 38.128235
    2024/04/25 16:24:47 search_test.go:1477: 38.010742
    2024/04/25 16:24:47 search_test.go:1477: 37.96506
    2024/04/25 16:24:47 search_test.go:1477: 37.63387
    2024/04/25 16:24:47 search_test.go:1477: 37.603
    2024/04/25 16:24:47 search_test.go:1477: 37.5939
    2024/04/25 16:24:47 search_test.go:1477: 37.502533
    2024/04/25 16:24:47 search_test.go:1477: 37.456852
  3. range search with the same queryVec, radius 0, and rangeFilter 50, but it returns an empty result:

    // verify error nil, output all fields, range score
    sp.AddRadius(0)
    sp.AddRangeFilter(50)
    2024/04/25 16:24:47 milvus_client.go:14: (ApiRequest): func [Search], args: [context.Background.WithDeadline(2024-04-25 16:26:37.408042483 +0800 CST m=+120.004002689 [1m49.885803453s]) aPQK []  [*] nq=1 floatVec IP 10 0xc000120ac8 []]
    2024/04/25 16:24:47 milvus_client.go:21: (ApiResponse): func [Search], results: [[{0 <nil> 0xc00026dc20 [] [] extra output fields [int64 float floatVec json] found and result does not dynamic field}]]
    2024/04/25 16:24:47 search_test.go:1486: result count of nq[0] is 0
  4. go-case:

    func TestRangeSearchDebug(t *testing.T) {
        t.Parallel()
        for _, metricType := range []entity.MetricType{entity.IP} {
            ctx := createContext(t, time.Second*common.DefaultTimeout)
            // connect
            mc := createMilvusClient(ctx, t)
            // create collection
            cp := CollectionParams{CollectionFieldsType: Int64FloatVecJSON, AutoID: false, EnableDynamicField: true,
                ShardsNum: common.DefaultShards, Dim: common.DefaultDim}
            collName := createCollection(ctx, t, mc, cp)
            // insert
            dp := DataParams{CollectionName: collName, PartitionName: "", CollectionFieldsType: Int64FloatVecJSON,
                start: 0, nb: common.DefaultNb * 4, dim: common.DefaultDim, EnableDynamicField: true, WithRows: false}
            _, _ = insertData(ctx, t, mc, dp)
            mc.Flush(ctx, collName, false)
            // create scann index
            indexScann, _ := entity.NewIndexSCANN(metricType, 16, false)
            err := mc.CreateIndex(ctx, collName, common.DefaultFloatVecFieldName, indexScann, false)
            common.CheckErr(t, err, true)
            // describe index
            indexes, _ := mc.DescribeIndex(ctx, collName, common.DefaultFloatVecFieldName)
            expIndex := entity.NewGenericIndex(common.DefaultFloatVecFieldName, entity.SCANN, indexScann.Params())
            common.CheckIndexResult(t, indexes, expIndex)
            // load collection
            errLoad := mc.LoadCollection(ctx, collName, false)
            common.CheckErr(t, errLoad, true)

            // range search filter distance and output all fields
            queryVec := common.GenSearchVectors(1, common.DefaultDim, entity.FieldTypeFloatVector)
            sp, _ := entity.NewIndexSCANNSearchParam(8, 20)

            // search without distance range
            resSearch, errSearch := mc.Search(ctx, collName, []string{}, "", []string{"*"}, queryVec, common.DefaultFloatVecFieldName,
                metricType, common.DefaultTopK, sp)
            common.CheckErr(t, errSearch, true)
            for _, s := range resSearch[0].Scores {
                log.Println(s)
            }

            // verify error nil, output all fields, range score
            sp.AddRadius(0)
            sp.AddRangeFilter(50)
            resRange, errRange := mc.Search(ctx, collName, []string{}, "", []string{"*"}, queryVec, common.DefaultFloatVecFieldName,
                metricType, common.DefaultTopK, sp)
            common.CheckErr(t, errRange, true)
            log.Println(resRange[0].ResultCount)
            for _, s := range resRange[0].Scores {
                log.Println(s)
            }
        }
    }
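
For reference, the documented range-search rule for the IP metric keeps a hit when radius < distance <= range_filter. The standalone sketch below applies that rule to the ten distances logged in step 2. It does not use a Milvus client, and `inRangeIP` and `countInRange` are hypothetical helpers, not SDK functions. Under that rule, all ten hits would be expected to survive a range search with radius 0 and range_filter 50.

```go
package main

import "fmt"

// inRangeIP reports whether an inner-product score lies in the half-open
// interval (radius, rangeFilter]. Hypothetical helper, not part of the SDK.
func inRangeIP(score, radius, rangeFilter float64) bool {
	return score > radius && score <= rangeFilter
}

// countInRange counts the scores that satisfy inRangeIP.
func countInRange(scores []float64, radius, rangeFilter float64) int {
	n := 0
	for _, s := range scores {
		if inRangeIP(s, radius, rangeFilter) {
			n++
		}
	}
	return n
}

func main() {
	// Distances logged by the plain top-10 search in step 2.
	scores := []float64{
		38.364773, 38.210598, 38.128235, 38.010742, 37.96506,
		37.63387, 37.603, 37.5939, 37.502533, 37.456852,
	}
	fmt.Println(countInRange(scores, 0, 50)) // prints 10
}
```

Every logged distance falls inside (0, 50], so the empty range-search result does not appear to come from the chosen bounds themselves.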

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

collection: aPQK

pods:

zong-go-master-clu-milvus-datanode-86c77cd544-8wqzm               1/1     Running                           0                 21h     10.104.5.220    4am-node12   <none>           <none>
zong-go-master-clu-milvus-datanode-86c77cd544-d79rm               1/1     Running                           0                 21h     10.104.15.207   4am-node20   <none>           <none>
zong-go-master-clu-milvus-indexnode-7bf946694c-gq78h              1/1     Running                           0                 22h     10.104.6.91     4am-node13   <none>           <none>
zong-go-master-clu-milvus-indexnode-7bf946694c-mbs2m              1/1     Running                           0                 22h     10.104.18.84    4am-node25   <none>           <none>
zong-go-master-clu-milvus-mixcoord-85b96949b9-7nrlm               1/1     Running                           0                 22h     10.104.15.190   4am-node20   <none>           <none>
zong-go-master-clu-milvus-proxy-574d95cdf7-lm2kh                  1/1     Running                           0                 21h     10.104.33.136   4am-node36   <none>           <none>
zong-go-master-clu-milvus-querynode-0-6cb7f8d996-dcwwd            1/1     Running                           0                 21h     10.104.4.185    4am-node11   <none>           <none>
zong-go-master-clu-milvus-querynode-0-6cb7f8d996-kfjp9            1/1     Running                           0                 21h     10.104.26.194   4am-node32   <none>           <none>
zong-go-master-clu-milvus-querynode-0-6cb7f8d996-s9zjw            1/1     Running                           0                 21h     10.104.18.91    4am-node25   <none>           <none>
zong-go-master-clu-milvus-querynode-0-6cb7f8d996-w8vwg            1/1     Running                           0                 22h     10.104.1.131    4am-node10   <none>           <none>

Anything else?

No response

yanliang567 commented 5 months ago

/assign @liliu-z /unassign

liliu-z commented 4 months ago

Try enlarging max_empty_result_buckets, as described in https://milvus.io/docs/index.md#SCANN
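
As a rough illustration of that suggestion, the sketch below collects the search parameters from the test case together with `max_empty_result_buckets` in a plain map and serializes it to JSON. It is only a sketch under stated assumptions: `buildRangeSearchParams` is a hypothetical helper, the value 16 is an arbitrary example rather than a recommended setting, and whether the Go SDK build used in this issue exposes a setter for this key should be checked against the SDK itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildRangeSearchParams gathers SCANN range-search parameters in a map.
// Hypothetical helper for illustration; it does not call the Milvus SDK.
func buildRangeSearchParams(nprobe, reorderK, maxEmptyBuckets int, radius, rangeFilter float64) map[string]interface{} {
	return map[string]interface{}{
		"nprobe":                   nprobe,
		"reorder_k":                reorderK,
		"radius":                   radius,
		"range_filter":             rangeFilter,
		"max_empty_result_buckets": maxEmptyBuckets, // 16 below is an arbitrary example value
	}
}

func main() {
	// nprobe=8, reorder_k=20, radius=0, range_filter=50 mirror the test case.
	params := buildRangeSearchParams(8, 20, 16, 0, 50)
	encoded, err := json.Marshal(params)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(encoded))
}
```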

liliu-z commented 4 months ago

/assign @ThreadDao