milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][standalone] Milvus search without small tables, search slows down #20879

Closed. elstic closed this issue 1 year ago.

elstic commented 1 year ago

Environment

- Milvus version: 2.2.0-20221124-d2d72c16
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): 2.2.0.dev72
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

server:

fouram-tag-no-clean-m9sd9-1-etcd-0                                1/1     Running     0             88m     10.104.6.45    4am-node13   <none>           <none>
fouram-tag-no-clean-m9sd9-1-milvus-standalone-5cc5dc9b84-fzq8h    1/1     Running     0             88m     10.104.9.141   4am-node14   <none>           <none>
fouram-tag-no-clean-m9sd9-1-minio-744d4c7b47-rdwz5                1/1     Running     0             88m     10.104.4.17    4am-node11   <none>           <none>

server-instance: fouram-tag-no-clean-m9sd9-1
server-configmap: server-single-32c128m
client-configmap: client-random-locust-100m-ddl-r8-w2-12h

grafana: (2 images, not included)

client log:

[2022-11-28 16:22:26,709] [   DEBUG] - Milvus query run in 0.3231s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,709] [   DEBUG] - Milvus get run in 0.2024s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,709] [   DEBUG] - Milvus get run in 0.3235s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,710] [   DEBUG] - Milvus query run in 0.3236s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,710] [   DEBUG] - Milvus get run in 0.3236s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,789] [   DEBUG] - Milvus load_collection run in 0.2815s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,789] [   DEBUG] - Milvus query run in 0.232s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,870] [   DEBUG] - Milvus query run in 0.1597s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,978] [   DEBUG] - Milvus get run in 0.1873s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,978] [   DEBUG] - Milvus get run in 0.2677s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,978] [   DEBUG] - Milvus query run in 0.2678s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,979] [   DEBUG] - Milvus get run in 0.2681s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,979] [   DEBUG] - Milvus get run in 0.2685s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,979] [   DEBUG] - Milvus get run in 0.2688s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,001] [   DEBUG] - Milvus get run in 0.1303s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,001] [   DEBUG] - Milvus get run in 0.2109s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,021] [   DEBUG] - Milvus get run in 0.0418s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,207] [   DEBUG] - Milvus get run in 0.2056s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,208] [   DEBUG] - Milvus query run in 0.228s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,208] [   DEBUG] - Milvus query run in 0.2282s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,209] [   DEBUG] - Milvus query run in 0.2283s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,209] [   DEBUG] - Milvus get run in 0.2287s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,209] [   DEBUG] - Milvus get run in 0.2287s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,329] [   DEBUG] - Milvus query run in 0.3272s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,330] [   DEBUG] - Milvus get run in 0.3076s (milvus_benchmark.client:57)
[2022-11-28 16:22:27,367] [   DEBUG] - Milvus get run in 0.1578s (milvus_benchmark.client:57)

normal situation:

[2022-11-25 03:51:33,247] [   DEBUG] - Milvus query run in 0.0545s (milvus_benchmark.client:57)
[2022-11-25 03:51:33,296] [   DEBUG] - Milvus query run in 0.048s (milvus_benchmark.client:57)
[2022-11-25 03:51:33,361] [   DEBUG] - Milvus query run in 0.0649s (milvus_benchmark.client:57)
[2022-11-25 03:51:33,412] [   DEBUG] - Milvus query run in 0.051s (milvus_benchmark.client:57)
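
To compare the two runs with numbers rather than by eye, the client log lines above can be grouped by operation and averaged. The following is a minimal sketch, not part of the benchmark tooling: the function names `latencies_by_op` and `mean_latency` are hypothetical, and the regular expression assumes every timing line follows the `Milvus <op> run in <seconds>s` pattern seen in the excerpts. The sample strings reuse a few lines copied from the logs above.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the benchmark client's timing lines, e.g.
# "... - Milvus query run in 0.3231s (milvus_benchmark.client:57)"
LINE_RE = re.compile(r"Milvus (\w+) run in ([0-9.]+)s")


def latencies_by_op(log_text):
    """Group the reported durations (in seconds) by operation name."""
    grouped = defaultdict(list)
    for match in LINE_RE.finditer(log_text):
        grouped[match.group(1)].append(float(match.group(2)))
    return dict(grouped)


def mean_latency(log_text):
    """Return the mean duration per operation."""
    return {op: mean(vals) for op, vals in latencies_by_op(log_text).items()}


# A few lines copied from the slow run and from the normal run.
slow_log = """\
[2022-11-28 16:22:26,709] [   DEBUG] - Milvus query run in 0.3231s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,709] [   DEBUG] - Milvus get run in 0.2024s (milvus_benchmark.client:57)
[2022-11-28 16:22:26,789] [   DEBUG] - Milvus load_collection run in 0.2815s (milvus_benchmark.client:57)
"""
normal_log = """\
[2022-11-25 03:51:33,247] [   DEBUG] - Milvus query run in 0.0545s (milvus_benchmark.client:57)
[2022-11-25 03:51:33,296] [   DEBUG] - Milvus query run in 0.048s (milvus_benchmark.client:57)
"""

slow = mean_latency(slow_log)
normal = mean_latency(normal_log)
print(f"query slowdown: {slow['query'] / normal['query']:.1f}x")
```

Feeding the full log files through `mean_latency` instead of these short samples would give a more representative ratio; a median or percentile may be a better summary if the latencies are skewed.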

Expected Behavior

No response

Steps To Reproduce

1. Create a collection
2. Build an ivf_sq8 index
3. Insert 100M vectors
4. Build the index again
5. Run search, query, and load     ===>   increased latency, slower queries

Milvus Log

No response

Anything else?

client-random-locust-100m-ddl-r8-w2-12h

    locust_random_performance:
      collections:
        -
          collection_name: sift_100m_128_l2
          ni_per: 50000
          build_index: true
          index_type: ivf_sq8
          index_param:
            nlist: 2048
          task:
            types:
              -
                type: query
                weight: 8
                params:
                  top_k: 10
                  nq: 10
                  search_param:
                    nprobe: 16
              -
                type: load
                weight: 1
              -
                type: get
                weight: 8
                params:
                  ids_length: 10
              # -
                # type: scene_test
                # weight: 2
            connection_num: 1
            clients_num: 20
            spawn_rate: 2
            # during_time: 1h
            during_time: 12h
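
The task weights in the config above determine the request mix. The following is a rough sketch of that arithmetic, assuming the benchmark's `weight` values behave like Locust task weights (each task is picked at random in proportion to its weight) and that `spawn_rate` means clients started per second; neither assumption is confirmed in this thread. The helper `task_shares` is a hypothetical name.

```python
from fractions import Fraction

# Weights copied from the active (uncommented) tasks in the config above.
weights = {"query": 8, "load": 1, "get": 8}


def task_shares(weights):
    """Return each task's expected share of requests, given relative weights."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


shares = task_shares(weights)
for name, share in shares.items():
    print(f"{name}: {share} (~{float(share):.1%})")

# Assumption: spawn_rate is clients started per second, as in Locust.
clients_num, spawn_rate = 20, 2
print(f"ramp-up: about {clients_num / spawn_rate:.0f} s to reach {clients_num} clients")
```

Under these assumptions, query and get each account for 8/17 of the requests and load for 1/17, so a load call runs only occasionally alongside the queries.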

yanliang567 commented 1 year ago

/assign @jiaoew1991 /unassign

jiaoew1991 commented 1 year ago

/assign @XuanYang-cn /unassign

xiaofan-luan commented 1 year ago

We might want to start with the metrics. I wonder whether this is a knowhere issue.

XuanYang-cn commented 1 year ago

/assign @elstic Please provide latency results from a normal run for comparison.

elstic commented 1 year ago

@XuanYang-cn This is the normal latency for the same scenario in 2.1.0:

- query: image
- load_collection: image
- search: image

elstic commented 1 year ago

@XuanYang-cn A few days later I reran the same test task, and the latency did not improve.

Environment

- Milvus version: 2.2.0-20221208-368686e1
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): 2.2.1.dev4
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

server:

fouram-tag-no-clean-m9sd9-1-etcd-0                                1/1     Running     0             112s    10.104.6.154   4am-node13   <none>           <none>
fouram-tag-no-clean-m9sd9-1-milvus-standalone-6df89f485f-cgvs4    1/1     Running     0             112s    10.104.1.126   4am-node10   <none>           <none>
fouram-tag-no-clean-m9sd9-1-minio-744d4c7b47-vvk55                1/1     Running     0             112s    10.104.1.127   4am-node10   <none>           <none>

server-instance: fouram-tag-no-clean-m9sd9-1
server-configmap: server-single-32c128m
client-configmap: client-random-locust-100m-ddl-r8-w2-12h

grafana: (3 images, not included)

client log:

[2022-12-08 10:07:19,296] [   DEBUG] - Milvus get run in 1.3134s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,296] [   DEBUG] - Milvus get run in 1.3134s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,296] [   DEBUG] - Milvus get run in 1.3132s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,355] [   DEBUG] - Milvus load_collection run in 0.3625s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,440] [   DEBUG] - Milvus query run in 0.4472s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,573] [   DEBUG] - Milvus query run in 0.5798s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,649] [   DEBUG] - Milvus query run in 0.6559s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,740] [   DEBUG] - Milvus query run in 0.7471s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,741] [   DEBUG] - Milvus get run in 1.1754s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,741] [   DEBUG] - Milvus get run in 1.1756s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,741] [   DEBUG] - Milvus get run in 1.1761s (milvus_benchmark.client:57)
[2022-12-08 10:07:19,741] [   DEBUG] - Milvus get run in 1.176s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,241] [   DEBUG] - Milvus query run in 0.9442s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,242] [   DEBUG] - Milvus get run in 1.2486s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,242] [   DEBUG] - Milvus get run in 0.9454s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,242] [   DEBUG] - Milvus get run in 1.2489s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,242] [   DEBUG] - Milvus get run in 1.2489s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,362] [   DEBUG] - Milvus load_collection run in 0.6206s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,665] [   DEBUG] - Milvus query run in 0.9228s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,758] [   DEBUG] - Milvus query run in 1.016s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,851] [   DEBUG] - Milvus query run in 1.1087s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,893] [   DEBUG] - Milvus query run in 1.1505s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,952] [   DEBUG] - Milvus query run in 1.21s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,953] [   DEBUG] - Milvus get run in 1.6561s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,953] [   DEBUG] - Milvus get run in 1.6561s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,953] [   DEBUG] - Milvus get run in 1.6562s (milvus_benchmark.client:57)
[2022-12-08 10:07:20,953] [   DEBUG] - Milvus get run in 1.6565s (milvus_benchmark.client:57)

Expected Behavior

No response

Steps To Reproduce

1. Create a collection
2. Build an ivf_sq8 index
3. Insert 100M vectors
4. Build the index again
5. Run search, query, and load     ===>   increased latency, slower queries

Milvus Log

No response

Anything else?

client-random-locust-100m-ddl-r8-w2-12h

    locust_random_performance:
      collections:
        -
          collection_name: sift_100m_128_l2
          ni_per: 50000
          build_index: true
          index_type: ivf_sq8
          index_param:
            nlist: 2048
          task:
            types:
              -
                type: query
                weight: 8
                params:
                  top_k: 10
                  nq: 10
                  search_param:
                    nprobe: 16
              -
                type: load
                weight: 1
              -
                type: get
                weight: 8
                params:
                  ids_length: 10
              # -
                # type: scene_test
                # weight: 2
            connection_num: 1
            clients_num: 20
            spawn_rate: 2
            # during_time: 1h
            during_time: 12h

elstic commented 1 year ago

The issue does not occur in 2.2.1.

milvus: v2.2.1, pymilvus: 2.2.1dev4, argo task: fouram-upgrade-server-c6lpk

Latency:

- query: image
- search: image

scene test: image