milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][cluster]Milvus 2 replicas concurrently with 2 clients, search, query, load latency becomes higher #18401

Closed jingkl closed 2 years ago

jingkl commented 2 years ago

Is there an existing issue for this?

Environment

- Milvus version:2.1.0-20220723-1e038a75
- Deployment mode(standalone or cluster): cluster
- SDK version(e.g. pymilvus v2.0.0rc2):pymilvus 2.1.0dev103
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

- server-instance: fouram-tag-no-clean-xrq2m-1
- server-configmap: server-cluster-8c64m-querynode2
- client-configmap: client-random-locust-search-filter-100m-ddl-r8-w2-rep2-36h

fouram-tag-no-clean-xrq2m-1-etcd-0                               1/1     Running   0             72m     10.104.6.42    4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-etcd-1                               1/1     Running   0             73m     10.104.5.6     4am-node12   <none>           <none>
fouram-tag-no-clean-xrq2m-1-etcd-2                               1/1     Running   0             74m     10.104.4.17    4am-node11   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-datacoord-586955f8cd-9gvlc    1/1     Running   0             42h     10.104.6.112   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-datanode-796458bcf5-r9m2x     1/1     Running   0             42h     10.104.6.111   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-indexcoord-5cf55b44c9-whgp2   1/1     Running   0             42h     10.104.1.91    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-indexnode-9d6cfc9c8-h47x6     1/1     Running   0             42h     10.104.4.47    4am-node11   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-proxy-dc78686c-rjknq          1/1     Running   0             42h     10.104.1.96    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-querycoord-865599ff4d-nlxl8   1/1     Running   0             42h     10.104.6.114   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-querynode-668fc4776-bh5gl     1/1     Running   0             42h     10.104.1.94    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-querynode-668fc4776-llbxp     1/1     Running   0             42h     10.104.9.225   4am-node14   <none>           <none>
fouram-tag-no-clean-xrq2m-1-milvus-rootcoord-5f8cfc9dbb-l7rcd    1/1     Running   0             42h     10.104.1.90    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-minio-0                              1/1     Running   0             42h     10.104.6.118   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-minio-1                              1/1     Running   0             42h     10.104.5.200   4am-node12   <none>           <none>
fouram-tag-no-clean-xrq2m-1-minio-2                              1/1     Running   0             42h     10.104.4.52    4am-node11   <none>           <none>
fouram-tag-no-clean-xrq2m-1-minio-3                              1/1     Running   0             42h     10.104.9.227   4am-node14   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-bookie-0                      1/1     Running   0             42h     10.104.6.120   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-bookie-1                      1/1     Running   0             42h     10.104.4.53    4am-node11   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-bookie-2                      1/1     Running   0             42h     10.104.5.205   4am-node12   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-broker-0                      1/1     Running   0             42h     10.104.4.48    4am-node11   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-proxy-0                       1/1     Running   0             42h     10.104.1.95    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-recovery-0                    1/1     Running   0             42h     10.104.6.113   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-zookeeper-0                   1/1     Running   0             42h     10.104.1.98    4am-node10   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-zookeeper-1                   1/1     Running   0             42h     10.104.6.123   4am-node13   <none>           <none>
fouram-tag-no-clean-xrq2m-1-pulsar-zookeeper-2                   1/1     Running   0             42h     10.104.4.58    4am-node11   <none>           <none>

search latency:

[Screenshot 2022-07-25 16 33 33]

query:

[Screenshot 2022-07-25 16 33 48]

load:

[Screenshot 2022-07-25 16 34 35]

- server-instance: fouram-tag-no-clean-xrq2m-1
- server-configmap: server-cluster-8c64m-querynode2
- client-configmap: client-random-locust-search-filter-100m-ddl-r8-w2-replica2-con (uses 2 clients)

search:

[Screenshot 2022-07-25 16 35 53]

query:

[Screenshot 2022-07-25 16 36 01]

load:

[Screenshot 2022-07-25 16 36 10]

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

congqixia commented 2 years ago

It looks like this deployment has only ONE proxy. All client requests are queued in the proxy before being dispatched to the internal components. Could you please scale up the number of proxies as well when the number of clients increases? @jingkl
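To illustrate the queueing argument, here is a toy model in Python. It is only a sketch under strong assumptions: every client submits one request at the same moment, each proxy handles one request at a time, and every request takes the same `service_time`. The function name `mean_latency` and all parameters are hypothetical, and this does not reflect how the Milvus proxy actually schedules work.

```python
import heapq


def mean_latency(num_clients: int, num_proxies: int, service_time: float = 1.0) -> float:
    """Mean completion time when all clients submit one request at t=0.

    Toy assumption: each proxy serves requests strictly one after another.
    """
    # Min-heap of the times at which each proxy becomes free.
    free_at = [0.0] * num_proxies
    heapq.heapify(free_at)

    total = 0.0
    for _ in range(num_clients):
        start = heapq.heappop(free_at)   # earliest available proxy
        finish = start + service_time
        total += finish                  # latency = finish time, since arrival is t=0
        heapq.heappush(free_at, finish)
    return total / num_clients


if __name__ == "__main__":
    for proxies in (1, 2):
        print(f"proxies={proxies}: mean latency = {mean_latency(20, proxies)}")
```

In this model, doubling the number of clients behind a single proxy raises the mean latency, while adding a second proxy brings it back down, which is the effect the suggestion above is aiming at.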

jingkl commented 2 years ago

client-random-locust-search-filter-100m-ddl-r8-w2-rep2-36h:

{
    "config.yaml": "locust_random_performance:
          collections:
            -
              collection_name: sift_100m_128_l2
              # collection_name: sift_10w_128_l2
              other_fields: float1
              ni_per: 50000
              build_index: true
              index_type: ivf_sq8
              index_param:
                nlist: 2048
              load_param:
                replica_number: 2
              task:
                types:
                  -
                    type: query
                    weight: 20
                    params:
                      top_k: 10
                      nq: 10
                      search_param:
                        nprobe: 16
                      filters:
                        -
                          range: \"{'range': {'float1': {'GT': -1.0, 'LT': collection_size * 0.5}}}\"
                  -
                    type: load
                    weight: 1
                    params:
                      replica_number: 2
                  -
                    type: get
                    weight: 10
                    params:
                      ids_length: 10
                  -
                    type: scene_test
                    weight: 2
                connection_num: 1
                clients_num: 20
                spawn_rate: 2
                # during_time: 100
                during_time: 36h
        "
}
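For readers unfamiliar with weighted task lists, the following Python sketch shows how the `weight` values above would translate into a request mix. It assumes the weights are relative selection frequencies, as task weights are in Locust; whether the benchmark framework used here interprets them exactly this way is an assumption. The helper `task_shares` is hypothetical.

```python
# Weights copied from the task list in the config above.
TASK_WEIGHTS = {"query": 20, "load": 1, "get": 10, "scene_test": 2}


def task_shares(weights: dict[str, int]) -> dict[str, float]:
    """Convert relative weights into the fraction of operations per task type."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    for name, share in task_shares(TASK_WEIGHTS).items():
        print(f"{name}: {share:.1%}")
```

Under that assumption, query makes up roughly 61% of operations and load roughly 3%.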
jingkl commented 2 years ago

Milvus 2.1.0-20220726-1b33c731, pymilvus 2.1.0dev103: the issue has now been fixed, so I am closing it. If the problem occurs again later, I will reopen the issue.