milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
30.32k stars 2.91k forks source link

[Bug]: [benchmark][cluster] Write options raise error `rpc deadline exceeded: Retry timeout: 30s` under dql and dml scene #26949

Closed wangting0128 closed 10 months ago

wangting0128 commented 1 year ago

Is there an existing issue for this?

Environment

- Milvus version:2.3.0-20230907-264c542b
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):pulsar   
- SDK version(e.g. pymilvus v2.0.0rc2):2.4.0.dev81
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-stable-test-wt-lgfrx client pod: fouramf-stable-test-wt-lgfrx-528320546

server:

[2023-09-08 09:14:23,571 -  INFO - fouram]: [Base] Deploy initial state: 
I0908 04:13:03.572885     385 request.go:665] Waited for 1.167736601s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/actions.summerwind.dev/v1alpha1?timeout=32s
NAME                                                              READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-stable-lgfrx-4-62-6377-etcd-0                             1/1     Running     0               8m15s   10.104.9.46     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-etcd-1                             1/1     Running     0               8m15s   10.104.1.188    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-etcd-2                             1/1     Running     0               8m15s   10.104.14.223   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-datacoord-5c56d97975fppmn   1/1     Running     0               8m16s   10.104.1.172    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-datanode-6f89986c48-zr8vq   1/1     Running     1 (3m45s ago)   8m15s   10.104.6.43     4am-node13   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-indexcoord-75b678957ftlhn   1/1     Running     0               8m16s   10.104.1.175    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-indexnode-5d5f799fc8gzq66   1/1     Running     0               8m16s   10.104.9.38     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-proxy-7f99f49c58-dkbsz      1/1     Running     1 (3m45s ago)   8m16s   10.104.4.190    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-querycoord-57bcb8d94gxps6   1/1     Running     1 (3m45s ago)   8m15s   10.104.4.191    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-querynode-666844c54dpwc9w   1/1     Running     0               8m16s   10.104.18.219   4am-node25   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-rootcoord-77c44877bc4qrlz   1/1     Running     1 (3m44s ago)   8m15s   10.104.4.192    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-0                            1/1     Running     0               8m15s   10.104.24.6     4am-node29   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-1                            1/1     Running     0               8m15s   10.104.4.208    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-2                            1/1     Running     0               8m15s   10.104.9.47     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-3                            1/1     Running     0               8m15s   10.104.1.189    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-0                    1/1     Running     0               8m15s   10.104.24.5     4am-node29   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-1                    1/1     Running     0               8m15s   10.104.4.209    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-2                    1/1     Running     0               8m15s   10.104.14.224   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-init-j4b89           0/1     Completed   0               8m16s   10.104.1.174    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-broker-0                    1/1     Running     0               8m15s   10.104.1.177    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-proxy-0                     1/1     Running     0               8m15s   10.104.14.209   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-pulsar-init-vxj97           0/1     Completed   0               8m16s   10.104.4.189    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-recovery-0                  1/1     Running     0               8m16s   10.104.4.188    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-0                 1/1     Running     0               8m15s   10.104.14.219   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-1                 1/1     Running     0               6m43s   10.104.6.61     4am-node13   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-2                 1/1     Running     0               5m17s   10.104.17.213   4am-node23   <none>           <none> (base.py:221)
[2023-09-08 09:14:23,572 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'STATUS|fouramf-stable-lgfrx-4-62-6377-milvus|fouramf-stable-lgfrx-4-62-6377-minio|fouramf-stable-lgfrx-4-62-6377-etcd|fouramf-stable-lgfrx-4-62-6377-pulsar|fouramf-stable-lgfrx-4-62-6377-kafka'  (util_cmd.py:14)
[2023-09-08 09:14:32,601 -  INFO - fouram]: [CliClient] pod details of release(fouramf-stable-lgfrx-4-62-6377): 
 I0908 09:14:24.845441     499 request.go:665] Waited for 1.166172517s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/autoscaling/v2beta1?timeout=32s
NAME                                                              READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-stable-lgfrx-4-62-6377-etcd-0                             1/1     Running     0               5h9m    10.104.9.46     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-etcd-1                             1/1     Running     0               5h9m    10.104.1.188    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-etcd-2                             1/1     Running     0               5h9m    10.104.14.223   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-datacoord-5c56d97975fppmn   1/1     Running     0               5h9m    10.104.1.172    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-datanode-6f89986c48-zr8vq   1/1     Running     1 (5h5m ago)    5h9m    10.104.6.43     4am-node13   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-indexcoord-75b678957ftlhn   1/1     Running     0               5h9m    10.104.1.175    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-indexnode-5d5f799fc8gzq66   1/1     Running     0               5h9m    10.104.9.38     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-proxy-7f99f49c58-dkbsz      1/1     Running     1 (5h5m ago)    5h9m    10.104.4.190    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-querycoord-57bcb8d94gxps6   1/1     Running     1 (5h5m ago)    5h9m    10.104.4.191    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-querynode-666844c54dpwc9w   1/1     Running     0               5h9m    10.104.18.219   4am-node25   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-milvus-rootcoord-77c44877bc4qrlz   1/1     Running     1 (5h5m ago)    5h9m    10.104.4.192    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-0                            1/1     Running     0               5h9m    10.104.24.6     4am-node29   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-1                            1/1     Running     0               5h9m    10.104.4.208    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-2                            1/1     Running     0               5h9m    10.104.9.47     4am-node14   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-minio-3                            1/1     Running     0               5h9m    10.104.1.189    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-0                    1/1     Running     0               5h9m    10.104.24.5     4am-node29   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-1                    1/1     Running     0               5h9m    10.104.4.209    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-2                    1/1     Running     0               5h9m    10.104.14.224   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-bookie-init-j4b89           0/1     Completed   0               5h9m    10.104.1.174    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-broker-0                    1/1     Running     0               5h9m    10.104.1.177    4am-node10   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-proxy-0                     1/1     Running     0               5h9m    10.104.14.209   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-pulsar-init-vxj97           0/1     Completed   0               5h9m    10.104.4.189    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-recovery-0                  1/1     Running     0               5h9m    10.104.4.188    4am-node11   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-0                 1/1     Running     0               5h9m    10.104.14.219   4am-node18   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-1                 1/1     Running     0               5h8m    10.104.6.61     4am-node13   <none>           <none>
fouramf-stable-lgfrx-4-62-6377-pulsar-zookeeper-2                 1/1     Running     0               5h6m    10.104.17.213   4am-node23   <none>           <none>

client error:

[2023-09-08 08:56:40,783 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: grpc     delete                                                                        297810     0(0.00%) |    248       3   29861     87 |    0.10        0.00 (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: grpc     insert                                                                        297223     0(0.00%) |    258      11   29877     95 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: grpc     load                                                                          597119     0(0.00%) |    105       2    4488    110 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: grpc     query                                                                        1489207     0(0.00%) |     55       1   19828     53 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:56:40,784 -  INFO - fouram]: grpc     search                                                                       2980673     0(0.00%) |     11       3   23422      7 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:56:40,785 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:56:40,785 -  INFO - fouram]:          Aggregated                                                                   5662032     0(0.00%) |     58       1   29877     17 |    0.10        0.00 (stats.py:789)
[2023-09-08 08:56:40,785 -  INFO - fouram]:  (stats.py:790)
[2023-09-08 08:56:40,787 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: grpc     delete                                                                                 87    110    130    170    400   1100   2300   3200   8200  16000  30000 297810 (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: grpc     insert                                                                                 95    110    140    180    410   1100   2300   3300   8400  16000  30000 297223 (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: grpc     load                                                                                  110    120    130    140    160    180    200    230    420    780   4500 597119 (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: grpc     query                                                                                  53     65     71     76     87     96    110    120    280   6100  20000 1489207 (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: grpc     search                                                                                  7      9     11     12     18     26     34     41    180   5500  23000 2980673 (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:56:40,787 -  INFO - fouram]:          Aggregated                                                                             17     49     67     78    110    140    210    470   3500   9600  30000 5662032 (stats.py:819)
[2023-09-08 08:56:40,788 -  INFO - fouram]:  (stats.py:820)
[2023-09-08 08:56:43,906 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:13.903269', 'RPC error': '2023-09-08 08:56:43.906293'}> (decorators.py:108)
[2023-09-08 08:56:43,909 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:43,909 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:43,978 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:13.976813', 'RPC error': '2023-09-08 08:56:43.978821'}> (decorators.py:108)
[2023-09-08 08:56:43,980 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:43,980 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,021 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.020642', 'RPC error': '2023-09-08 08:56:44.021551'}> (decorators.py:108)
[2023-09-08 08:56:44,021 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,022 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,146 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.144666', 'RPC error': '2023-09-08 08:56:44.146573'}> (decorators.py:108)
[2023-09-08 08:56:44,147 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,147 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,391 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.390089', 'RPC error': '2023-09-08 08:56:44.391427'}> (decorators.py:108)
[2023-09-08 08:56:44,391 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,391 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,491 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.490551', 'RPC error': '2023-09-08 08:56:44.491280'}> (decorators.py:108)
[2023-09-08 08:56:44,491 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,491 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,671 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.669791', 'RPC error': '2023-09-08 08:56:44.671405'}> (decorators.py:108)
[2023-09-08 08:56:44,671 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,672 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,809 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.807978', 'RPC error': '2023-09-08 08:56:44.809348'}> (decorators.py:108)
[2023-09-08 08:56:44,810 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,810 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:44,889 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:14.888052', 'RPC error': '2023-09-08 08:56:44.889358'}> (decorators.py:108)
[2023-09-08 08:56:44,889 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:44,890 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:46,346 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.345663', 'RPC error': '2023-09-08 08:56:46.346709'}> (decorators.py:108)
[2023-09-08 08:56:46,347 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:46,347 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:56:46,356 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.355238', 'RPC error': '2023-09-08 08:56:46.356213'}> (decorators.py:108)
[2023-09-08 08:56:46,356 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:56:46,356 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:00,789 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2023-09-08 08:57:00,790 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:57:00,790 -  INFO - fouram]: grpc     delete                                                                        297815     5(0.00%) |    248       3   30004     87 |    0.10        0.10 (stats.py:789)
[2023-09-08 08:57:00,790 -  INFO - fouram]: grpc     insert                                                                        297229     6(0.00%) |    258      11   30007     95 |    0.10        0.10 (stats.py:789)
[2023-09-08 08:57:00,790 -  INFO - fouram]: grpc     load                                                                          597119     0(0.00%) |    105       2    4488    110 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:57:00,791 -  INFO - fouram]: grpc     query                                                                        1489207     0(0.00%) |     55       1   19828     53 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:57:00,791 -  INFO - fouram]: grpc     search                                                                       2980673     0(0.00%) |     11       3   23422      7 |    0.00        0.00 (stats.py:789)
[2023-09-08 08:57:00,791 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:57:00,791 -  INFO - fouram]:          Aggregated                                                                   5662043    11(0.00%) |     58       1   30007     17 |    0.20        0.20 (stats.py:789)
[2023-09-08 08:57:00,791 -  INFO - fouram]:  (stats.py:790)
[2023-09-08 08:57:00,793 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: grpc     delete                                                                                 87    110    130    170    400   1100   2300   3200   8200  17000  30000 297815 (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: grpc     insert                                                                                 95    110    140    180    410   1100   2300   3300   8400  18000  30000 297229 (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: grpc     load                                                                                  110    120    130    140    160    180    200    230    420    780   4500 597119 (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: grpc     query                                                                                  53     65     71     76     87     96    110    120    280   6100  20000 1489207 (stats.py:819)
[2023-09-08 08:57:00,793 -  INFO - fouram]: grpc     search                                                                                  7      9     11     12     18     26     34     41    180   5500  23000 2980673 (stats.py:819)
[2023-09-08 08:57:00,794 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:57:00,794 -  INFO - fouram]:          Aggregated                                                                             17     49     67     78    110    140    210    470   3600   9600  30000 5662043 (stats.py:819)
[2023-09-08 08:57:00,794 -  INFO - fouram]:  (stats.py:820)
[2023-09-08 08:57:05,181 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.356384', 'RPC error': '2023-09-08 08:57:05.181359'}> (decorators.py:108)
[2023-09-08 08:57:05,183 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,184 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:05,202 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.368613', 'RPC error': '2023-09-08 08:57:05.202878'}> (decorators.py:108)
[2023-09-08 08:57:05,203 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,203 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:05,205 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.380100', 'RPC error': '2023-09-08 08:57:05.205836'}> (decorators.py:108)
[2023-09-08 08:57:05,207 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,207 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:05,214 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.423775', 'RPC error': '2023-09-08 08:57:05.214002'}> (decorators.py:108)
[2023-09-08 08:57:05,214 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,214 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:05,219 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:16.833804', 'RPC error': '2023-09-08 08:57:05.219312'}> (decorators.py:108)
[2023-09-08 08:57:05,219 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,219 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:05,221 - ERROR - fouram]: RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2023-09-08 08:56:17.880823', 'RPC error': '2023-09-08 08:57:05.221441'}> (decorators.py:108)
[2023-09-08 08:57:05,222 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (api_request.py:53)
[2023-09-08 08:57:05,222 - ERROR - fouram]: [CheckFunc] delete request check failed, response:<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)> (func_check.py:46)
[2023-09-08 08:57:20,797 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: grpc     delete                                                                        298217     8(0.00%) |    249       3   48827     87 |   26.20        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: grpc     insert                                                                        297608     9(0.00%) |    259      11   48839     94 |   25.30        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: grpc     load                                                                          597874     0(0.00%) |    105       2    4488    110 |   46.20        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: grpc     query                                                                        1491039     0(0.00%) |     55       1   21335     53 |  111.00        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: grpc     search                                                                       2984573     0(0.00%) |     11       3   34595      7 |  240.40        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]:          Aggregated                                                                   5669311    17(0.00%) |     58       1   48839     17 |  449.10        0.00 (stats.py:789)
[2023-09-08 08:57:20,798 -  INFO - fouram]:  (stats.py:790)
[2023-09-08 08:57:20,800 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2023-09-08 08:57:20,800 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: grpc     delete                                                                                 87    110    130    170    400   1100   2300   3200   8300  18000  49000 298217 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: grpc     insert                                                                                 94    110    140    180    410   1100   2300   3300   8500  18000  49000 297608 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: grpc     load                                                                                  110    120    130    140    160    180    200    230    420    780   4500 597874 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: grpc     query                                                                                  53     65     71     76     87     96    110    120    280   6200  21000 1491039 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: grpc     search                                                                                  7      9     11     12     18     26     34     41    180   5500  35000 2984573 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]:          Aggregated                                                                             17     49     67     78    110    140    210    470   3600   9700  49000 5669311 (stats.py:819)
[2023-09-08 08:57:20,801 -  INFO - fouram]:  (stats.py:820)

test result:

{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_2c2m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': '8.0',
                                                              'memory': '4Gi'},
                                                   'requests': {'cpu': '5.0',
                                                                'memory': '3Gi'}},
                                     'replicas': 1},
                       'indexNode': {'resources': {'limits': {'cpu': '2.0',
                                                              'memory': '4Gi'},
                                                   'requests': {'cpu': '2.0',
                                                                'memory': '3Gi'}},
                                     'replicas': 1},
                       'dataNode': {'resources': {'limits': {'cpu': '2.0',
                                                             'memory': '2Gi'},
                                                  'requests': {'cpu': '2.0',
                                                               'memory': '2Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsar': {'bookkeeper': {'volumes': {'journal': {'storageClassName': 'local-path'},
                                                             'ledgers': {'storageClassName': 'local-path'}}},
                                  'zookeeper': {'volumes': {'data': {'storageClassName': 'local-path'}}}},
                       'kafka': {'persistence': {'storageClass': 'local-path'}},
                       'minio': {'persistence': {'storageClass': 'local-path'},
                                 'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'global': {'storageClass': 'local-path'},
                                'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/dockerhub/milvusdb/milvus',
                                         'tag': '2.3.0-20230907-264c542b'}}},
            'host': 'fouramf-stable-lgfrx-4-62-6377-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_hnsw_dml_dql_filter_cluster',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'dataset_name': 'sift',
                                                    'dataset_size': 100000,
                                                    'ni_per': 50000},
                                 'collection_params': {'other_fields': ['float_1'],
                                                       'shards_num': 2},
                                 'load_params': {},
                                 'query_params': {},
                                 'search_params': {},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'HNSW',
                                                  'index_param': {'M': 8,
                                                                  'efConstruction': 200}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '5h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 10,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'ef': 16},
                                                                  'expr': {'float_1': {'GT': -1.0,
                                                                                       'LT': 50000.0}},
                                                                  'guarantee_timestamp': None,
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'timeout': 60,
                                                                  'random_data': True}},
                                                      {'type': 'query',
                                                       'weight': 5,
                                                       'params': {'ids': [0,
                                                                          1,
                                                                          2,
                                                                          3,
                                                                          4,
                                                                          5,
                                                                          6,
                                                                          7,
                                                                          8,
                                                                          9],
                                                                  'expr': None,
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'timeout': 60}},
                                                      {'type': 'load',
                                                       'weight': 2,
                                                       'params': {'replica_number': 1,
                                                                  'timeout': 30}},
                                                      {'type': 'delete',
                                                       'weight': 1,
                                                       'params': {'delete_length': 1,
                                                                  'timeout': 30}},
                                                      {'type': 'insert',
                                                       'weight': 1,
                                                       'params': {'nb': 1,
                                                                  'timeout': 30,
                                                                  'random_id': True,
                                                                  'random_vector': True,
                                                                  'varchar_filled': False}}]},
            'run_id': 20230908240,
            'datetime': '2023-09-08 04:04:53.169478',
            'client_version': '2.3.0'},
 'result': {'test_result': {'index': {'RT': 27.0502},
                            'insert': {'total_time': 3.9818,
                                       'VPS': 25114.2699,
                                       'batch_time': 1.9909,
                                       'batch': 50000},
                            'flush': {'RT': 2.9804},
                            'load': {'RT': 4.5591},
                            'Locust': {'Aggregated': {'Requests': 6103241,
                                                      'Fails': 17,
                                                      'RPS': 339.07,
                                                      'fail_s': 0.0,
                                                      'RT_max': 48839.24,
                                                      'RT_avg': 57.65,
                                                      'TP50': 17,
                                                      'TP99': 440.0},
                                       'delete': {'Requests': 321091,
                                                  'Fails': 8,
                                                  'RPS': 17.84,
                                                  'fail_s': 0.0,
                                                  'RT_max': 48827.34,
                                                  'RT_avg': 239.15,
                                                  'TP50': 86,
                                                  'TP99': 3100.0},
                                       'insert': {'Requests': 320538,
                                                  'Fails': 9,
                                                  'RPS': 17.81,
                                                  'fail_s': 0.0,
                                                  'RT_max': 48839.24,
                                                  'RT_avg': 248.78,
                                                  'TP50': 93,
                                                  'TP99': 3200.0},
                                       'load': {'Requests': 643494,
                                                'Fails': 0,
                                                'RPS': 35.75,
                                                'fail_s': 0.0,
                                                'RT_max': 4488.13,
                                                'RT_avg': 106.51,
                                                'TP50': 110.0,
                                                'TP99': 230.0},
                                       'query': {'Requests': 1605725,
                                                 'Fails': 0,
                                                 'RPS': 89.21,
                                                 'fail_s': 0.0,
                                                 'RT_max': 21335.54,
                                                 'RT_avg': 55.88,
                                                 'TP50': 54,
                                                 'TP99': 120.0},
                                       'search': {'Requests': 3212393,
                                                  'Fails': 0,
                                                  'RPS': 178.47,
                                                  'fail_s': 0.0,
                                                  'RT_max': 34595.91,
                                                  'RT_avg': 11.53,
                                                  'TP50': 7,
                                                  'TP99': 40}}}}}

Expected Behavior

No response

Steps To Reproduce

1. deploy a Cluster Milvus
2. prepare 100m data
3. concurrent request: delete、insert、load、query, search 《- delete and insert options raise error

Milvus Log

Milvus log:

截屏2023-09-08 21 41 25

Anything else?

No response

xiaofan-luan commented 1 year ago

seems to be a pulsar timeout

yanliang567 commented 1 year ago

@elstic let's see if it reproduces stably. /assign @elstic /unassign

elstic commented 1 year ago

@elstic let's see if it reproduces stably. /assign @elstic /unassign

this issue has not occurred recently

stale[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

elstic commented 10 months ago

This issue has not arisen recently