milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][cluster] deploy milvus panic, init Proxy server failed: context deadline exceeded #37372

Open wangting0128 opened 11 hours ago

wangting0128 commented 11 hours ago

Is there an existing issue for this?

Environment

- Milvus version: 2.5-20241101-47337ea8-amd64
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): pulsar
- SDK version (e.g. pymilvus v2.0.0rc2):
- OS (Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:

Current Behavior

argo task: fouramf-jwltv
test case name: test_concurrent_locust_diskann_dml_dql_filter_cluster

server:

NAME                                                              READY   STATUS             RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
streaming-node-corn-1730430000-1-etcd-0                           1/1     Running            0               30m     10.104.19.234   4am-node28   <none>           <none>
streaming-node-corn-1730430000-1-etcd-1                           1/1     Running            0               30m     10.104.16.138   4am-node21   <none>           <none>
streaming-node-corn-1730430000-1-etcd-2                           1/1     Running            0               30m     10.104.26.29    4am-node32   <none>           <none>
streaming-node-corn-1730430000-1-milvus-datanode-8b98f4c897hbgb   0/1     CrashLoopBackOff   10 (2m1s ago)   30m     10.104.26.27    4am-node32   <none>           <none>
streaming-node-corn-1730430000-1-milvus-indexnode-5b44fd8fk55tz   0/1     CrashLoopBackOff   10 (75s ago)    30m     10.104.9.115    4am-node14   <none>           <none>
streaming-node-corn-1730430000-1-milvus-mixcoord-979c5cb5-f8psv   0/1     CrashLoopBackOff   10 (3m ago)     30m     10.104.15.121   4am-node20   <none>           <none>
streaming-node-corn-1730430000-1-milvus-proxy-68c8cbf5d5-bkk9l    0/1     CrashLoopBackOff   10 (2m3s ago)   30m     10.104.15.122   4am-node20   <none>           <none>
streaming-node-corn-1730430000-1-milvus-querynode-6696fd4bntqlc   0/1     CrashLoopBackOff   10 (99s ago)    30m     10.104.15.120   4am-node20   <none>           <none>
streaming-node-corn-1730430000-1-minio-0                          1/1     Running            0               30m     10.104.30.250   4am-node38   <none>           <none>
streaming-node-corn-1730430000-1-minio-1                          1/1     Running            0               30m     10.104.19.236   4am-node28   <none>           <none>
streaming-node-corn-1730430000-1-minio-2                          1/1     Running            0               30m     10.104.16.139   4am-node21   <none>           <none>
streaming-node-corn-1730430000-1-minio-3                          1/1     Running            0               30m     10.104.34.237   4am-node37   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-bookie-0                  1/1     Running            0               30m     10.104.15.125   4am-node20   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-bookie-1                  1/1     Running            0               30m     10.104.24.144   4am-node29   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-bookie-2                  1/1     Running            0               30m     10.104.16.144   4am-node21   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-bookie-init-m4zmk         0/1     Completed          0               30m     10.104.18.7     4am-node25   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-broker-0                  1/1     Running            0               30m     10.104.18.8     4am-node25   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-proxy-0                   1/1     Running            0               30m     10.104.30.247   4am-node38   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-pulsar-init-vr5nx         0/1     Completed          0               30m     10.104.18.9     4am-node25   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-recovery-0                1/1     Running            0               30m     10.104.30.248   4am-node38   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-zookeeper-0               1/1     Running            0               30m     10.104.19.237   4am-node28   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-zookeeper-1               1/1     Running            0               29m     10.104.15.143   4am-node20   <none>           <none>
streaming-node-corn-1730430000-1-pulsar-zookeeper-2               1/1     Running            0               28m     10.104.32.24    4am-node39   <none>           <none>
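
To pick the failing pods out of a listing like the one above, a minimal Python sketch could look as follows. It assumes the whitespace-separated column layout of `kubectl get pods -o wide` shown here; the function name `crashing_pods` and the shortened `sample` text are illustrative and are not part of the benchmark tooling.

```python
def crashing_pods(listing: str) -> list[tuple[str, int]]:
    """Return (pod name, restart count) for each pod in CrashLoopBackOff."""
    result = []
    for line in listing.splitlines():
        fields = line.split()
        # Need at least NAME, READY, STATUS, RESTARTS; skip the header row.
        if len(fields) < 4 or fields[0] == "NAME":
            continue
        name, _ready, status, restarts = fields[:4]
        if status == "CrashLoopBackOff":
            # RESTARTS may be followed by "(2m3s ago)"; only the count is parsed.
            result.append((name, int(restarts)))
    return result


# Shortened sample: two rows taken from the listing, trailing columns dropped.
sample = """\
NAME READY STATUS RESTARTS AGE
streaming-node-corn-1730430000-1-etcd-0 1/1 Running 0 30m
streaming-node-corn-1730430000-1-milvus-proxy-68c8cbf5d5-bkk9l 0/1 CrashLoopBackOff 10 (2m3s ago) 30m
"""

for pod, restarts in crashing_pods(sample):
    print(pod, restarts)
```

Applied to the full listing, this would report the five Milvus component pods (datanode, indexnode, mixcoord, proxy, querynode), each with 10 restarts, while the etcd, MinIO, and Pulsar pods are Running.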

kubectl describe pod streaming-node-corn-1730430000-1-milvus-proxy-68c8cbf5d5-bkk9l -n qa-milvus
[screenshot]

kubectl logs streaming-node-corn-1730430000-1-milvus-proxy-68c8cbf5d5-bkk9l -n qa-milvus
[screenshot]

[Screenshot 2024-11-01 20 13 58]

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

server config:

{
     "queryNode": {
          "resources": {
               "limits": {
                    "cpu": "4.0",
                    "memory": "4Gi"
               },
               "requests": {
                    "cpu": "3.0",
                    "memory": "3Gi"
               }
          },
          "replicas": 1
     },
     "indexNode": {
          "resources": {
               "limits": {
                    "cpu": "2.0",
                    "memory": "4Gi"
               },
               "requests": {
                    "cpu": "2.0",
                    "memory": "3Gi"
               }
          },
          "replicas": 1
     },
     "dataNode": {
          "resources": {
               "limits": {
                    "cpu": "2.0",
                    "memory": "2Gi"
               },
               "requests": {
                    "cpu": "2.0",
                    "memory": "2Gi"
               }
          }
     },
     "cluster": {
          "enabled": true
     },
     "pulsar": {},
     "kafka": {},
     "minio": {
          "metrics": {
               "podMonitor": {
                    "enabled": true
               }
          }
     },
     "etcd": {
          "metrics": {
               "enabled": true,
               "podMonitor": {
                    "enabled": true
               }
          }
     },
     "metrics": {
          "serviceMonitor": {
               "enabled": true
          }
     },
     "log": {
          "level": "debug"
     },
     "image": {
          "all": {
               "repository": "harbor.milvus.io/milvus/milvus",
               "tag": "2.5-20241101-47337ea8-amd64"
          }
     }
}

xiaofan-luan commented 4 hours ago

This seems to be a network issue, not a Milvus bug.
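
One way to test the network hypothesis is a plain TCP reachability check from inside the cluster against the dependencies the Milvus pods connect to. The sketch below is a minimal example using only the Python standard library. The thread does not show which connection timed out, so the choice of targets is an assumption, and `can_connect` is an illustrative helper rather than anything shipped with Milvus.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("<etcd-service>", 2379)` would probe etcd on its default client port; the service name is a placeholder to fill in for the actual release, and 6650 (Pulsar broker) and 9000 (MinIO) are the usual defaults for the other two dependencies. A `False` result from within the pod's namespace would be consistent with a network or DNS problem, while `True` for every dependency would point back toward Milvus itself.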