milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: When the query node is rolling upgraded, with 3 replicas, the first two query nodes are upgraded quickly, but the third query node waits for 30 minutes before completing the upgrade. #36426

Open zhuwenxing opened 3 weeks ago

zhuwenxing commented 3 weeks ago

Is there an existing issue for this?

Environment

- Milvus version: 2.4.3 --> master-20240920-eb23e23c-amd64
- Deployment mode(standalone or cluster): mixcoord
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

image

Judging by the ages of the query node pods, there was a gap of about 30 minutes between the upgrade of the second query node and the upgrade of the third.
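For reference, the gap can be estimated from the AGE column of the post-upgrade pod listing in the Milvus Log section. The snippet below is only a rough sketch: the helper `age_to_seconds` is a hypothetical name introduced here, the ages are copied by hand from that listing, and kubectl rounds ages to whole minutes at this scale, so the result is approximate.

```python
import re

def age_to_seconds(age: str) -> int:
    """Convert a kubectl AGE string such as '8m42s', '44m' or '3m8s' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject anything that is not a plain sequence of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * units[u] for n, u in parts)

# Ages copied from the "after upgrading" listing (snapshot taken at 18:06:13Z).
querynode_ages = {
    "querynode-1-576c9fc85-mzhcs": "44m",
    "querynode-1-576c9fc85-w2tkm": "43m",
    "querynode-1-576c9fc85-p5m9r": "12m",
}

# Oldest pod first, i.e. the order in which the new pods were created.
ordered = sorted(
    querynode_ages.items(),
    key=lambda item: age_to_seconds(item[1]),
    reverse=True,
)

# Minutes between consecutive pod creations.
gaps_minutes = [
    (age_to_seconds(earlier[1]) - age_to_seconds(later[1])) // 60
    for earlier, later in zip(ordered, ordered[1:])
]
print(gaps_minutes)  # [1, 31]
```

The first two new query nodes came up about 1 minute apart, while the third followed roughly 31 minutes later.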

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/rolling_update_for_operator_test_simple/detail/rolling_update_for_operator_test_simple/4975/pipeline

log: artifacts-kafka-mixcoord-4975-server-logs.tar.gz

cluster: 4am

ns: chaos-testing

pod info before upgrading

[2024-09-22T17:06:39.621Z] + kubectl get pods -o wide

[2024-09-22T17:06:39.623Z] + grep kafka-mixcoord-4975

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-etcd-0                                        1/1     Running       0                3m8s    10.104.23.130   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-etcd-1                                        1/1     Running       0                3m8s    10.104.18.18    4am-node25   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-etcd-2                                        1/1     Running       0                3m8s    10.104.34.116   4am-node37   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-0                                       2/2     Running       0                3m7s    10.104.23.133   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-1                                       2/2     Running       0                3m7s    10.104.34.118   4am-node37   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-2                                       2/2     Running       0                3m7s    10.104.18.21    4am-node25   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-exporter-578868bb6-qp47p                1/1     Running       3 (2m46s ago)    3m7s    10.104.14.121   4am-node18   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-zookeeper-0                             1/1     Running       0                3m7s    10.104.23.132   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-zookeeper-1                             1/1     Running       0                3m7s    10.104.25.150   4am-node30   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-kafka-zookeeper-2                             1/1     Running       0                3m7s    10.104.30.205   4am-node38   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-datanode-c989b6bc-5fk95                1/1     Running       0                117s    10.104.14.125   4am-node18   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-datanode-c989b6bc-9frjz                1/1     Running       0                117s    10.104.6.167    4am-node13   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-datanode-c989b6bc-mnkzp                1/1     Running       0                117s    10.104.23.137   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-indexnode-7df544f557-6h6rp             1/1     Running       0                117s    10.104.14.126   4am-node18   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-indexnode-7df544f557-7pd74             1/1     Running       0                117s    10.104.6.166    4am-node13   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-indexnode-7df544f557-dtlh4             1/1     Running       0                117s    10.104.23.136   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-mixcoord-5d5cdcf7bf-5g24n              1/1     Running       0                117s    10.104.6.168    4am-node13   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-proxy-9b74d69f5-qs4ml                  1/1     Running       0                117s    10.104.6.165    4am-node13   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-querynode-0-dfc85597d-ct4qq            1/1     Running       0                116s    10.104.14.127   4am-node18   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-querynode-0-dfc85597d-dqhbq            1/1     Running       0                116s    10.104.6.169    4am-node13   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-milvus-querynode-0-dfc85597d-t4x69            1/1     Running       0                116s    10.104.23.138   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-minio-0                                       1/1     Running       0                3m8s    10.104.23.131   4am-node27   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-minio-1                                       1/1     Running       0                3m8s    10.104.34.117   4am-node37   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-minio-2                                       1/1     Running       0                3m8s    10.104.25.151   4am-node30   <none>           <none>

[2024-09-22T17:06:39.878Z] kafka-mixcoord-4975-minio-3                                       1/1     Running       0                3m8s    10.104.18.19    4am-node25   <none>           <none>

pod info after upgrading

[2024-09-22T18:06:13.678Z] + kubectl get pods -o wide

[2024-09-22T18:06:13.678Z] + grep kafka-mixcoord-4975

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-etcd-0                                       1/1     Running       0                62m     10.104.23.130   4am-node27   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-etcd-1                                       1/1     Running       0                62m     10.104.18.18    4am-node25   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-etcd-2                                       1/1     Running       0                62m     10.104.34.116   4am-node37   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-0                                      2/2     Running       0                62m     10.104.23.133   4am-node27   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-1                                      2/2     Running       0                62m     10.104.34.118   4am-node37   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-2                                      2/2     Running       0                62m     10.104.18.21    4am-node25   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-exporter-578868bb6-qp47p               1/1     Running       3 (62m ago)      62m     10.104.14.121   4am-node18   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-zookeeper-0                            1/1     Running       0                62m     10.104.23.132   4am-node27   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-zookeeper-1                            1/1     Running       0                62m     10.104.25.150   4am-node30   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-kafka-zookeeper-2                            1/1     Running       0                62m     10.104.30.205   4am-node38   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-datanode-787849f54f-59w8m             1/1     Running       0                8m42s   10.104.5.129    4am-node12   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-datanode-787849f54f-8s4xk             1/1     Running       0                8m      10.104.19.204   4am-node28   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-datanode-787849f54f-hscsj             1/1     Running       0                9m22s   10.104.16.67    4am-node21   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-indexnode-6fccf5b79-jgzbl             1/1     Running       0                53m     10.104.32.193   4am-node39   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-indexnode-6fccf5b79-nf86q             1/1     Running       0                52m     10.104.16.19    4am-node21   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-indexnode-6fccf5b79-sz7p5             1/1     Running       0                52m     10.104.21.214   4am-node24   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-mixcoord-bb675c875-nvv2n              1/1     Running       1 (48m ago)      48m     10.104.21.215   4am-node24   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-proxy-5957cd79fc-lcwzk                1/1     Running       0                4m59s   10.104.21.221   4am-node24   <none>           <none>

[2024-09-22T18:06:13.679Z] kafka-mixcoord-4975-milvus-querynode-1-576c9fc85-mzhcs           1/1     Running       0                44m     10.104.16.25    4am-node21   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-milvus-querynode-1-576c9fc85-p5m9r           1/1     Running       0                12m     10.104.5.124    4am-node12   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-milvus-querynode-1-576c9fc85-w2tkm           1/1     Running       0                43m     10.104.32.204   4am-node39   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-minio-0                                      1/1     Running       0                62m     10.104.23.131   4am-node27   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-minio-1                                      1/1     Running       0                62m     10.104.34.117   4am-node37   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-minio-2                                      1/1     Running       0                62m     10.104.25.151   4am-node30   <none>           <none>

[2024-09-22T18:06:13.680Z] kafka-mixcoord-4975-minio-3                                      1/1     Running       0                62m     10.104.18.19    4am-node25   <none>           <none>

Anything else?

No response

zhuwenxing commented 3 weeks ago

/assign @weiliu1031 PTAL