milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Milvus install failed due to connecting to Minio failure #21097

Closed zhuwenxing closed 1 year ago

zhuwenxing commented 1 year ago

Is there an existing issue for this?

Environment

- Milvus version:2.2.0-20221208-61592cde
- Deployment mode(standalone or cluster): cluster
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

image

querynode log:

[2022/12/08 23:31:33.355 +00:00] [WARN] [storage/minio_chunk_manager.go:103] ["failed to check blob bucket exist"] [bucket=milvus-bucket] [error="Server not initialized, please try again."]
[2022/12/08 23:31:36.356 +00:00] [WARN] [storage/minio_chunk_manager.go:103] ["failed to check blob bucket exist"] [bucket=milvus-bucket] [error="Server not initialized, please try again."]
[2022/12/08 23:31:39.357 +00:00] [ERROR] [querynode/query_node.go:256] ["QueryNode init vector storage failed"] [error="All attempts results:\nattempt #1:Server not initialized, please try again.\nattempt #2:Server not initialized, please try again.\nattempt #3:Server not initialized, please try again.\nattempt #4:Server not initialized, please try again.\nattempt #5:Server not initialized, please try again.\nattempt #6:Server not initialized, please try again.\nattempt #7:Server not initialized, please try again.\nattempt #8:Server not initialized, please try again.\nattempt #9:Server not initialized, please try again.\nattempt #10:Server not initialized, please try again.\nattempt #11:Server not initialized, please try again.\n"] [stack="github.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:256\nsync.(*Once).doSlow\n\t/usr/local/go/src/sync/once.go:68\nsync.(*Once).Do\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:234\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:133\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:102"]
[2022/12/08 23:31:39.357 +00:00] [ERROR] [querynode/service.go:134] ["QueryNode init error: "] [error="All attempts results:\nattempt #1:Server not initialized, please try again.\nattempt #2:Server not initialized, please try again.\nattempt #3:Server not initialized, please try again.\nattempt #4:Server not initialized, please try again.\nattempt #5:Server not initialized, please try again.\nattempt #6:Server not initialized, please try again.\nattempt #7:Server not initialized, please try again.\nattempt #8:Server not initialized, please try again.\nattempt #9:Server not initialized, please try again.\nattempt #10:Server not initialized, please try again.\nattempt #11:Server not initialized, please try again.\n"] [stack="github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:134\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:102"]
panic: All attempts results:
attempt #1:Server not initialized, please try again.
attempt #2:Server not initialized, please try again.
attempt #3:Server not initialized, please try again.
attempt #4:Server not initialized, please try again.
attempt #5:Server not initialized, please try again.
attempt #6:Server not initialized, please try again.
attempt #7:Server not initialized, please try again.
attempt #8:Server not initialized, please try again.
attempt #9:Server not initialized, please try again.
attempt #10:Server not initialized, please try again.
attempt #11:Server not initialized, please try again.

goroutine 243 [running]:
github.com/milvus-io/milvus/cmd/components.(*QueryNode).Run(0x569bfe0?)
    /go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:55 +0x56
github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1()
    /go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:102 +0x18e
created by github.com/milvus-io/milvus/cmd/roles.runComponent[...]
    /go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:85 +0x18a

Minio log:

MINIO_POD_KILL_365_KAFKA_SERVICE_HOST MINIO_POD_KILL_365_KAFKA_SERVICE_PORT MINIO_POD_KILL_365_KAFKA_METRICS_SERVICE_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_SERVICE_HOST MINIO_POD_KILL_365_PORT_9000_TCP_PROTO MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP_ADDR MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_9091_TCP] (*fmt.wrapError)
       2: cmd/bootstrap-peer-server.go:209:cmd.verifyServerSystemConfig()
       1: cmd/server-main.go:491:cmd.serverMain()

API: SYSTEM()
Time: 23:31:52 UTC 12/08/2022
Error: http://kafka-pod-failure-369-minio-3.kafka-pod-failure-369-minio-svc.chaos-testing.svc.cluster.local:9000/export has incorrect configuration: Expected same MINIO_ environment variables and values across all servers: Missing environment values: [MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_9091_TCP_PROTO MINIO_POD_KILL_365_MILVUS_SERVICE_PORT MINIO_POD_KILL_365_KAFKA_PORT_9092_TCP_PORT MINIO_POD_KILL_365_MILVUS_ROOTCOORD_SERVICE_HOST MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_13333_TCP_ADDR MINIO_POD_KILL_365_ZOOKEEPER_PORT_3888_TCP_PROTO MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_9091_TCP_ADDR MINIO_POD_KILL_365_ETCD_PORT_2379_TCP_ADDR MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_19531_TCP_PORT MINIO_POD_KILL_365_ETCD_SERVICE_HOST MINIO_POD_KILL_365_ETCD_PORT_2380_TCP_ADDR MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_53100_TCP_PROTO MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_9091_TCP_PROTO MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_9091_TCP_PORT MINIO_POD_KILL_365_KAFKA_METRICS_PORT_9308_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_9091_TCP_ADDR MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_9091_TCP_ADDR MINIO_POD_KILL_365_MILVUS_DATACOORD_SERVICE_HOST MINIO_POD_KILL_365_KAFKA_JMX_METRICS_PORT_5556_TCP MINIO_POD_KILL_365_MILVUS_PORT_19530_TCP_PORT MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT MINIO_POD_KILL_365_PORT_9000_TCP MINIO_POD_KILL_365_ZOOKEEPER_PORT_2888_TCP_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_PORT_5556_TCP_PROTO MINIO_POD_KILL_365_MILVUS_PORT_9091_TCP_ADDR MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_9091_TCP MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_9091_TCP_PROTO MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_9091_TCP_PORT MINIO_POD_KILL_365_ZOOKEEPER_SERVICE_PORT_TCP_CLIENT MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP_PROTO MINIO_POD_KILL_365_ETCD_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_3888_TCP_PORT MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_53100_TCP_PORT MINIO_POD_KILL_365_SERVICE_PORT_HTTP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_9091_TCP_PROTO 
MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_9091_TCP_ADDR MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_9091_TCP_PORT MINIO_POD_KILL_365_ZOOKEEPER_SERVICE_PORT_TCP_ELECTION MINIO_POD_KILL_365_ETCD_PORT_2380_TCP_PORT MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_19531_TCP MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_13333_TCP_PROTO MINIO_POD_KILL_365_MILVUS_ROOTCOORD_SERVICE_PORT MINIO_POD_KILL_365_ETCD_PORT_2380_TCP MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT MINIO_POD_KILL_365_MILVUS_SERVICE_HOST MINIO_POD_KILL_365_ZOOKEEPER_SERVICE_HOST MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_19531_TCP_ADDR MINIO_POD_KILL_365_MILVUS_DATACOORD_SERVICE_PORT_METRICS MINIO_POD_KILL_365_KAFKA_JMX_METRICS_PORT_5556_TCP_ADDR MINIO_POD_KILL_365_KAFKA_METRICS_PORT_9308_TCP_ADDR MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_53100_TCP_ADDR MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_13333_TCP MINIO_POD_KILL_365_KAFKA_PORT_9092_TCP_PROTO MINIO_POD_KILL_365_ETCD_PORT_2379_TCP_PORT MINIO_POD_KILL_365_SERVICE_PORT MINIO_POD_KILL_365_PORT_9000_TCP_PORT MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_9091_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_31000_TCP_ADDR MINIO_POD_KILL_365_MILVUS_QUERYCOORD_SERVICE_HOST MINIO_POD_KILL_365_MILVUS_QUERYCOORD_SERVICE_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_PORT_5556_TCP_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_SERVICE_PORT MINIO_POD_KILL_365_MILVUS_PORT_19530_TCP MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP MINIO_POD_KILL_365_ZOOKEEPER_PORT_3888_TCP_ADDR MINIO_POD_KILL_365_MILVUS_INDEXCOORD_SERVICE_PORT_METRICS MINIO_POD_KILL_365_MILVUS_PORT_19530_TCP_PROTO MINIO_POD_KILL_365_MILVUS_ROOTCOORD_SERVICE_PORT_METRICS MINIO_POD_KILL_365_KAFKA_METRICS_PORT_9308_TCP_PORT MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_9091_TCP MINIO_POD_KILL_365_MILVUS_ROOTCOORD_SERVICE_PORT_ROOTCOORD MINIO_POD_KILL_365_KAFKA_METRICS_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_SERVICE_PORT_HTTP_METRICS MINIO_POD_KILL_365_ZOOKEEPER_SERVICE_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT 
MINIO_POD_KILL_365_ETCD_SERVICE_PORT_CLIENT MINIO_POD_KILL_365_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_2888_TCP_ADDR MINIO_POD_KILL_365_KAFKA_SERVICE_HOST MINIO_POD_KILL_365_KAFKA_METRICS_SERVICE_PORT_HTTP_METRICS MINIO_POD_KILL_365_KAFKA_METRICS_SERVICE_HOST MINIO_POD_KILL_365_KAFKA_SERVICE_PORT MINIO_POD_KILL_365_KAFKA_METRICS_SERVICE_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_2181_TCP_ADDR MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_9091_TCP MINIO_POD_KILL_365_KAFKA_JMX_METRICS_SERVICE_HOST MINIO_POD_KILL_365_PORT_9000_TCP_PROTO MINIO_POD_KILL_365_MILVUS_QUERYCOORD_PORT_19531_TCP_PROTO MINIO_POD_KILL_365_ETCD_PORT_2379_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_SERVICE_PORT_INDEXCOORD MINIO_POD_KILL_365_ETCD_SERVICE_PORT_PEER MINIO_POD_KILL_365_ETCD_SERVICE_PORT MINIO_POD_KILL_365_ZOOKEEPER_PORT_3888_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_SERVICE_HOST MINIO_POD_KILL_365_ETCD_PORT_2380_TCP_PROTO MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_31000_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_SERVICE_PORT MINIO_POD_KILL_365_MILVUS_SERVICE_PORT_METRICS MINIO_POD_KILL_365_MILVUS_PORT_9091_TCP MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_31000_TCP_PORT MINIO_POD_KILL_365_KAFKA_PORT_9092_TCP MINIO_POD_KILL_365_ETCD_PORT_2379_TCP_PROTO MINIO_POD_KILL_365_KAFKA_METRICS_PORT_9308_TCP_PROTO MINIO_POD_KILL_365_MILVUS_SERVICE_PORT_MILVUS MINIO_POD_KILL_365_MILVUS_DATACOORD_SERVICE_PORT_DATACOORD MINIO_POD_KILL_365_MILVUS_PORT_9091_TCP_PROTO MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT_31000_TCP_PROTO MINIO_POD_KILL_365_MILVUS_QUERYCOORD_SERVICE_PORT_QUERYCOORD MINIO_POD_KILL_365_MILVUS_PORT_9091_TCP_PORT MINIO_POD_KILL_365_KAFKA_JMX_METRICS_PORT MINIO_POD_KILL_365_MILVUS_PORT_19530_TCP_ADDR MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_13333_TCP_PORT MINIO_POD_KILL_365_MILVUS_DATACOORD_PORT_9091_TCP_PORT MINIO_POD_KILL_365_PORT_9000_TCP_ADDR MINIO_POD_KILL_365_ZOOKEEPER_SERVICE_PORT_TCP_FOLLOWER MINIO_POD_KILL_365_ZOOKEEPER_PORT_2888_TCP 
MINIO_POD_KILL_365_MILVUS_INDEXCOORD_PORT MINIO_POD_KILL_365_KAFKA_SERVICE_PORT_TCP_CLIENT MINIO_POD_KILL_365_KAFKA_PORT_9092_TCP_ADDR MINIO_POD_KILL_365_MILVUS_PORT MINIO_POD_KILL_365_KAFKA_PORT MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT_53100_TCP MINIO_POD_KILL_365_SERVICE_HOST MINIO_POD_KILL_365_ZOOKEEPER_PORT_2888_TCP_PROTO MINIO_POD_KILL_365_MILVUS_ROOTCOORD_PORT MINIO_POD_KILL_365_MILVUS_QUERYCOORD_SERVICE_PORT_METRICS MINIO_POD_KILL_365_MILVUS_DATACOORD_SERVICE_PORT] (*fmt.wrapError)
       2: cmd/bootstrap-peer-server.go:209:cmd.verifyServerSystemConfig()
       1: cmd/server-main.go:491:cmd.serverMain()

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-kafka-for-release-cron/detail/chaos-test-kafka-for-release-cron/369/pipeline

log:

artifacts-kafka-pod-failure-369-server-logs (1).tar.gz artifacts-kafka-pod-failure-369-pytest-logs.tar.gz

Anything else?

No response

zhuwenxing commented 1 year ago

/unassign @yanliang567 /assign @LoveEachDay

Please help take a look at why MinIO is not ready.

LoveEachDay commented 1 year ago

Error: http://kafka-pod-failure-369-minio-3.kafka-pod-failure-369-minio-svc.chaos-testing.svc.cluster.local:9000/export has incorrect configuration: Expected same MINIO_ environment variables and values across all servers:

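By default, Kubernetes exports every service in a namespace to all pods in that namespace as environment variables named after the service. MinIO checks every variable prefixed with MINIO_ and expects the same set and values on all servers; if they are not consistent, it refuses to start. Here, another release in the chaos-testing namespace was named minio-pod-kill-365, so its services showed up as MINIO_POD_KILL_365_* variables. You should not deploy a release whose name starts with minio.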
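To illustrate why a release name starting with minio triggers this check, here is a minimal sketch of how Kubernetes derives service environment variable names (service name upper-cased, dashes turned into underscores). The helper names `service_env_names` and `collides_with_minio` are hypothetical and only for illustration; the service name and port are taken from the variables listed in the MinIO log above.

```python
def service_env_names(service_name, ports):
    """Return the env var names Kubernetes injects for one service.

    ports is a list of (port_number, protocol) tuples, e.g. [(9092, "tcp")].
    Hypothetical helper: it mirrors the naming convention only, not the values.
    """
    prefix = service_name.upper().replace("-", "_")
    names = [
        f"{prefix}_SERVICE_HOST",
        f"{prefix}_SERVICE_PORT",
        f"{prefix}_PORT",
    ]
    for port, proto in ports:
        base = f"{prefix}_PORT_{port}_{proto.upper()}"
        names += [base, f"{base}_PROTO", f"{base}_PORT", f"{base}_ADDR"]
    return names


def collides_with_minio(service_name):
    """True if the service's env vars would start with MINIO_."""
    prefix = service_name.upper().replace("-", "_")
    return f"{prefix}_".startswith("MINIO_")


if __name__ == "__main__":
    # Service created by a release named "minio-pod-kill-365".
    for name in service_env_names("minio-pod-kill-365-kafka", [(9092, "tcp")]):
        print(name)
    print(collides_with_minio("minio-pod-kill-365-kafka"))
    print(collides_with_minio("kafka-pod-failure-369-minio"))
```

Only the first service name collides: its variables begin with MINIO_, so MinIO treats them as its own configuration, while a name that merely ends in minio (such as kafka-pod-failure-369-minio) does not.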

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.