milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Standalone deployment panics when Pulsar is used as the message queue and the streaming node is enabled #36385

Closed zhuwenxing closed 1 month ago

zhuwenxing commented 1 month ago

Is there an existing issue for this?

Environment

- Milvus version:master-20240919-f6526121-amd64
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Command used for the install:

helm install --wait --debug --timeout 600s standalone-pod-kill-7 milvus/milvus \
  --set streaming.enabled=true \
  --set pulsar.enabled=true \
  --set kafka.enabled=false \
  --set image.all.repository=harbor.milvus.io/milvus/milvus \
  --set image.all.tag=master-20240919-f6526121-amd64 \
  --set metrics.serviceMonitor.enabled=true \
  --set etcd.metrics.enabled=true \
  --set etcd.metrics.podMonitor.enabled=true \
  --set etcd.metrics.podMonitor.namespace=chaos-testing \
  --set quotaAndLimits.enabled=false \
  -f ../standalone-values.yaml \
  -n=chaos-testing

Log output leading up to the panic:

[2024/09/20 03:46:27.956 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/09/20 03:46:27.958 +00:00] [DEBUG] [sessionutil/session_util.go:272] ["Session connect to etcd success"]
[2024/09/20 03:46:27.958 +00:00] [INFO] [streamingnode/service.go:307] ["StreamingNode try to wait for DataCoord ready"]
[2024/09/20 03:46:27.959 +00:00] [DEBUG] [sessionutil/session_util.go:620] ["SessionUtil GetSessions"] [prefix=datacoord] [key=datacoord] [address=10.104.20.251:13333]
[2024/09/20 03:46:27.959 +00:00] [DEBUG] [client/client.go:93] ["DataCoordClient GetSessions success"] [address=10.104.20.251:13333] [serverID=18]
[2024/09/20 03:46:27.960 +00:00] [DEBUG] [datacoord/services.go:686] ["DataCoord current state"] [StateCode=Healthy]
[2024/09/20 03:46:27.960 +00:00] [INFO] [componentutil/componentutil.go:61] ["WaitForComponentStates success"] ["current state"=Healthy]
[2024/09/20 03:46:27.960 +00:00] [INFO] [streamingnode/service.go:327] ["create StreamingNode server..."]
[2024/09/20 03:46:27.960 +00:00] [INFO] [syncmgr/sync_manager.go:61] ["sync manager initialized"] [initPoolSize=256]
[2024/09/20 03:46:27.961 +00:00] [INFO] [server/server.go:34] ["init streamingnode server..."]
[2024/09/20 03:46:27.961 +00:00] [INFO] [streamingnode/service.go:181] ["init StreamingNode server finished"]
panic: mq %!s(MISSING) is only valid in standalone mode

goroutine 673 [running]:
panic({0x6644d00?, 0xc002753680?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc001d03cb0 sp=0xc001d03c00 pc=0x2179d8c
github.com/milvus-io/milvus/internal/util/streamingutil/util.mustSelectWALName(0x20?, {0xc0011736d8, 0x7}, {0x1, 0x1, 0x1, 0x0})
    /workspace/source/internal/util/streamingutil/util/wal_selector.go:48 +0xc5 fp=0xc001d03d10 sp=0xc001d03cb0 pc=0x48d4945
github.com/milvus-io/milvus/internal/util/streamingutil/util.MustSelectWALName()
    /workspace/source/internal/util/streamingutil/util/wal_selector.go:36 +0xa5 fp=0xc001d03d58 sp=0xc001d03d10 pc=0x48d4865
github.com/milvus-io/milvus/internal/streamingnode/server/walmanager.OpenManager()
    /workspace/source/internal/streamingnode/server/walmanager/manager_impl.go:21 +0x1f fp=0xc001d03e20 sp=0xc001d03d58 pc=0x5dff65f
github.com/milvus-io/milvus/internal/streamingnode/server.(*Server).initBasicComponent(...)
    /workspace/source/internal/streamingnode/server/server.go:64
github.com/milvus-io/milvus/internal/streamingnode/server.(*Server).Init(0xc002889080, {0x72993c8, 0xa09b1a0})
    /workspace/source/internal/streamingnode/server/server.go:36 +0x3f fp=0xc001d03e58 sp=0xc001d03e20 pc=0x5e0bbdf
github.com/milvus-io/milvus/internal/distributed/streamingnode.(*Server).init(0xc0017fc630, {0x72993c8, 0xa09b1a0})
    /workspace/source/internal/distributed/streamingnode/service.go:218 +0x2a9 fp=0xc001d03f18 sp=0xc001d03e58 pc=0x5e0cb49
github.com/milvus-io/milvus/internal/distributed/streamingnode.(*Server).Run(0x3?)
    /workspace/source/internal/distributed/streamingnode/service.go:107 +0x2a fp=0xc001d03f58 sp=0xc001d03f18 pc=0x5e0c14a
github.com/milvus-io/milvus/cmd/components.(*StreamingNode).Run(0xc0008273e0?)
    <autogenerated>:1 +0x1f fp=0xc001d03f70 sp=0xc001d03f58 pc=0x5e126df
github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1()
    /workspace/source/cmd/roles/roles.go:125 +0xde fp=0xc001d03fe0 sp=0xc001d03f70 pc=0x5e1bbbe
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc001d03fe8 sp=0xc001d03fe0 pc=0x21b35e1
created by github.com/milvus-io/milvus/cmd/roles.runComponent[...] in goroutine 1
    /workspace/source/cmd/roles/roles.go:117 +0x138
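
The `%!s(MISSING)` fragment in the panic message is what Go's `fmt` package prints when a format string contains `%s` but no operand is supplied for it, so the queue name that triggered the panic is lost from the log. The sketch below reproduces that behaviour in isolation. The `render` helper is hypothetical and is not taken from the Milvus source, and `rocksmq` is only an illustrative operand.

```go
package main

import "fmt"

// render is a hypothetical helper (not Milvus code) that forwards a format
// string and its operands to fmt.Sprintf. Operands are passed as a slice so
// that the missing-operand case can be shown next to the complete one.
func render(format string, args []any) string {
	return fmt.Sprintf(format, args...)
}

func main() {
	const format = "mq %s is only valid in standalone mode"

	// No operand for %s: fmt substitutes the marker %!s(MISSING).
	fmt.Println(render(format, nil))

	// Operand supplied: the queue name appears in the message.
	// "rocksmq" is an illustrative value, not read from the issue.
	fmt.Println(render(format, []any{"rocksmq"}))
}
```

The first call prints the same text as the panic above. The second shows what the message would look like if the queue name were passed through.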

[2024-09-20T03:48:44.280Z] + kubectl get pods -o wide
[2024-09-20T03:48:44.281Z] + grep standalone-pod-kill-7
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-etcd-0                                     1/1     Running            0               8m35s   10.104.24.41    4am-node29   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-etcd-1                                     1/1     Running            0               8m35s   10.104.26.213   4am-node32   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-etcd-2                                     1/1     Running            0               8m35s   10.104.16.187   4am-node21   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-milvus-standalone-585f98c578-xtn87         0/1     CrashLoopBackOff   6 (2m16s ago)   8m35s   10.104.20.251   4am-node22   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-minio-7fdbc86bd-hq4mk                      1/1     Running            0               8m35s   10.104.30.17    4am-node38   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-bookie-0                            1/1     Running            0               8m35s   10.104.21.30    4am-node24   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-bookie-1                            1/1     Running            0               8m35s   10.104.17.131   4am-node23   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-bookie-2                            1/1     Running            0               8m35s   10.104.30.19    4am-node38   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-bookie-init-f4bbh                   0/1     Completed          0               8m35s   10.104.6.226    4am-node13   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-broker-0                            1/1     Running            0               8m35s   10.104.6.228    4am-node13   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-proxy-0                             1/1     Running            0               8m35s   10.104.30.18    4am-node38   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-pulsar-init-vlqqx                   0/1     Completed          0               8m35s   10.104.25.178   4am-node30   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-recovery-0                          1/1     Running            0               8m35s   10.104.21.29    4am-node24   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-zookeeper-0                         1/1     Running            0               8m35s   10.104.24.42    4am-node29   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-zookeeper-1                         1/1     Running            0               8m4s    10.104.32.150   4am-node39   <none>           <none>
[2024-09-20T03:48:44.537Z] standalone-pod-kill-7-pulsar-zookeeper-2                         1/1     Running            0               7m33s   10.104.27.113   4am-node31   <none>           <none>

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

zhuwenxing commented 1 month ago

/assign @yellow-shine
/assign @chyezh
PTAL

chyezh commented 1 month ago

Add this value to the Helm values, for example with --set on the install command:

.standalone.messageQueue="pulsar"
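
One way to read this fix, as a rough sketch rather than the actual Milvus logic: the trace shows `mustSelectWALName` receiving a flag, a 7-character mq type string (the `0x7` length is consistent with `rocksmq`), and a set of enable flags. If the selector rejects embedded queues whenever the calling role is not treated as standalone, then setting the queue type to `pulsar` explicitly avoids the rejected branch. The `selectWAL` function and its rules below are assumptions made for illustration only.

```go
package main

import "fmt"

// selectWAL is a hypothetical stand-in for the selector seen in the trace.
// It is NOT the Milvus implementation; the rule it encodes is an assumption
// inferred from the panic message.
func selectWAL(standalone bool, mqType string) (string, error) {
	switch mqType {
	case "rocksmq", "natsmq":
		// Assumed rule: embedded queues are accepted only for a standalone role.
		if !standalone {
			return "", fmt.Errorf("mq %s is only valid in standalone mode", mqType)
		}
	}
	return mqType, nil
}

func main() {
	// Role not treated as standalone, embedded queue configured: rejected.
	_, err := selectWAL(false, "rocksmq")
	fmt.Println("rocksmq:", err)

	// Same role with the queue type set explicitly to pulsar: accepted.
	name, err := selectWAL(false, "pulsar")
	fmt.Println("pulsar:", name, err)
}
```

Under these assumptions the first call returns an error and the second returns `pulsar`, which matches the behaviour change the comment describes.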