Closed speier closed 3 weeks ago
We have a small cluster of 3 nodes hub, with one leaf node connected, deployed to Kubernetes, and recently one of the cluster members has been restarting many times, throwing this panic:
[762] 2024/10/28 17:16:04.266396 [INF] Starting nats-server [762] 2024/10/28 17:16:04.267248 [INF] Version: 2.10.22 [762] 2024/10/28 17:16:04.267254 [INF] Git: [240e9a4] [762] 2024/10/28 17:16:04.267258 [INF] Cluster: nats [762] 2024/10/28 17:16:04.267261 [INF] Name: nats-1 [762] 2024/10/28 17:16:04.267267 [INF] Node: yrzKKRBu [762] 2024/10/28 17:16:04.267270 [INF] ID: NCHVTW7JN3R6PND2R2CBOM76IFG3PTN7F4CUVIS2LMNMZQ5E3WQ2HL3R [762] 2024/10/28 17:16:04.267312 [INF] Using configuration file: /etc/nats-config/nats.conf [762] 2024/10/28 17:16:04.269297 [INF] Starting http monitor on 0.0.0.0:8222 [762] 2024/10/28 17:16:04.269772 [INF] Starting JetStream [762] 2024/10/28 17:16:04.270595 [INF] _ ___ _____ ___ _____ ___ ___ _ __ __ [762] 2024/10/28 17:16:04.270603 [INF] _ | | __|_ _/ __|_ _| _ \ __| /_\ | \/ | [762] 2024/10/28 17:16:04.270606 [INF] | || | _| | | \__ \ | | | / _| / _ \| |\/| | [762] 2024/10/28 17:16:04.270609 [INF] \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_| |_| [762] 2024/10/28 17:16:04.270612 [INF] [762] 2024/10/28 17:16:04.270615 [INF] https://docs.nats.io/jetstream [762] 2024/10/28 17:16:04.270618 [INF] [762] 2024/10/28 17:16:04.270653 [INF] ---------------- JETSTREAM ---------------- [762] 2024/10/28 17:16:04.270700 [INF] Max Memory: 0 B [762] 2024/10/28 17:16:04.270761 [INF] Max Storage: 100.00 GB [762] 2024/10/28 17:16:04.270793 [INF] Store Directory: "/data/jetstream" [762] 2024/10/28 17:16:04.270844 [INF] Domain: hub [762] 2024/10/28 17:16:04.270879 [INF] ------------------------------------------- [762] 2024/10/28 17:16:04.280885 [INF] Starting JetStream cluster [762] 2024/10/28 17:16:04.280891 [INF] Creating JetStream metadata controller [762] 2024/10/28 17:16:04.281604 [INF] JetStream cluster recovering state [762] 2024/10/28 17:16:04.283689 [INF] Listening for leafnode connections on 0.0.0.0:7422 [762] 2024/10/28 17:16:04.285627 [INF] Listening for client connections on 0.0.0.0:4222 [762] 2024/10/28 17:16:04.285883 [INF] Server is ready [762] 2024/10/28 17:16:04.286279 [INF] Cluster name is nats [762] 2024/10/28 17:16:04.286346 [INF] Listening for route connections on 0.0.0.0:6222 [762] 2024/10/28 17:16:04.297379 [INF] 10.0.64.206:6222 - rid:13 - Route connection created [762] 2024/10/28 17:16:04.297521 [INF] 10.0.64.19:6222 - rid:16 - Route connection created [762] 2024/10/28 17:16:04.297859 [INF] 10.0.64.19:6222 - rid:14 - Route connection created [762] 2024/10/28 17:16:04.297927 [INF] 10.0.64.206:6222 - rid:15 - Route connection created [762] 2024/10/28 17:16:04.348277 [INF] 10.0.64.19:6222 - rid:17 - Route connection created [762] 2024/10/28 17:16:04.354068 [INF] 10.0.64.19:6222 - rid:18 - Route connection created [762] 2024/10/28 17:16:04.383297 [INF] 10.0.64.206:6222 - rid:19 - Route connection created [762] 2024/10/28 17:16:04.386095 [WRN] Waiting for routing to be established... [762] 2024/10/28 17:16:04.386274 [WRN] JetStream has not established contact with a meta leader [762] 2024/10/28 17:16:04.433731 [INF] 10.0.64.206:6222 - rid:20 - Route connection created [762] 2024/10/28 17:16:05.253739 [INF] JetStream cluster new metadata leader: nats-2/nats panic: runtime error: index out of range [0] with length 0 goroutine 36 [running]: github.com/nats-io/nats-server/v2/server/stree.(*SubjectTree[...]).match(0xc00058c8f8?, {0xcde2c8?, 0xc0003502a0?}, {0xc00039a600?, 0x10?, 0x10?}, {0xc000121500, 0x0, 0x100}, 0xc000463bc8?) github.com/nats-io/nats-server/v2/server/stree/stree.go:286 +0x945 github.com/nats-io/nats-server/v2/server/stree.(*SubjectTree[...]).Match(0xce33a0, {0xc0000273c0, 0x1b, 0x1b}, 0xc000463c60) github.com/nats-io/nats-server/v2/server/stree/stree.go:124 +0x11b github.com/nats-io/nats-server/v2/server.(*fileStore).loadLast(0xc0001cee08, {0xc0000273c0, 0x1b}, 0xc0002564d0) github.com/nats-io/nats-server/v2/server/filestore.go:6261 +0x30e github.com/nats-io/nats-server/v2/server.(*fileStore).LoadLastMsg(0xc0003a00f9?, {0xc0000273c0?, 0xc?}, 0xc0002564d0) github.com/nats-io/nats-server/v2/server/filestore.go:6329 +0x53 github.com/nats-io/nats-server/v2/server.(*Server).jsMsgGetRequest(0xc0000f3b08, 0x0?, 0xc000203140?, 0xc000463e68?, {0xc000028540, 0x23}, {0xc0000e47e0, 0x11}, {0xc0003a0000, 0x127, ...}) github.com/nats-io/nats-server/v2/server/jetstream_api.go:3153 +0xcd2 github.com/nats-io/nats-server/v2/server.(*Server).processJSAPIRoutedRequests(0xc0000f3b08) github.com/nats-io/nats-server/v2/server/jetstream_api.go:871 +0x36b github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine.func1() github.com/nats-io/nats-server/v2/server/server.go:3794 +0x32 created by github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine in goroutine 1 github.com/nats-io/nats-server/v2/server/server.go:3792 +0x145
Did not expected to restart with a panic.
nats-server v2.10.22
Kubernetes, GKE AutoPilot
Client Version: v1.28.2 Server Version: v1.31.1-gke.1678000
No response
Thanks for the report, have posted a fix PR.
Observed behavior
We have a small cluster of 3 nodes hub, with one leaf node connected, deployed to Kubernetes, and recently one of the cluster members has been restarting many times, throwing this panic:
Expected behavior
Did not expected to restart with a panic.
Server and client version
nats-server v2.10.22
Host environment
Kubernetes, GKE AutoPilot
Client Version: v1.28.2 Server Version: v1.31.1-gke.1678000
Steps to reproduce
No response