nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

Node Panic on NATS Server v2.5.0 (JetStream) #2578

Closed andreib1 closed 1 year ago

andreib1 commented 2 years ago

One of the nodes in our 3-node k8s cluster wouldn't start and output the following panic stack trace. The only way we could restore it was to redeploy NATS and drop the persistent volumes. Hopefully the stack trace gives some clue. Prior to this, I had attempted to delete the streams and consumers using the NATS CLI.

NATS Server Version: 2.5.0
Deployment method: Kubernetes Helm Chart Version 0.8.9

2021-09-27 17:42:34.539 BST
panic: runtime error: slice bounds out of range [162358:0]

goroutine 1 [running]:
github.com/nats-io/nats-server/server.(*msgBlock).cacheLookup(0xc0000dab60, 0x18f7, 0x0, 0xbba540, 0xc000488260)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:3181 +0x70a
github.com/nats-io/nats-server/server.(*msgBlock).generatePerSubjectInfo(0xc0000dab60, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:4102 +0x108
github.com/nats-io/nats-server/server.(*msgBlock).readPerSubjectInfo(0xc0000dab60, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:4133 +0x66f
github.com/nats-io/nats-server/server.(*fileStore).recoverMsgBlock(0xc000512280, 0xbc6e60, 0xc0000f7d40, 0x3, 0x0, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:594 +0x838
github.com/nats-io/nats-server/server.(*fileStore).recoverMsgs(0xc000512280, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:868 +0x536
github.com/nats-io/nats-server/server.newFileStoreWithCreated(0xc0000a6f60, 0x20, 0x800000, 0x12a05f200, 0x1bf08eb000, 0x0, 0xc0000a0d90, 0x5, 0x0, 0x0, ...)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:305 +0x7b8
github.com/nats-io/nats-server/server.(*stream).setupStore(0xc000512000, 0xc000158f90, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2332 +0x358
github.com/nats-io/nats-server/server.(*Account).addStreamWithAssignment(0xc000110900, 0xc0000d4108, 0x0, 0x0, 0x0, 0x0, 0xc0000d2210)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:379 +0x8d9
github.com/nats-io/nats-server/server.(*Account).addStream(0xc000110900, 0xc0000d4108, 0x200, 0xa11a40, 0xc0000d40f0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:236 +0x3d
github.com/nats-io/nats-server/server.(*Account).EnableJetStream(0xc000110900, 0x0, 0x0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:1096 +0x1994
github.com/nats-io/nats-server/server.(*Server).enableJetStreamAccounts(0xc00019a000, 0xaf6da1, 0x7)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:534 +0x8e
github.com/nats-io/nats-server/server.(*Server).enableJetStream(0xc00019a000, 0x80000000, 0x500000000, 0xc0000a02c0, 0xf, 0x0, 0x0, 0xa5dac0, 0x0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:358 +0x867
github.com/nats-io/nats-server/server.(*Server).EnableJetStream(0xc00019a000, 0xc000159e30, 0x0, 0xc0000a80b0)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:182 +0x316
github.com/nats-io/nats-server/server.(*Server).Start(0xc00019a000)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1644 +0xa18
github.com/nats-io/nats-server/server.Run(...)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/service.go:21
main.main()
	/home/travis/gopath/src/github.com/nats-io/nats-server/main.go:118 +0x188

wallyqs commented 2 years ago

Hi @andreib1, thanks for the report. This should have been fixed in the NATS Server v2.6.0 release via #2545, along with other restart improvements, so we recommend upgrading.