nats-io / nats.go

Golang client for NATS, the cloud native messaging system.
https://nats.io
Apache License 2.0
5.53k stars 696 forks source link

Nil pointer dereference in new Jetstream API when getting consumer #1422

Closed evanofslack closed 1 year ago

evanofslack commented 1 year ago

What version were you using?

server: 2.9.15 client: 1.28.0

What environment was the server running in?

Server running on x86 from 2.9.15-alpine container

Is this defect reproducible?

Has occurred multiple times during testing. Working towards reproducing, havent gotten a minimal example yet.

Given the capability you are leveraging, describe your expectation?

There are multiple pods running the same code, and each pod will attempt to get a certain consumer by name and if the consumer doesnt exist, will attempt to create the consumer.

This bug tends to occur on pod startup, where it have a list of consumers to try and get/create. This bug occured typically when working with 500- 10,000 consumers.

The consumers are on a work queue, so the CreateOrUpdate command does not work, as it returns the error about filters needing to be unique on work queue. The consumers are ephemeral.

The expectation is that there are no crashes due to nil pointer dereferences.

Given the expectation, what is the defect you are observing?

panic: runtime error: invalid memory address or nil pointer dereference                                                           │
[signal SIGSEGV: segmentation violation code=0x1 addr=0x50 pc=0xbfcb19]                                                           │
                                                                                                                                  │
goroutine 6151 [running]:                                                                                                         │
github.com/nats-io/nats.go/jetstream.getConsumer({0x1dbca10?, 0xc00007cc80?}, 0xc001635140, {0x1953ca0, 0x2}, {0xc002ba48a0, 0xa})│
        /home/users/eslack/go/pkg/mod/github.com/nats-io/nats.go@v1.28.0/jetstream/consumer.go:178 +0x499                         │
github.com/nats-io/nats.go/jetstream.(*jetStream).Consumer(0xc00007f620?, {0x1dbca10, 0xc00007cc80}, {0x1953ca0, 0x2}, {0xc002ba48│
        /home/users/eslack/go/pkg/mod/github.com/nats-io/nats.go@v1.28.0/jetstream/jetstream.go:512 +0x8d                         
piotrpio commented 1 year ago

Hey @evanofslack, thank you for reporting the issue. I was not able to reproduce yet, but it looks like we get a response with no error and no consumer config from the server, although I'm not 100% sure yet. It looks like it's a similar issue to this: https://github.com/nats-io/nats.go/issues/1258, where there is also panic when getting a consumer. I'll investigate and let you know.

evanofslack commented 1 year ago

Hello @piotrpio thanks for looking into this.

I was able to recreate something similar to this. Please see the code in this gist: https://gist.github.com/evanofslack/97a995458c3aac6ca337b3baa5bd9d43

piotrpio commented 1 year ago

@evanofslack thank you, in the meantime we were able to find and fix this issue in nats-server: https://github.com/nats-io/nats-server/pull/4610/files

So that will be fixed in the upcoming server release. I will also add a fix in go client to safeguard from panics on older server versions.

piotrpio commented 1 year ago

Both PRs are merged and will be part of upcoming releases.

evanofslack commented 1 year ago

Thanks for triaging and fixing! @piotrpio