Closed evanofslack closed 1 year ago
Hey @evanofslack, thank you for reporting the issue. I have not been able to reproduce it yet, but it looks like we get a response with no error and no consumer config from the server, although I'm not 100% sure yet. It looks similar to this issue: https://github.com/nats-io/nats.go/issues/1258, where there is also a panic when getting a consumer. I'll investigate and let you know.
Hello @piotrpio thanks for looking into this.
I was able to recreate something similar to this. Please see the code in this gist: https://gist.github.com/evanofslack/97a995458c3aac6ca337b3baa5bd9d43
@evanofslack thank you, in the meantime we were able to find and fix this issue in nats-server: https://github.com/nats-io/nats-server/pull/4610/files
So that will be fixed in the upcoming server release. I will also add a fix in go client to safeguard from panics on older server versions.
Both PRs are merged and will be part of upcoming releases.
Thanks for triaging and fixing! @piotrpio
What version were you using?
server: 2.9.15
client: 1.28.0
What environment was the server running in?
Server running on x86 from the 2.9.15-alpine container.
Is this defect reproducible?
It has occurred multiple times during testing. I am working towards reproducing it but haven't gotten a minimal example yet.
Given the capability you are leveraging, describe your expectation?
There are multiple pods running the same code. Each pod attempts to get a certain consumer by name and, if the consumer doesn't exist, attempts to create it.
This bug tends to occur on pod startup, when each pod has a list of consumers to try to get or create. It typically occurred when working with 500 to 10,000 consumers.
The consumers are on a work queue, so the CreateOrUpdate command does not work: it returns the error about filters needing to be unique on a work queue. The consumers are ephemeral.
The expectation is that there are no crashes due to nil pointer dereferences.
Given the expectation, what is the defect you are observing?