nats-io / nats.go

Golang client for NATS, the cloud native messaging system.
https://nats.io
Apache License 2.0
5.54k stars 696 forks source link

Panic when calling consContext.Stop() #1453

Closed evanofslack closed 11 months ago

evanofslack commented 1 year ago

Observed behavior

When making concurrent calls to consContext.Stop(), there exists a race condition that can lead to a panic due to closing a closed chan.

panic: close of closed channel

goroutine 10708 [running]:
github.com/nats-io/nats.go/jetstream.(*pullSubscription).Stop(0x140000a6b00)
        /Users/eslack/go/pkg/mod/github.com/nats-io/nats.go@v1.30.2/jetstream/pull.go:596 +0x40
main.raceToStopConsumer.func2()
        /Users/eslack/Desktop/personal/projects/nats-consumer-nil/main.go:62 +0x30
created by main.raceToStopConsumer
        /Users/eslack/Desktop/personal/projects/nats-consumer-nil/main.go:61 +0x170
exit status 2

This appears to happen due to improper guarding of the close call in jetstream/pull.go.

if atomic.LoadUint32(&s.closed) == 1 {
    return
}
atomic.StoreUint32(&s.closed, 1)
close(s.done)

It appears there is a potential race between the load and the store operations. This should likely be an atomic.CompareAndSwapUint32 instead.

Expected behavior

The expectation is that consContext.Stop() never panics

Server and client version

nats-server v2.9.15 client: 1.31.0

Host environment

No response

Steps to reproduce

Please see this gist that reproduces the issue. https://gist.github.com/evanofslack/58ca4786dd37043efe4497e9d665c6d0

piotrpio commented 11 months ago

Thank you for the PR. It's merged so I'll be closing the issue, fix will be included in the next release.