cloudevents / sdk-go

Go SDK for CloudEvents
https://cloudevents.github.io/sdk-go/
Apache License 2.0
834 stars 219 forks source link

Panic on Kafka IP change #1028

Open canni opened 7 months ago

canni commented 7 months ago

We're experiencing a panic due to send on a closed channel here: https://github.com/cloudevents/sdk-go/blob/release-v2.15.2/protocol/kafka_sarama/v2/receiver.go#L80

I've traced this to the fact that the incoming channel is closed in SDK managed goroutine here: https://github.com/cloudevents/sdk-go/blob/release-v2.15.2/protocol/kafka_sarama/v2/receiver.go#L182

But the ConsumeClaim is running in sarama's internal goroutine, and it's session context is unrelated. This creates situaction where closing the receiver prematurely closes the incoming channel used by ConsumeClaim

This panic is crashing the whole go process, as wrapping StartReceiver with recover() does not cover that goroutine.

embano1 commented 7 months ago

Hi @canni thx for filing: quick question -> are you running Confluent Kafka or another environment? We have a PR ready for Confluent Kafka support where we tried to make sure to avoid races when closing. Need to look closer into the sarama implementation or do you want to file a PR?

embano1 commented 7 months ago

@yanmxa FYI, that's why I was so picky during the PR review :)

canni commented 7 months ago

Yes this is when we connect to cluster using confluent docker containers

embano1 commented 7 months ago

Would you be open to switch and try our upcoming kafka_confluent binding?

canni commented 7 months ago

I don't know if we can find time to get around to testing that before Easter break, I'll report after the trial run

embano1 commented 7 months ago

No worries, #988 merge still pending (though close). Then we need to cut a new release so you can easily switch. Appreciate the support.

embano1 commented 7 months ago

@canni https://github.com/cloudevents/sdk-go/pull/988 is merged