segmentio / kafka-go

Kafka library in Go
MIT License
7.44k stars, 771 forks

Handle Rotating TLS/SSL Certificates without Restart #1100

Open jkratz55 opened 1 year ago

jkratz55 commented 1 year ago

This is more of a discussion/question than an issue. We are looking for a solution to be able to rotate our certificates for Kafka without restarting the application. We have so many microservices, most of which aren't using consumer groups for legacy architecture reasons, so it's an operational nightmare for us.

I haven't taken a deep dive into the code, but since the kafka.Dialer type holds a pointer to a tls.Config, I was thinking about using fsnotify to watch the certificate files on the filesystem and, if a change is detected, update the tls.Config object the Dialer has a handle to. Then in theory (totally just guessing), when a new connection is established it will do so with the updated tls.Config.

But before even attempting to go down this road, I was curious if there is any support in the library or recommendations from the maintainers.

dmarkhas commented 1 year ago

AFAIK this is a long-standing issue in the Kafka ecosystem (in all client languages).

I would suggest that instead of coding the rotation logic in the library, it should be externalized so the user can decide when and how to refresh the certificate, perhaps by placing a message on a channel.

On the other hand I do think this is a potentially risky behavior - to rotate the certificate you would need to close the connection, replace the dialer and reconnect, which could lead to data loss in the client as well as a "rebalance storm" if dozens of clients rotate their certificate simultaneously.

petedannemann commented 1 year ago

Hi @jkratz55, your proposal seems reasonable. I think after the certificate is rotated you'd need to close the Conn opened from the Dialer and then open a new connection. Externalizing this outside of the library as @dmarkhas suggested seems like the best way forward since this is fairly opinionated.