Design decisions for this issue are as follows:
- This will allow us to use logic inside the pod to gracefully shut down the ConsumerGroups and the Producer.
- For example, if the only value changed is part of the Consumer section of the sarama.Config struct, the Producer should not be restarted. This may require keeping a local copy of the old sarama.Config and comparing the structs of interest (see the comparison sketch after this list).
- Some logic will obviously be needed on a per-component basis, but the main code that watches and responds to our configmap changes should live in "common" to the extent that it can.
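A minimal sketch of that comparison, assuming we hold on to the previously-applied sarama.Config; the helper name and the specific fields compared here are illustrative, not an existing API:

```go
import "github.com/Shopify/sarama"

// producerSettingsChanged is a hypothetical helper that reports whether the
// Producer-relevant portions of the configuration differ between the
// previously-applied config and the newly-parsed one. A Consumer-only change
// would return false, so the Producer could be left running.
func producerSettingsChanged(oldCfg, newCfg *sarama.Config) bool {
	if oldCfg == nil {
		return true // nothing to compare against; treat as changed
	}
	// Compare individual fields of interest rather than reflect.DeepEqual on the
	// whole Producer section, which contains function-typed fields (e.g. the
	// Partitioner constructor) that never compare as equal.
	return oldCfg.Producer.RequiredAcks != newCfg.Producer.RequiredAcks ||
		oldCfg.Producer.Idempotent != newCfg.Producer.Idempotent ||
		oldCfg.Producer.Return.Successes != newCfg.Producer.Return.Successes ||
		oldCfg.Net.SASL.Enable != newCfg.Net.SASL.Enable
}
```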
/assign @eric-sap
Currently the Sarama configuration is set (in pkg/common/kafka/util/util.go) via some parameters and constants. We would like to use a configmap for these settings where possible and reasonable. The eventing-contrib/kafka implementation already has support for such a map, though with minimal content. We'd like to reuse whatever makes sense (such as map watching) but with our own defined content and parsing. This configmap may also include other custom eventing-kafka configuration (i.e. it will not be Sarama-specific).
Details
Watching the Configmap
This is what eventing-contrib/kafka does, which seems like a good starting point for eventing-kafka work toward a similar end:
kafka/channel/pkg/reconciler/controller/controller.go::NewController uses the configmap.Watcher passed in from sharedmain to call cmw.Watch("config-kafka", updateKafkaConfig), where updateKafkaConfig is a function that loads the configmap data and replaces Reconciler.kafkaConfig with the config parsed out of the configmap (see the sketch after this list).
The configmap parser itself uses code from knative.dev/pkg/configmap/parse.go that might be useful for our purposes as well (it provides a common way to parse out durations, ints, floats, etc.).
If we don't want to switch to using sharedmain, we will need to create the watcher ourselves via sharedmain.SetupConfigMapWatchOrDie, which is what the pkg/common/k8s/observability.go code already does (it is not clear whether there is a benefit to sharing a single watcher).
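A rough sketch of the equivalent wiring in eventing-kafka, assuming the same "config-kafka" name and a "sarama" data key; the Reconciler stand-in and mergeSaramaSettings (sketched further below) are hypothetical placeholders:

```go
import (
	"github.com/Shopify/sarama"
	"go.uber.org/zap"
	corev1 "k8s.io/api/core/v1"
	"knative.dev/pkg/configmap"
)

// Reconciler is an illustrative stand-in for the channel reconciler that
// holds the currently-applied Sarama configuration.
type Reconciler struct {
	logger       *zap.Logger
	saramaConfig *sarama.Config
}

// watchKafkaConfigMap registers an observer on the shared configmap.Watcher so
// the reconciler is notified whenever the "config-kafka" configmap changes.
func watchKafkaConfigMap(cmw configmap.Watcher, r *Reconciler) {
	cmw.Watch("config-kafka", func(cm *corev1.ConfigMap) {
		// mergeSaramaSettings is the hypothetical merge helper sketched further below;
		// simple non-Sarama keys could use the knative.dev/pkg/configmap parse helpers.
		newConfig, err := mergeSaramaSettings(cm.Data["sarama"])
		if err != nil {
			r.logger.Error("Failed to parse config-kafka; keeping previous config", zap.Error(err))
			return
		}
		r.saramaConfig = newConfig
	})
}
```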
Loading Sarama Settings from the Configmap
Whether we need the huge block of all of the sarama settings or not, putting them as JSON into a configmap is a relatively ordinary procedure and results in something that looks like this:
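For instance, a hypothetical config-kafka ConfigMap might look like the following (the key names, values, and the separate eventing-kafka key are illustrative, not a final schema; durations are plain nanosecond values so they unmarshal directly into time.Duration fields):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-kafka
  namespace: knative-eventing
data:
  # JSON overlay applied on top of sarama.NewConfig() defaults
  sarama: |
    {
      "Admin": { "Timeout": 10000000000 },
      "Net": {
        "KeepAlive": 30000000000,
        "SASL": { "Enable": true, "User": "my-sasl-user" },
        "TLS": { "Enable": true }
      },
      "Metadata": { "RefreshFrequency": 300000000000 },
      "Consumer": {
        "Offsets": { "Retention": 604800000000000 },
        "Return": { "Errors": true }
      },
      "Producer": {
        "RequiredAcks": -1,
        "Return": { "Successes": true }
      }
    }
  # Non-Sarama eventing-kafka settings can live alongside the sarama block
  eventing-kafka: |
    {
      "receiver":   { "cpuRequest": "100m", "memoryRequest": "50Mi" },
      "dispatcher": { "cpuRequest": "300m", "memoryRequest": "50Mi" }
    }
```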
Of course, the point here is that you can omit any entries from the "sarama" JSON that you want to leave as the defaults, so you could just have something that looks considerably simpler:
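Again purely illustrative, overriding only a couple of values and leaving everything else at the Sarama defaults:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-kafka
  namespace: knative-eventing
data:
  sarama: |
    {
      "Net": { "TLS": { "Enable": true } },
      "Producer": { "RequiredAcks": -1 }
    }
```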
Merging the JSON with the default sarama.Config is also not complex:
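A sketch of that merge, assuming the "sarama" key is unmarshalled directly over a default config; fields that are not JSON-friendly (such as Version) would need special handling and are omitted from the examples above:

```go
import (
	"encoding/json"
	"fmt"

	"github.com/Shopify/sarama"
)

// mergeSaramaSettings starts from Sarama's defaults and overlays whatever
// subset of fields the administrator specified in the configmap's "sarama" key.
func mergeSaramaSettings(saramaJSON string) (*sarama.Config, error) {
	config := sarama.NewConfig() // all defaults
	if saramaJSON == "" {
		return config, nil // nothing to merge; use the defaults as-is
	}
	// Unmarshal only overwrites the fields present in the JSON;
	// every omitted field keeps its default value.
	if err := json.Unmarshal([]byte(saramaJSON), config); err != nil {
		return nil, fmt.Errorf("failed to merge sarama settings: %w", err)
	}
	return config, nil
}
```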
If we need to save our current settings back to a configmap, that gets complicated (and will likely require our own in-sync copy of the sarama.Config struct with custom tags to ignore some JSON-unfriendly fields). This is possible and has been POC'd but will hopefully not be necessary for this issue (writing to the configmap will only be done manually when an administrator wants to make changes).
Implementing Config Changes
Regarding the procedures necessary to rebuild/restart Sarama infrastructure when the configmap changes, there are three main components of note:
The ReconcileKind function calls SetKafkaAdminClient(), which calls kafkaadmin.CreateAdminClient() every time, so we don't have an "old admin client with old settings" to worry about for the moment. This will change if we decide to keep the admin client around in the future but for now this appears to be a non-issue (the only change would be to load the settings from whatever the current configmap is instead of using constants and/or environment variables).
The implementation for the ConsumerGroup is all in one place, but there could be several pods, which would need to be reconfigured (or recreated). The UpdateSubscriptions() function loops through the subscribers and calls CreateConsumerGroup whenever it finds no subscriber wrapper for a particular subscriber UID. The ConsumerGroup goes into the SubscriberWrapper and remains as part of the DispatcherImpl (created in the dispatcher's main.go and remaining for the life of the pod). If the ConfigMap watcher is only going to run on the controller, it might be easiest to restart all of the dispatcher pods to pick up any new configuration data (the controller could first parse the configmap and see whether the changes are actually relevant to the ConsumerGroup, if desired).
The risk of trouble from restarting dispatcher pods is low, since the critical data is all stored in a Kafka topic rather than in the eventing-kafka pods themselves, but unnecessary pod churn is, of course, best avoided. We could easily start with just having the controller restart the dispatcher pods when the configmap changes (which will presumably not happen particularly often), as sketched below. Adjusting a working system is likely easier than trying to work through all of the potential logic for changing the ConsumerGroups on the fly (either by having the dispatchers also watch the configmap or by having a different IPC mechanism from the controller).
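A sketch of the "just restart the dispatcher pods" option, using the common technique of bumping a pod-template annotation so the Deployment controller rolls new pods that read the updated configmap on startup (the deployment name/namespace parameters and the annotation key are illustrative):

```go
import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// restartDispatcherDeployment patches an annotation on the dispatcher
// Deployment's pod template, triggering a rolling restart of its pods.
func restartDispatcherDeployment(ctx context.Context, k8sClient kubernetes.Interface, namespace, name string) error {
	patch := []byte(fmt.Sprintf(
		`{"spec":{"template":{"metadata":{"annotations":{"eventing-kafka/configmap-updated":"%d"}}}}}`,
		time.Now().UnixNano()))
	_, err := k8sClient.AppsV1().Deployments(namespace).Patch(
		ctx, name, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```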
The Producer is similar to the ConsumerGroup in that it is created once (in the NewProducer() function called by the channel's main.go code) and then used repeatedly in handleMessage(), which is called whenever a message is received (due to the eventingchannel.NewMessageReceiver() call in the channel's main.go).
If we handle configmap changes by restarting the producer pod, we must be certain that we do not encounter any situation in which an event has been received (i.e. handleMessage has been called) and a non-error returned, but the message not persisted in the Kafka topic.
I don't see any real way for this to happen, as ProduceKafkaMessage waits for the delivery report via the SyncProducer's SendMessage function before giving handleMessage the go-ahead to return no error. If the process is terminated anywhere in that execution path, the caller of handleMessage (the ServeHTTP function in eventing's message_receiver.go file) will get an error (network or otherwise) that will be passed back to the caller (that is, the customer). Errors are acceptable for a brief time during reconfiguration; if less downtime is desired, the producer could watch the configmap directly and reconfigure the SyncProducer itself instead of letting the controller bounce it (sketched below).
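If that direct-watch approach is ever needed, the swap could look roughly like this (a sketch; the Producer struct and its locking model are assumptions, not the current implementation):

```go
import (
	"sync"

	"github.com/Shopify/sarama"
)

// Producer is an illustrative stand-in for the channel's producer component.
type Producer struct {
	mutex    sync.RWMutex
	brokers  []string
	producer sarama.SyncProducer
}

// reconfigure builds a new SyncProducer from the newly-merged Sarama config and
// swaps it in, closing the old one. Senders would hold the read lock around
// SendMessage so in-flight events complete against a valid producer.
func (p *Producer) reconfigure(config *sarama.Config) error {
	// Note: sarama requires Producer.Return.Successes=true for a SyncProducer.
	newProducer, err := sarama.NewSyncProducer(p.brokers, config)
	if err != nil {
		return err // keep the existing producer if the new config is unusable
	}
	p.mutex.Lock()
	defer p.mutex.Unlock()
	oldProducer := p.producer
	p.producer = newProducer
	if oldProducer != nil {
		_ = oldProducer.Close() // must be called so the old producer shuts down cleanly
	}
	return nil
}
```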
As with the ConsumerGroup, it is probably best to start with naïvely restarting the pods and see whether the required performance during configuration changes demands a modification to that approach.