IBMStreams / streamsx.kafka

Repository for integration with Apache Kafka
https://ibmstreams.github.io/streamsx.kafka/
Apache License 2.0

Consumer: Enable group management also when no group-ID is given #183

Closed ghost closed 4 years ago

ghost commented 4 years ago

The Kafka consumer should always subscribe to the given topics rather than assign itself topic partitions, unless self-assignment is mandatory.

A) When the user specifies neither a group identifier nor the partitions to consume, the consumer operator creates a unique group identifier for the operator (as the Kafka API requires one), but self-assigns all partitions of the specified topic(s).

B) When the user specifies a unique group identifier for a single consumer, the consumer subscribes and benefits from group management (Kafka assigns the partitions).

The main difference between A) and B) is that when the number of partitions changes, A) needs a PE relaunch to also consume the new partitions, whereas the operator in B) automatically consumes the new topic partitions once Kafka detects the changed topic metadata.
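To illustrate the difference, here is a small toy model in plain Java. It does not use the Kafka client at all; the class and method names (`PartitionGrowthSketch`, `SelfAssigned`, `Subscribed`, `onRebalance`) are hypothetical and only mimic the behavior described in A) and B), assuming a topic that grows from two to three partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: no Kafka client is used, and every name here is hypothetical.
final class PartitionGrowthSketch {

    // Case A: the partition list is captured once at launch and never refreshed.
    static final class SelfAssigned {
        private final List<Integer> assigned;

        SelfAssigned(List<Integer> partitionsAtLaunch) {
            this.assigned = Collections.unmodifiableList(new ArrayList<>(partitionsAtLaunch));
        }

        List<Integer> consumed() {
            return assigned;
        }
    }

    // Case B: group management delivers a fresh assignment after a metadata change.
    static final class Subscribed {
        private List<Integer> assigned;

        Subscribed(List<Integer> partitionsAtLaunch) {
            this.assigned = Collections.unmodifiableList(new ArrayList<>(partitionsAtLaunch));
        }

        // Stands in for the rebalance that follows a change of topic metadata.
        void onRebalance(List<Integer> currentPartitions) {
            this.assigned = Collections.unmodifiableList(new ArrayList<>(currentPartitions));
        }

        List<Integer> consumed() {
            return assigned;
        }
    }

    public static void main(String[] args) {
        List<Integer> atLaunch = Arrays.asList(0, 1);
        List<Integer> afterGrowth = Arrays.asList(0, 1, 2);

        SelfAssigned a = new SelfAssigned(atLaunch);
        Subscribed b = new Subscribed(atLaunch);
        b.onRebalance(afterGrowth); // only the subscribed consumer is told about the growth

        System.out.println("A consumes " + a.consumed());
        System.out.println("B consumes " + b.consumed());
    }
}
```

In this model the self-assigned consumer keeps its launch-time partition list until a new instance is created, which corresponds to the PE relaunch in A).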

Summary: The Consumer operator shall always subscribe as long as it is not mandatory that the operator self-assigns the partitions to consume.
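The proposed rule could be sketched roughly as follows. This is not the operator's actual code: `SubscriptionModeSketch`, `chooseMode`, and `effectiveGroupId` are hypothetical names, the conditions that make self-assignment mandatory are reduced to a single caller-supplied flag, and the generated group identifier format is an assumption.

```java
import java.util.Optional;
import java.util.UUID;

// Sketch of the proposed decision rule; all names are hypothetical.
final class SubscriptionModeSketch {

    enum Mode { SUBSCRIBE, ASSIGN }

    // Subscribe by default; self-assign only when the caller says it is mandatory.
    static Mode chooseMode(boolean selfAssignmentMandatory) {
        return selfAssignmentMandatory ? Mode.ASSIGN : Mode.SUBSCRIBE;
    }

    // Use the user's group identifier if present, otherwise generate a unique one
    // (assumed format), since the Kafka API needs a group identifier to subscribe.
    static String effectiveGroupId(Optional<String> userGroupId) {
        return userGroupId
                .filter(id -> !id.isEmpty())
                .orElseGet(() -> "generated-" + UUID.randomUUID());
    }

    public static void main(String[] args) {
        // No group identifier given and self-assignment not mandatory: subscribe anyway.
        System.out.println(chooseMode(false) + " with group " + effectiveGroupId(Optional.empty()));
        // User-provided group identifier: unchanged behavior, the consumer subscribes.
        System.out.println(chooseMode(false) + " with group " + effectiveGroupId(Optional.of("myGroup")));
    }
}
```

With this rule, case A) behaves like case B): the generated group identifier is used for a real subscription instead of only satisfying the API.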

Self-assignment is mandatory in these cases:

This is a behavioral change, so it needs to be implemented in the next major version.

ghost commented 4 years ago

Resolved in release 3.0.0.