Open im-konge opened 3 months ago
Triaged on the Community call on 8.8.2024: @im-konge will prepare a summary of what Strimzi parts might be affected by this and how.
These parts are (in my opinion and to my knowledge) affected by this issue:
- CruiseControl -> IIRC CC is sending messages to some internal topic to generate the model for rebalancing. When a user sets the produce and fetch quotas, CC can be affected as well.
So, what do we consider the minimal produce / fetch limit for Cruise Control to work?
- I think that other components like MM2 or Connect/Connector can be affected as well, when we set the default quotas for produce and fetch.
I do not think we care. The user deploys them separately.
- I'm not sure whether the User Operator is affected, as the quotas should not (but maybe I'm wrong) apply to creating users and managing additional quotas.
It manages SCRAM-SHA users, ACLs and quotas. Does the mutation rate apply to that as well? Or is it only topics?
From what I read, it is only topics.
So, what do we consider the minimal produce / fetch limit for Cruise Control to work?
I don't know .. @kyguy do you have an idea?
Discussed on the community call on 5.9.2024: This should be documented as a warning for the users. We should make it clear:
That Cruise Control and Cruise Control metrics reporter need some minimal values to work properly (@kyguy will try to provide some values needed by Cruise Control)
Sorry I dropped the ball on this, let me do some calculations and provide an estimate for this tomorrow
Apologies for the delay, here are some minimal produce/fetch limits (producer_byte_rate/consumer_byte_rate) for Cruise Control producer/consumers that should suffice for small clusters with default Cruise Control configurations:
- CruiseControlMetricsReporter (producer) - 1 kB/s
- KafkaCruiseControlSampleStoreProducer (producer) - 1 kB/s
- CruiseControlMetricsReporterSampler-consumer (consumer) - 1 kB/s
When the default kafka quotas plugin is configured inside the .spec.kafka.quotas section of the Kafka resource, the quotas are applied to all users as default quotas. That means they also apply to the internal users, which can hit them. For example, when we set the controller mutation rate quota, the Topic Operator can hit it during some of its operations.
For the strimzi quotas plugin type, this is handled using the "excluded principals" option of the plugin, where we add the internal users together with those specified inside the .spec.kafka.quotas section of the Kafka resource, so they are all excluded from the quotas.
But for the default Kafka quotas plugin, there is no such option that we can use.
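The difference between the two plugin types can be sketched as a simplified model. This is not how either plugin is implemented; it only restates the behavior described above. The principal names and the function name are hypothetical.

```python
# Hypothetical principal names, used only for illustration.
INTERNAL_USERS = frozenset({"User:example-topic-operator", "User:example-cruise-control"})


def default_quotas_apply(principal, plugin_type, excluded_principals=frozenset()):
    """Return True if the default quotas would apply to `principal`.

    Simplified model: the "kafka" plugin type applies the defaults to every
    user, while the "strimzi" type skips any principal in the excluded set.
    """
    if plugin_type == "kafka":
        return True
    if plugin_type == "strimzi":
        return principal not in excluded_principals
    raise ValueError(f"unknown quotas plugin type: {plugin_type}")


if __name__ == "__main__":
    # With the strimzi type, internal users are added to the user-specified list.
    excluded = INTERNAL_USERS | {"User:example-admin"}
    print(default_quotas_apply("User:example-topic-operator", "kafka"))
    print(default_quotas_apply("User:example-topic-operator", "strimzi", excluded))
    print(default_quotas_apply("User:example-app", "strimzi", excluded))
```

In this model the internal Topic Operator user is limited under the kafka type but not under the strimzi type, which is the gap this issue describes.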
To solve this, we can set the quotas to null values for the internal users when the default quotas are configured in the Kafka resource. However, this is not that easy, as the User Operator will remove the quotas once they are created. Also, the information about the internal users would be accessible via the Kafka Admin API. This would not be trivial, and it would require a proposal covering the whole approach and all the components that would need changes (Cluster Operator, User Operator, ...).
Another option is to describe this behavior in our documentation, as it may be desired to limit the internal users as well. This would be the simplest way, but on the other hand it can cause issues, for example when someone would like to limit all other users while keeping the TO and other components and their users without limitations.
We should discuss how to proceed with this or if there are other options that we should take into account.