redpanda-data / kminion

KMinion is a feature-rich Prometheus exporter for Apache Kafka written in Go. It is lightweight and highly configurable so that it will meet your requirements.
MIT License

Kafka permission for kminion #188

Closed duyhieuvo closed 6 months ago

duyhieuvo commented 1 year ago

Hello,

Could you help clarify the right set of permissions for KMinion on a secured Kafka cluster? We run Confluent Platform and manage permissions with Confluent's predefined roles (https://docs.confluent.io/platform/current/security/rbac/rbac-predefined-roles.html#role-based-access-control-predefined-roles). KMinion did not work for us in either mode, even after we tried different sets of permissions:

In AdminApi mode: after inspecting the Kafka logs, we saw that KMinion tried to describe the consumer groups and the cluster, so we gave it the following permission:

but then the KMinion pod crashed with the following logs:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7de84d]

goroutine 2447 [running]:
github.com/twmb/franz-go/pkg/kgo.(describeGroupsSharder).shard(0x400?, {0xcf0750?, 0xc0001b2c80?}, {0xcf4828?, 0xc00040a9f0}, {0x7f5da46d4fff?, 0x429ea5?})
    /go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:2544 +0xa2d
github.com/twmb/franz-go/pkg/kgo.(Client).handleShardedReq.func2({0x0, {0xcf4828, 0xc00040a9f0}, {0x0, 0x0}})
    /go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:1730 +0x151
github.com/twmb/franz-go/pkg/kgo.(Client).handleShardedReq(0xc0002ba000, {0xcf0750?, 0xc0001b2c80}, {0xcf4828?, 0xc00040a9f0})
    /go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:1821 +0x9f9
github.com/twmb/franz-go/pkg/kgo.(Client).shardedRequest(0xc0002ba000, {0xcf07f8?, 0xc000311410?}, {0xcf4828?, 0xc00040a9f0})
    /go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:978 +0x691
github.com/twmb/franz-go/pkg/kgo.(Client).RequestSharded(0xc0004b2000?, {0xcf07f8?, 0xc000311410?}, {0xcf4828?, 0xc00040a9f0?})
    /go/pkg/mod/github.com/twmb/franz-go@v1.10.0/pkg/kgo/client.go:909 +0x3a
github.com/cloudhut/kminion/v2/minion.(Service).DescribeConsumerGroups(0xc0004b2000, {0xcf07f8, 0xc000311410})
    /app/minion/describe_consumer_groups.go:70 +0x15b
github.com/cloudhut/kminion/v2/prometheus.(Exporter).collectConsumerGroups(0xc000540000, {0xcf07f8?, 0xc000311410?}, 0xcea010?)
    /app/prometheus/collect_consumer_groups.go:18 +0x5e
github.com/cloudhut/kminion/v2/prometheus.(Exporter).Collect(0xc000540000, 0xc00022a760?)
    /app/prometheus/exporter.go:235 +0x205
github.com/prometheus/client_golang/prometheus.(Registry).Gather.func1()
    /go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:456 +0x10d
created by github.com/prometheus/client_golang/prometheus.(Registry).Gather
    /go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:548 +0xbac

In OffsetTopics mode, we gave it permission to consume from and describe the configuration of the __consumer_offsets topic. However, the consumer group lag in the metrics appears to be incorrect: it always shows 0 lag even though some groups are lagging.

It would be nice to have a summary of the permissions KMinion requires on the Kafka cluster. Thank you.

TheMeier commented 1 year ago

For me these ACLs work:

ACLs for principal `User:kminion`
Current ACLs for resource `ResourcePattern(resourceType=CLUSTER, name=kafka-cluster, patternType=LITERAL)`: 
    (principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
    (principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW) 

Current ACLs for resource `ResourcePattern(resourceType=GROUP, name=*, patternType=LITERAL)`: 
    (principal=User:kminion, host=*, operation=READ, permissionType=ALLOW) 

Current ACLs for resource `ResourcePattern(resourceType=TOPIC, name=*, patternType=LITERAL)`: 
    (principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
    (principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW) 

Current ACLs for resource `ResourcePattern(resourceType=TOPIC, name=__consumer_offsets, patternType=LITERAL)`: 
    (principal=User:kminion, host=*, operation=READ, permissionType=ALLOW)
    (principal=User:kminion, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW)
    (principal=User:kminion, host=*, operation=DESCRIBE, permissionType=ALLOW) 

weeco commented 6 months ago

The required ACLs depend heavily on your KMinion configuration, so it is hard to provide general guidance. The panic you posted should never happen, but I believe it has already been fixed in franz-go.

I believe Confluent hides the consumer offsets topic in many of its product offerings, so that configuration may not be an option for you. In that case you must use the Kafka API scrape mode, which requires permission to run the DescribeGroups Kafka API command. That in turn requires the following ACLs:

For example, if you configure Console to also delete its previously created groups, you also need to add Delete on Group, scoped to your configured group prefix if you want to constrain it further.