kafka-ops / julie

A solution to help you build automation and gitops in your Apache Kafka deployments. The Kafka gitops!
MIT License
419 stars 114 forks source link

Wrong ACL for default consumer group #111

Closed akselh closed 3 years ago

akselh commented 3 years ago

We have a problem with our consumer client failing to read data from a topic when ACL are produced by KTB 1.0.0-rc2 (or using master). After investigation we found that the ACL for the consumer group, when no specific group is given in topology, is wrong. This ACL is produced:

'GROUP', '*', '*', 'READ', 'User:someuser', 'PREFIXED'

It should have been:

'GROUP', '*', '*', 'READ', 'User:someuser', 'LITERAL'

The problem seems to be here: https://github.com/kafka-ops/kafka-topology-builder/commit/0cbc3e5b4af8c45c3cbb5a79497c83d74b4bfe37#diff-79dd00e40b697136c305f5c944c3cc75689819a4a27fce81d781e774eae29c8aR235

purbon commented 3 years ago

@akselh quick question to clarify. why do you think it should be literal? with literal it will look for group named *, however with prefix it will actually accept all groups.

purbon commented 3 years ago

From confluent docs:

Prefixed ACLs
You can specify ACL resources using either a LITERAL value (default), PREFIXED pattern type, or wildcard (*), which allows all.

If you identify the resource using LITERAL, Kafka will try to match the full resource name (for example, topic or consumer group) with the resource specified in the ACL. In some cases you may want to use an asterisk (*) to specify all resources.

If you identify the resource using PREFIXED, Kafka will try to match the prefix of the resource name with the resource specified in ACL.

ref: https://docs.confluent.io/current/kafka/authorization.html#prefixed-acls

I am I missing anything?

sverrehu commented 3 years ago

From ResourcePatternFilter.java in the Kafka source:

        switch (pattern.patternType()) {
            case LITERAL:
                return name.equals(pattern.name()) || pattern.name().equals(WILDCARD_RESOURCE);

            case PREFIXED:
                return name.startsWith(pattern.name());

            default:
                throw new IllegalArgumentException("Unsupported PatternType: " + pattern.patternType());
        }

So it appears (at least in this class) that the wildcard will only be matched as a wildcard if the pattern is LITERAL.

Also, in PatternType.java, the use of wildcard is described as part of the LITERAL javadoc:

    /**
     * A literal resource name.
     *
     * A literal name defines the full name of a resource, e.g. topic with name 'foo', or group with name 'bob'.
     *
     * The special wildcard character {@code *} can be used to represent a resource with any name.
     */
    LITERAL((byte) 3),
akselh commented 3 years ago

If you identify the resource using LITERAL, Kafka will try to match the full resource name (for example, topic or consumer group) with the resource specified in the ACL. In some cases *_you may want to use an asterisk () to specify all resources_**.

Ref your quote from docs: It says exactly the same as I am reporting. You need LITERAL for asterisk/any resource.

akselh commented 3 years ago

@akselh quick question to clarify. why do you think it should be literal? with literal it will look for group named *, however with prefix it will actually accept all groups.

I think it should be literal because:

  1. I tested it with a real Kafka cluser. Consumer fails with AuthorizationException when ACL for consumer has this ACL: 'GROUP', '*', '*', 'READ', 'User:someuser', 'PREFIXED'. When I manually add the ACL 'GROUP', '*', '*', 'READ', 'User:someuser', 'LITERAL' the client test succeeds.
  2. Ref the change done on the 27`th of August. The consumer group ACL was literal/* before this change: https://github.com/kafka-ops/kafka-topology-builder/pull/64 (also ref my link to the change above...)

This also makes me think that I would be good with an acceptance test suite for KTB; running KTB against a real Kakfa cluster and have tests for producer and consumer etc.

purbon commented 3 years ago

@akselh and @sverrehu thanks a lot for your comments and test! I was finally available to test this. Shame on me for introducing this error. Should be reverted back in the upcoming minutes in master. To be released as soon as I cut the 1.0

purbon commented 3 years ago

@akselh re tests, there is a small integration layer, but does not implement any consumer and producer logic. Would be fore sure nice to add it.

purbon commented 3 years ago

should be closed by d5df5f57a4ec3672396c9763ba65e781b6011b85