lsst-sqre / strimzi-registry-operator

A Kubernetes Operator for running the Confluent Schema Registry with a Strimzi-based Kafka cluster
MIT License

Construction of `client_secret` fails for `KafkaUser` configured with SASL/SCRAM #83

Open Infinnerty opened 1 year ago

Infinnerty commented 1 year ago

Hi, I'm having trouble deploying to a cluster with scram-sha-512 authentication.

It appears your code tries to construct a client_secret using the secret generated for a user.

[2022-11-18 18:04:15,550] kopf.objects         [INFO    ] [kafka-development/confluent-schema-registry] Retrieved cluster CA certificate
[2022-11-18 18:04:15,552] kopf.objects         [INFO    ] [kafka-development/confluent-schema-registry] Cluster CA certificate version: 11917675
[2022-11-18 18:04:15,566] kopf.objects         [INFO    ] [kafka-development/confluent-schema-registry] Retrieved cluster CA certificate
[2022-11-18 18:04:15,569] kopf.objects         [INFO    ] [kafka-development/confluent-schema-registry] Client certification version: 12627686
[2022-11-18 18:04:15,570] kopf.objects         [ERROR   ] [kafka-development/confluent-schema-registry] Handler 'create_registry' failed with an exception. Will retry.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 279, in execute_handler_once
    result = await invoke_handler(
  File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 374, in invoke_handler
    result = await invocation.invoke(
  File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/invocation.py", line 139, in invoke
    await asyncio.shield(future)  # slightly expensive: creates tasks
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/handlers/createregistry.py", line 136, in create_registry
    secret = create_secret(
  File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/certprocessor.py", line 82, in create_secret
    client_ca_cert = decode_secret_field(client_secret["data"]["ca.crt"])
KeyError: 'ca.crt'

This secret does not contain a ca.crt, user.crt, or user.key field. Decoded, it looks like this:

{
  "password": "blah13434567",
  "sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"confluent-schema-registry\" password=\"blah13434567\";"
}

And is created using:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: confluent-schema-registry
  namespace: kafka-${var.environment}
  labels:
    strimzi.io/cluster: data-cloud
spec:
  authentication:
    type: scram-sha-512

I think the code failing is here:

    if client_secret is None:
        client_secret = get_secret(
            namespace=namespace, name=kafka_username, k8s_client=k8s_client
        )
        logger.info("Retrieved cluster CA certificate")
    client_secret_version = client_secret["metadata"]["resourceVersion"]
    logger.info(f"Client certification version: {client_secret_version}")
    client_ca_cert = decode_secret_field(client_secret["data"]["ca.crt"])
    client_cert = decode_secret_field(client_secret["data"]["user.crt"])
    client_key = decode_secret_field(client_secret["data"]["user.key"])
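
One way the KeyError might be avoided is to check which keys the KafkaUser secret actually contains before decoding any of them. The following is a minimal sketch under assumptions, not the operator's implementation: `extract_client_credentials`, `TlsCredentials`, and `ScramCredentials` are hypothetical names, and the `decode_secret_field` shown here is a stand-in that assumes each field is base64-encoded UTF-8, as Kubernetes Secret data normally is.

```python
import base64
from dataclasses import dataclass
from typing import Any, Mapping, Union


def decode_secret_field(value: str) -> str:
    """Stand-in for the operator's helper (assumed: base64-encoded UTF-8)."""
    return base64.b64decode(value).decode("utf-8")


@dataclass
class TlsCredentials:
    """Fields found in a KafkaUser secret that uses TLS authentication."""

    ca_cert: str
    user_cert: str
    user_key: str


@dataclass
class ScramCredentials:
    """Fields found in a KafkaUser secret that uses scram-sha-512."""

    password: str
    jaas_config: str


TLS_KEYS = ("ca.crt", "user.crt", "user.key")
SCRAM_KEYS = ("password", "sasl.jaas.config")


def extract_client_credentials(
    client_secret: Mapping[str, Any],
) -> Union[TlsCredentials, ScramCredentials]:
    """Return TLS or SCRAM credentials depending on the secret's keys."""
    data = client_secret.get("data") or {}
    if all(key in data for key in TLS_KEYS):
        return TlsCredentials(
            ca_cert=decode_secret_field(data["ca.crt"]),
            user_cert=decode_secret_field(data["user.crt"]),
            user_key=decode_secret_field(data["user.key"]),
        )
    if all(key in data for key in SCRAM_KEYS):
        return ScramCredentials(
            password=decode_secret_field(data["password"]),
            jaas_config=decode_secret_field(data["sasl.jaas.config"]),
        )
    # Fail with a descriptive message instead of a bare KeyError.
    raise ValueError(
        "KafkaUser secret has neither TLS nor SCRAM fields; "
        f"found keys: {sorted(data)}"
    )
```

With a helper like this, the caller could skip keystore and truststore generation when it receives `ScramCredentials` and configure SASL instead.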

Is this approach supported? I'm not using TLS on the listener I'd like the schema registry to use:

apiVersion: roundtable.lsst.codes/v1beta1
kind: StrimziSchemaRegistry
metadata:
  name: confluent-schema-registry
  namespace: kafka-${var.environment}
spec:
  strimziVersion: v1beta2
  listener: plain
  securityProtocol: SASL_PLAINTEXT
  compatibilityLevel: forward
  registryImage: confluentinc/cp-schema-registry
  registryImageTag: "7.2.1"
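
For a SASL_PLAINTEXT listener like this one, the Schema Registry container would presumably need SASL settings rather than keystore and truststore paths. The sketch below assumes the Confluent image's convention of mapping `SCHEMA_REGISTRY_`-prefixed environment variables onto properties such as `kafkastore.sasl.mechanism`. `build_sasl_env` is a hypothetical helper and does not describe what the operator currently does.

```python
from typing import Any, Dict, List


def build_sasl_env(
    secret_name: str,
    security_protocol: str = "SASL_PLAINTEXT",
    sasl_mechanism: str = "SCRAM-SHA-512",
) -> List[Dict[str, Any]]:
    """Build Kubernetes EnvVar dicts for a SASL/SCRAM Kafka connection.

    The JAAS config is referenced from the KafkaUser secret, so the
    password never appears in the Deployment manifest itself.
    """
    return [
        {
            "name": "SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL",
            "value": security_protocol,
        },
        {
            "name": "SCHEMA_REGISTRY_KAFKASTORE_SASL_MECHANISM",
            "value": sasl_mechanism,
        },
        {
            "name": "SCHEMA_REGISTRY_KAFKASTORE_SASL_JAAS_CONFIG",
            "valueFrom": {
                "secretKeyRef": {
                    "name": secret_name,
                    "key": "sasl.jaas.config",
                }
            },
        },
    ]
```

Calling `build_sasl_env("confluent-schema-registry")` would yield three entries that could be appended to the container's `env` list, with the secret name matching the KafkaUser above.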

jonathansick commented 1 year ago

Thanks for the report. I don't think anyone has used SASL/SCRAM with the registry before. Sorry we missed this originally; it makes sense to support it. We're in the middle of a few other things at Rubin Observatory, so I'm not sure I can jump on this immediately. PRs would be accepted; otherwise, we'll try to get to it in the next few weeks.