confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

Connect Schema Registry to AWS MSK with IAM enabled. #1898

Open smasilamani-cfins opened 3 years ago

smasilamani-cfins commented 3 years ago

Hello

I created an MSK cluster with IAM enabled and am trying to run Schema Registry on AWS ECS Fargate, but I get the error below. It looks like the AWS jar file is not on the classpath. I tried copying the jar file directly into the /etc/share/java folder, and also into an aws-iam-auth folder containing only that jar. No matter what I do, I always get the error below. Please advise.

Here is the link for AWS client config : AWS MSK Client config

[2021-06-04 20:55:22,787] ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)

org.apache.kafka.common.config.ConfigException: Invalid value software.amazon.msk.auth.iam.IAMClientCallbackHandler for configuration sasl.client.callback.handler.class: Class software.amazon.msk.auth.iam.IAMClientCallbackHandler could not be found.
    at org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:757)
    at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:503)
    at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:496)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:108)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:142)
    at org.apache.kafka.clients.admin.AdminClientConfig.<init>(AdminClientConfig.java:233)
    at org.apache.kafka.clients.admin.Admin.create(Admin.java:65)
    at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1282)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:158)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:69)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:88)
    at io.confluent.rest.Application.configureHandler(Application.java:255)
    at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:227)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)

smasilamani-cfins commented 3 years ago

I was able to fix the error by building my own image and copying the files to the locations below. I need to do the same thing for Kafka Connect too, except that the third copy command should copy the file to the kafka folder under /usr/share/java. But I am wondering whether it is possible to add extra jars to the classpath of schema-registry/connect and other tools at run time, for example via an ENV variable on the docker run command, without extending the image.

FROM confluentinc/cp-schema-registry
USER root
COPY aws-msk-iam-auth-1.0.0 /usr/share/java
COPY aws-msk-iam-auth-1.0.0/*.jar /usr/share/java/cp-base-new
COPY aws-msk-iam-auth-1.0.0/*.jar /usr/share/java/schema-registry
USER appuser
CMD ["sh","-c","export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;export SCHEMA_REGISTRY_LISTENERS=http://$HOSTNAME:8081;/etc/confluent/docker/run"]

OneCricketeer commented 3 years ago

possible to add extra jars to the classpath of schema-registry/connect and other tools at run time

Not without overriding the entrypoint/command script to do so, but that's not recommended: you'd be making new requests every time the container started (for example, when an orchestrator reschedules it), rather than having the JARs cached in an image layer.

creed123 commented 3 years ago

Hi @saachinsiva, where is the configuration for passing the required client.properties to the Kafka properties?

OneCricketeer commented 3 years ago

The documentation shows you what properties to use

https://github.com/aws-samples/amazon-msk-client-authentication#to-use-a-client-with-tls-mutual-authentication-with-an-amazon-msk-cluster
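As a rough sketch of how such client properties end up in the Confluent image: the working configurations later in this thread pass them as environment variables of the form SCHEMA_REGISTRY_KAFKASTORE_..., with the property name upper-cased and dots turned into underscores. The helper name `to_sr_env` below is hypothetical and only illustrates that naming pattern; it deliberately ignores property names containing dashes or underscores.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Confluent image): map a Kafka client
# property name to the SCHEMA_REGISTRY_KAFKASTORE_* variable name used in
# this thread. Only upper-cases letters and turns '.' into '_'.
to_sr_env() {
  printf 'SCHEMA_REGISTRY_KAFKASTORE_%s\n' "$(printf '%s' "$1" | tr 'a-z.' 'A-Z_')"
}
```

For example, `to_sr_env sasl.mechanism` prints SCHEMA_REGISTRY_KAFKASTORE_SASL_MECHANISM, which is the variable that would carry the sasl.mechanism value.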

jsnb-devoted commented 2 years ago

I'm trying to run something similar except I'm hoping to run the schema registry in EKS and I'm using the cp-helm-charts repo to deploy. The Docker excerpt above was a good start but I still haven't been able to get it up and running.

I have a multi-stage Dockerfile that downloads aws-msk-iam-auth-1.0.0-all.jar from GitHub and copies it into the schema-registry image:

FROM alpine:latest as builder
WORKDIR /app/
RUN apk add --no-cache curl

# -f makes curl fail on an HTTP error instead of saving the error page as the jar
RUN curl -fsSL -o aws-msk-iam-auth-1.0.0-all.jar https://github.com/aws/aws-msk-iam-auth/releases/download/1.0.0/aws-msk-iam-auth-1.0.0-all.jar

FROM confluentinc/cp-schema-registry:7.0.1
USER root
COPY --from=builder /app/aws-msk-iam-auth-1.0.0-all.jar /usr/share/java
COPY --from=builder /app/aws-msk-iam-auth-1.0.0-all.jar /usr/share/java/cp-base-new
COPY --from=builder /app/aws-msk-iam-auth-1.0.0-all.jar /usr/share/java/schema-registry

USER appuser
CMD ["sh","-c","export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;export SCHEMA_REGISTRY_LISTENERS=http://$HOSTNAME:8081;/etc/confluent/docker/run"]

This is slightly different than what @saachinsiva detailed above. I'm not sure why they had COPY aws-msk-iam-auth-1.0.0/*.jar since I can only find the single jar file rather than a directory with a series of jar files.

Right off the bat I get errors about the slf4j configuration:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (io.confluent.admin.utils.cli.KafkaReadyCommand).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

I can suppress the multiple bindings issue by removing one of them in the Dockerfile, but it feels pretty hacky:

RUN rm /usr/share/java/cp-base-new/slf4j-simple-1.7.30.jar
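Before deleting anything, it may help to see which binding jars are actually present. The following is a minimal sketch, assuming a POSIX shell inside the container; `list_slf4j_bindings` is a hypothetical helper, and it only looks for the two binding names that appear in the log above (slf4j-log4j12 and slf4j-simple), so other bindings would go unnoticed.

```shell
#!/bin/sh
# Hypothetical helper: print the file names of the SLF4J binding jars found
# in the directory given as $1. Only the two bindings named in the log
# above are matched.
list_slf4j_bindings() {
  for jar in "$1"/slf4j-log4j12-*.jar "$1"/slf4j-simple-*.jar; do
    # An unmatched glob stays literal, so confirm the file really exists.
    [ -f "$jar" ] && printf '%s\n' "${jar##*/}"
  done
  return 0
}
```

Running `list_slf4j_bindings /usr/share/java/cp-base-new` in the image should list one line per binding jar; more than one line corresponds to the "multiple SLF4J bindings" warning.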

As for the "No appenders could be found for logger (io.confluent.admin.utils.cli.KafkaReadyCommand)" error, I have no solution. I've tried copying log4j.properties files into different places and it all results in the same issue. Since the logger isn't configured properly, I have no way of diagnosing the actual issue. The logs just hang there until the pod throws a silent error and k8s terminates it.

OneCricketeer commented 2 years ago

@jsnb-devoted You shouldn't need an alpine container/stage just to use curl.

Also, you should be able to run your container in any environment, including locally, to debug any log4j startup issues. If you do figure them out outside of AWS, then you should expect to see logs for IAM errors, for example.

At the very least, you should try setting the SCHEMA_REGISTRY_LOG4J_LOGGERS variable.

tonyqiu2020 commented 2 years ago

@jsnb-devoted I have the exact same error about logs. Do you have a working solution yet?

jsnb-devoted commented 2 years ago

@jsnb-devoted I have the exact same error about logs. Do you have a working solution yet?

@tonyqiu2020 I was able to resolve the multiple SLF4J bindings warning by adding RUN rm /usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar to the Docker image.

I think @OneCricketeer was right about my larger issue being the SCHEMA_REGISTRY_LOG4J_LOGGERS variable. I'm using the helm chart so I added this to the chart:

  configurationOverrides:
    "log4j_loggers": "org.apache.kafka=ERROR,io.confluent.rest.exceptions=FATAL"

If you aren't using the Helm chart, I think you just need to set that environment variable in whatever way your deployment sets environment variables:

SCHEMA_REGISTRY_LOG4J_LOGGERS="org.apache.kafka=ERROR,io.confluent.rest.exceptions=FATAL"

ekeric13 commented 2 years ago

Using the schema registry image 7.1.1 and hitting a similar error log with a slightly different stack trace:

org.apache.kafka.common.config.ConfigException: Invalid value "software.amazon.msk.auth.iam.IAMClientCallbackHandler" for configuration sasl.client.callback.handler.class: Class "software.amazon.msk.auth.iam.IAMClientCallbackHandler" could not be found.
    at org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:744)
    at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:490)
    at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:483)

Based on the above comments, I did:

USER root
RUN wget -O /usr/share/java/aws-msk-iam-auth.jar https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.1/aws-msk-iam-auth-1.1.1-all.jar
RUN cp /usr/share/java/aws-msk-iam-auth.jar /usr/share/java/cp-base-new/
RUN cp /usr/share/java/aws-msk-iam-auth.jar /usr/share/java/schema-registry/
RUN rm /usr/share/java/aws-msk-iam-auth.jar
USER appuser

After also adding these env variables, I was able to get it working:

KAFKASTORE_SECURITY_PROTOCOL=SASL_SSL
KAFKASTORE_SASL_MECHANISM=AWS_MSK_IAM
KAFKASTORE_SASL_JAAS_CONFIG=software.amazon.msk.auth.iam.IAMLoginModule required;
KAFKASTORE_SASL_CLIENT_CALLBACK_HANDLER_CLASS=software.amazon.msk.auth.iam.IAMClientCallbackHandler

How come we need to add aws-msk-iam-auth-1.0.0-all.jar to multiple paths and not just to /usr/share/java/cp-base-new/? From the docs it seemed like we only need to add it to the CLASSPATH. I noticed schema-registry has CUB_CLASSPATH="/usr/share/java/cp-base-new/*", but even when I set that env variable manually, schema-registry still wouldn't boot correctly. I don't quite understand how schema-registry finds and uses the jar file.

ekeric13 commented 2 years ago

I'm also curious what the minimal set of permissions is that you were able to get working. This is what I ended up with, but the process has been very trial and error, and I am wondering whether fewer would do:

    actions = [
      "kafka-cluster:Connect",
      "kafka-cluster:DescribeCluster",
      "kafka-cluster:DescribeClusterDynamicConfiguration",
      "kafka-cluster:DescribeTopic",
      "kafka-cluster:DescribeTopicDynamicConfiguration",
      "kafka-cluster:WriteData",
      "kafka-cluster:AlterTopic",
      "kafka-cluster:DescribeGroup",
      "kafka-cluster:AlterGroup",
      "kafka-cluster:ReadData"
    ]
    resources = [
      "arn:aws:kafka:region:account-id:cluster/cluster-name/*",
      "arn:aws:kafka:region:account-id:topic/cluster-name/*/_schemas",
      "arn:aws:kafka:region:account-id:group/cluster-name/*/schema-registry"
    ]

miguellgramacho96 commented 2 years ago

Aside from workarounds, can we get a word from Confluent on whether there's an out of the box configuration for this or is it part of the roadmap to add it as a feature?

sebiwi commented 1 year ago

@ekeric13 I think you're missing kafka-cluster:CreateTopic on the actions list.

ekeric13 commented 1 year ago

@sebiwi Would I still need that if I already created the schemas topic myself? Or is that action used for other things?

sebiwi commented 1 year ago

@ekeric13 no, that action is used to create the _schemas topic (or whatever you set as <kafkastore.topic>), and is not used afterwards AFAIK.

pecigonzalo commented 1 year ago

This is what worked for me:

# Inject AWS IAM Auth to Confluent image
FROM confluentinc/cp-schema-registry:7.2.2

USER root
ADD --chown=appuser:appuser https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.4/aws-msk-iam-auth-1.1.4-all.jar /usr/share/java/cp-base-new/
ADD --chown=appuser:appuser https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.4/aws-msk-iam-auth-1.1.4-all.jar /usr/share/java/schema-registry/

ENV SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL="SASL_SSL"
ENV SCHEMA_REGISTRY_KAFKASTORE_SASL_MECHANISM="AWS_MSK_IAM"
ENV SCHEMA_REGISTRY_KAFKASTORE_SASL_JAAS_CONFIG="software.amazon.msk.auth.iam.IAMLoginModule required;"
ENV SCHEMA_REGISTRY_KAFKASTORE_SASL_CLIENT_CALLBACK_HANDLER_CLASS="software.amazon.msk.auth.iam.IAMClientCallbackHandler"

USER appuser

It's important to add the jar with chown, otherwise it's not recognized. You need to add it to both places, as one is used by the Kafka admin connection and the other by the schema registry process (why they are different, I have no clue).
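A minimal sketch for double-checking this from a shell inside the built image, assuming the two directories from the Dockerfile above; `missing_iam_jar_dirs` is a hypothetical helper, not something the Confluent image provides. It prints each directory that has no readable aws-msk-iam-auth jar for the current user, so running it as appuser should also surface an ownership or permission problem.

```shell
#!/bin/sh
# Hypothetical helper: for every directory passed as an argument, print the
# directory if it contains no readable aws-msk-iam-auth*.jar.
# Prints nothing when every directory is covered.
missing_iam_jar_dirs() {
  for dir in "$@"; do
    found=0
    for jar in "$dir"/aws-msk-iam-auth*.jar; do
      # An unmatched glob stays literal, so test the file itself.
      if [ -f "$jar" ] && [ -r "$jar" ]; then
        found=1
        break
      fi
    done
    [ "$found" -eq 1 ] || printf '%s\n' "$dir"
  done
}
```

It could be called as `missing_iam_jar_dirs /usr/share/java/cp-base-new /usr/share/java/schema-registry`; empty output means both locations have a jar the current user can read.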

hongbo-miao commented 1 month ago

To address @smasilamani-cfins, I confirmed that simply mounting the jar file at runtime doesn't work. This might be due to the file ownership issue, as @pecigonzalo mentioned.

Here's my full deployment code. Hope it helps save time for others who come here in the future ☺️