confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

Schema-Registry doesn't connect to SSL kafka cluster #1792

Open artemisb22 opened 3 years ago

artemisb22 commented 3 years ago

I'm trying to deploy the latest schema-registry Docker image on k8s and connect it to an SSL-secured Kafka cluster. I'm mounting the CA and user certs and defining the parameters as environment variables:

      - env:
        - name: SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL
          value: SSL
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_TRUSTSTORE_LOCATION
          value: /var/certs/user-truststore.jks
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_TRUSTSTORE_PASSWORD
          value: <hidden>
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_KEYSTORE_LOCATION
          value: /var/certs/user-keystore.p12
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_KEYSTORE_PASSWORD
          value: <hidden>
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_KEY_PASSWORD
          value: <hidden>
        - name: SCHEMA_REGISTRY_LISTENERS
          value: http://0.0.0.0:8081
        - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
          value: SSL://kafka-bootstrap:9093
        - name: SCHEMA_REGISTRY_KAFKASTORE_GROUP_ID
          value: schema-registry
        - name: SCHEMA_REGISTRY_MASTER_ELIGIBILITY
          value: "true"
        - name: SCHEMA_REGISTRY_KAFKASTORE_SSL_KEYSTORE_TYPE
          value: PKCS12
        - name: SCHEMA_REGISTRY_HOST_NAME
          value: cp-schema-registry-server
        - name: SCHEMA_REGISTRY_HEAP_OPTS
          value: -Xms512M -Xmx512M
        - name: JMX_PORT
          value: "5555"
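
The variable names above and the property names in the logs further down line up under a simple pattern: drop the `SCHEMA_REGISTRY_` prefix, lowercase the rest, and turn underscores into dots. The sketch below only illustrates that pattern as it appears in this thread. It is not the image's actual startup script, which may handle more cases, and `env_to_property` is a hypothetical helper name.

```python
PREFIX = "SCHEMA_REGISTRY_"


def env_to_property(name: str) -> str:
    """Illustrative only: map an env var name to the property name seen in the logs."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name!r} does not start with {PREFIX!r}")
    # Drop the prefix, lowercase, and turn underscores into dots.
    return name[len(PREFIX):].lower().replace("_", ".")


if __name__ == "__main__":
    for var in (
        "SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL",
        "SCHEMA_REGISTRY_KAFKASTORE_SSL_KEYSTORE_TYPE",
        "SCHEMA_REGISTRY_HOST_NAME",
    ):
        print(f"{var} -> {env_to_property(var)}")
```

Comparing the result against the `SchemaRegistryConfig values` dump is a quick way to confirm that each variable was picked up. Variables such as `SCHEMA_REGISTRY_HEAP_OPTS` and `JMX_PORT` do not show up as properties in that dump.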

As the container logs show, it does not complain about them:

===> User
uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
===> Configuring ...
===> Running preflight checks ...
===> Check if Kafka is healthy ...
[main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values:
        bootstrap.servers = [SSL://kafka-bootstrap:9093]
        client.dns.lookup = use_all_dns_ips
        client.id =
        connections.max.idle.ms = 300000
        default.api.timeout.ms = 60000
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        receive.buffer.bytes = 65536
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retries = 2147483647
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.mechanism = GSSAPI
        security.protocol = SSL
        security.providers = null
        send.buffer.bytes = 131072
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
        ssl.endpoint.identification.algorithm = https
        ssl.engine.factory.class = null
        ssl.key.password = [hidden]
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = /var/certs/user-keystore.p12
        ssl.keystore.password = [hidden]
        ssl.keystore.type = PKCS12
        ssl.protocol = TLSv1.3
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = /var/certs/user-truststore.jks
        ssl.truststore.password = [hidden]
        ssl.truststore.type = JKS

[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.keystore.type' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.truststore.location' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.keystore.password' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.key.password' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'group.id' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.keystore.location' was supplied but isn't a known config.
[main] WARN org.apache.kafka.clients.admin.AdminClientConfig - The configuration 'ssl.truststore.password' was supplied but isn't a known config.
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 6.0.2-ccs
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: a58736d0602d24aa
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1615214096609
===> Launching ...
===> Launching schema-registry ...
[2021-03-08 14:35:34,106] INFO SchemaRegistryConfig values:
        access.control.allow.headers =
        access.control.allow.methods =
        access.control.allow.origin =
        access.control.skip.options = true
        authentication.method = NONE
        authentication.realm =
        authentication.roles = [*]
        authentication.skip.paths = []
        avro.compatibility.level =
        compression.enable = true
        csrf.prevention.enable = false
        csrf.prevention.token.endpoint = /csrf
        csrf.prevention.token.expiration.minutes = 30
        csrf.prevention.token.max.entries = 10000
        debug = false
        host.name = cp-schema-registry-server
        idle.timeout.ms = 30000
        inter.instance.headers.whitelist = []
        inter.instance.protocol = http
        kafkastore.bootstrap.servers = [SSL://kafka-bootstrap:9093]
        kafkastore.connection.url =
        kafkastore.group.id = schema-registry
        kafkastore.init.timeout.ms = 60000
        kafkastore.sasl.kerberos.kinit.cmd = /usr/bin/kinit
        kafkastore.sasl.kerberos.min.time.before.relogin = 60000
        kafkastore.sasl.kerberos.service.name =
        kafkastore.sasl.kerberos.ticket.renew.jitter = 0.05
        kafkastore.sasl.kerberos.ticket.renew.window.factor = 0.8
        kafkastore.sasl.mechanism = GSSAPI
        kafkastore.security.protocol = SSL
        kafkastore.ssl.cipher.suites =
        kafkastore.ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1
        kafkastore.ssl.endpoint.identification.algorithm =
        kafkastore.ssl.key.password = [hidden]
        kafkastore.ssl.keymanager.algorithm = SunX509
        kafkastore.ssl.keystore.location = /var/certs/user-keystore.p12
        kafkastore.ssl.keystore.password = [hidden]
        kafkastore.ssl.keystore.type = PKCS12
        kafkastore.ssl.protocol = TLS
        kafkastore.ssl.provider =
        kafkastore.ssl.trustmanager.algorithm = PKIX
        kafkastore.ssl.truststore.location = /var/certs/user-truststore.jks
        kafkastore.ssl.truststore.password = [hidden]
        kafkastore.ssl.truststore.type = JKS
        kafkastore.timeout.ms = 500
        kafkastore.topic = _schemas
        kafkastore.topic.replication.factor = 3
        kafkastore.update.handlers = []
        kafkastore.write.max.retries = 5
        kafkastore.zk.session.timeout.ms = 30000
        leader.eligibility = true
        listeners = [http://0.0.0.0:8081]
        master.eligibility = true
        metric.reporters = []
        metrics.jmx.prefix = kafka.schema.registry
        metrics.num.samples = 2
        metrics.sample.window.ms = 30000
        metrics.tag.map = []
        mode.mutability = false
        port = 8081
        request.logger.name = io.confluent.rest-utils.requests
        request.queue.capacity = 2147483647
        request.queue.capacity.growby = 64
        request.queue.capacity.init = 128
        resource.extension.class = []
        resource.extension.classes = []
        resource.static.locations = []
        response.http.headers.config =
        response.mediatype.default = application/vnd.schemaregistry.v1+json
        response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
        rest.servlet.initializor.classes = []
        schema.compatibility.level = backward
        schema.providers = []
        schema.registry.group.id = schema-registry
        schema.registry.inter.instance.protocol =
        schema.registry.resource.extension.class = []
        schema.registry.zk.namespace = schema_registry
        shutdown.graceful.ms = 1000
        ssl.cipher.suites = []
        ssl.client.auth = false
        ssl.client.authentication = NONE
        ssl.enabled.protocols = []
        ssl.endpoint.identification.algorithm = null
        ssl.key.password = [hidden]
        ssl.keymanager.algorithm =
        ssl.keystore.location =
        ssl.keystore.password = [hidden]
        ssl.keystore.reload = false
        ssl.keystore.type = JKS
        ssl.keystore.watch.location =
        ssl.protocol = TLS
        ssl.provider =
        ssl.trustmanager.algorithm =
        ssl.truststore.location =
        ssl.truststore.password = [hidden]
        ssl.truststore.type = JKS
        thread.pool.max = 200
        thread.pool.min = 8
        websocket.path.prefix = /ws
        websocket.servlet.initializor.classes = []
        zookeeper.set.acl = false
 (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2021-03-08 14:35:35,709] INFO Logging initialized @19601ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2021-03-08 14:35:36,106] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2021-03-08 14:35:39,009] INFO Adding listener: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2021-03-08 14:35:46,908] INFO AdminClientConfig values:
        bootstrap.servers = [SSL://kafka-bootstrap:9093]
        client.dns.lookup = use_all_dns_ips
        client.id =
        connections.max.idle.ms = 300000
        default.api.timeout.ms = 60000
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        receive.buffer.bytes = 65536
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retries = 2147483647
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.mechanism = GSSAPI
        security.protocol = SSL
        security.providers = null
        send.buffer.bytes = 131072
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
        ssl.endpoint.identification.algorithm = https
        ssl.engine.factory.class = null
        ssl.key.password = [hidden]
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = /var/certs/user-keystore.p12
        ssl.keystore.password = [hidden]
        ssl.keystore.type = PKCS12
        ssl.protocol = TLSv1.3
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = /var/certs/user-truststore.jks
        ssl.truststore.password = [hidden]
        ssl.truststore.type = JKS

But I'm still getting this error:

[2021-03-08 14:36:17,306] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryException: Failed to get Kafka cluster ID
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1228)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:156)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:69)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:88)
        at io.confluent.rest.Application.configureHandler(Application.java:255)
        at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:196)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: java.util.concurrent.TimeoutException
        at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1226)
        ... 7 more
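
A bare `TimeoutException` like the one above does not say whether the broker was unreachable, whether the TLS handshake stalled, or whether the cluster was just slow to answer. One way to narrow this down from inside the pod is a small probe that separates the TCP connect from the TLS handshake. This is a minimal sketch using only the Python standard library. `probe_tls` is a hypothetical helper, the CA file has to be in PEM format (the JKS truststore from this thread would need to be exported first), and the probe does not present a client certificate, so it cannot confirm that mutual TLS or authorization would succeed.

```python
import socket
import ssl
from typing import Optional


def probe_tls(host: str, port: int, timeout: float = 5.0,
              cafile: Optional[str] = None) -> str:
    """Report whether the TCP connect or the TLS handshake is the failing step."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:  # DNS failure, refused connection, timeout, ...
        return f"tcp_failed: {exc}"

    # With cafile=None the system trust store is used; pass a PEM CA bundle otherwise.
    ctx = ssl.create_default_context(cafile=cafile)
    try:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return f"ok: {tls.version()}"
    except OSError as exc:  # ssl.SSLError and handshake timeouts are OSError subclasses
        return f"tls_failed: {exc}"
    finally:
        sock.close()


# Example call, with host and port taken from the thread and an assumed PEM path:
# print(probe_tls("kafka-bootstrap", 9093, cafile="/var/certs/ca.pem"))
```

A `tcp_failed` result points at networking or DNS, `tls_failed` points at certificates or hostname verification, and `ok` suggests looking at timeouts or pod resources instead.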

justpolidor commented 3 years ago

Interesting... did you get any updates on this? I have the same problem on a Strimzi Kafka setup with authorization enabled.

cortopy commented 3 years ago

I'm also having this problem, but with PLAINTEXT. Increasing resources seems to work, at least for now. It's an intermittent issue for me, though.

justpolidor commented 3 years ago

@cortopy thanks to your comment, we removed the resource limits on our deployment and it started to work!

JigeeshaJain commented 2 years ago

"I'm also having this problem but with PLAINTEXT. Increasing resources seems to work at least for now. It's an intermittent issue for me though"

How did you increase the resources? I am getting the error that says it failed to get the Kafka cluster ID.

juresaht2 commented 1 year ago

The keystores are listed as blank here, but this turns out not to matter: the settings from the configuration files are used anyway.

In my case they were not picked up, because I had failed to restart the correct service... the broker systemd service is called confluent-kafka, not confluent-kafka-broker, which breaks the pattern of all the other Kafka services.