Open manuelbonk opened 3 years ago
Thanks for bringing this up; I am sorry you are having these problems.
With latest, I changed the way the JVM is packaged in the Docker image to make it smaller, so it might be missing something that Java requires.
I just tested the master code with the Docker image locally, using the setup under docker/tls, and noticed no errors, but I will keep testing tomorrow.
Looking forward to sorting this out.
For context: AdminClient usage is really simple here; all configs are passed straight to it, so there should be no difference.
I'll keep you posted.
Hi @manuelbonk. I finally took some time to run a couple of tests for this report. Here is what I did.
This is what I get:
docker run -t -i \
-v /Users/pere/work/gitops/kafka-topology-builder/example:/example \
-v /Users/pere/work/gitops/kafka-topology-builder/docker:/docker \
--network tls_default \
purbon/kafka-topology-builder:latest \
julie-ops-cli.sh \
--brokers kafka.confluent.local:9093 \
--clientConfig /example/topology-builder-tls.properties --topology /example/descriptor.yaml
[INFO ] 2021-03-27 17:32:28.401 [main] TopologyBuilderAdminClientBuilder - Connecting AdminClient to kafka.confluent.local:9093
[INFO ] 2021-03-27 17:32:28.401 [main] TopologyBuilderAdminClientBuilder - Connecting AdminClient to kafka.confluent.local:9093
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
List of Topics:
context.company.env.source.projectC.topicE
context.company.env.source.projectC.topicF
context.company.env.source.projectA.bar.avro
.... redacted....
_confluent-monitoring
'TOPIC', '_confluent-monitoring', '*', 'CREATE', 'User:ControlCenter', 'LITERAL'
'TOPIC', '_confluent-monitoring', '*', 'WRITE', 'User:ControlCenter', 'LITERAL'
'TOPIC', '_confluent-monitoring', '*', 'DESCRIBE', 'User:ControlCenter', 'LITERAL'
'TOPIC', '_confluent-monitoring', '*', 'READ', 'User:ControlCenter', 'LITERAL'
status
'TOPIC', 'status', '*', 'WRITE', 'User:Connect1', 'LITERAL'
'TOPIC', 'status', '*', 'READ', 'User:Connect1', 'LITERAL'
List of Principles:
Kafka Topology updated
As you can see, I used latest, as in your report.
This is the property file I used:
/example/topology-builder-tls.properties
security.protocol=SSL
ssl.truststore.location=/docker/tls/certs/truststore.jks
ssl.truststore.type=PKCS12
ssl.truststore.password=test1234
ssl.keystore.location=/docker/tls/certs/server.keystore.jks
ssl.keystore.password=test1234
ssl.keystore.type=PKCS12
ssl.key.password=test1234
ssl.endpoint.identification.algorithm=
This file is available in the running container and is used to connect and run the commands.
As you can see from
nearly every config given is then passed to the AdminClient.
I wonder,
Caused by: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
This indicates that something unrecognized arrived during the SSL handshake attempt.
Would you mind replicating and validating my tests here? I would love to help find the needle in the haystack in your situation.
In both JulieOps and Kafka, I would suggest turning on debug logging. This would help to debug the problem.
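Independently of JulieOps' own logging, the JVM's built-in JSSE trace can show where the handshake breaks. The following is a minimal sketch, assuming the julie-ops-cli.sh launcher forwards the JULIE_OPS_OPTIONS environment variable to the java command (the variable is mentioned elsewhere in this thread, but that behaviour is not confirmed here). The names run_julie_with_tls_debug and JULIE_CMD are hypothetical and exist only for illustration; javax.net.debug itself is a standard JVM system property.

```shell
# Hypothetical wrapper: runs JulieOps with JSSE handshake tracing enabled.
# Assumption: the launcher passes JULIE_OPS_OPTIONS through to the JVM.
# JULIE_CMD is an illustrative override so the launcher can be swapped out.
run_julie_with_tls_debug() {
  JULIE_OPS_OPTIONS="${JULIE_OPS_OPTIONS:-} -Djavax.net.debug=ssl,handshake" \
    "${JULIE_CMD:-julie-ops-cli.sh}" "$@"
}

# Usage (not executed here):
#   run_julie_with_tls_debug --clientConfig /tmp/ktb.properties --topology /tmp/ansible.u353lgc6
```

Any options already present in JULIE_OPS_OPTIONS are kept and the debug flag is appended, so the trace can be switched on for a single run without editing the properties file.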
Hi Pere, thanks for your reply. I've set up the properties file just like yours:
security.protocol=SSL
ssl.keystore.location=/etc/kafka/secrets/kafka.keystore.jks
ssl.keystore.password=XXX
ssl.truststore.location=/etc/kafka/secrets/kafka.truststore.jks
ssl.truststore.password=XXX
ssl.endpoint.identification.algorithm=
bootstrap.servers=100.127.129.155:9093,100.127.129.156:9093,100.127.129.157:9093,100.127.129.158:9093
ssl.keystore.type=PKCS12
ssl.truststore.type=PKCS12
How can I enable debug logging for Julie? I couldn't find any information on that in the documentation. I'm only passing the client config and the topology directory, and I still get the same result:
bash-5.1$ julie-ops-cli.sh --clientConfig /tmp/ktb.properties --topology /tmp/ansible.u353lgc6
[INFO ] 2021-03-29 17:02:03.832 [main] TopologyBuilderAdminClientBuilder - Connecting AdminClient to 100.127.129.155:9093,100.127.129.156:9093,100.127.129.157:9093,100.127.129.158:9093
[INFO ] 2021-03-29 17:02:03.832 [main] TopologyBuilderAdminClientBuilder - Connecting AdminClient to 100.127.129.155:9093,100.127.129.156:9093,100.127.129.157:9093,100.127.129.158:9093
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: Problem during the health-check operation
at com.purbon.kafka.topology.api.adminclient.TopologyBuilderAdminClient.healthCheck(TopologyBuilderAdminClient.java:62)
at com.purbon.kafka.topology.api.adminclient.TopologyBuilderAdminClientBuilder.build(TopologyBuilderAdminClientBuilder.java:30)
at com.purbon.kafka.topology.JulieOps.build(JulieOps.java:70)
at com.purbon.kafka.topology.CommandLineInterface.processTopology(CommandLineInterface.java:195)
at com.purbon.kafka.topology.CommandLineInterface.run(CommandLineInterface.java:144)
at com.purbon.kafka.topology.CommandLineInterface.main(CommandLineInterface.java:134)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at com.purbon.kafka.topology.api.adminclient.TopologyBuilderAdminClient.healthCheck(TopologyBuilderAdminClient.java:60)
... 5 more
Caused by: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
at java.base/sun.security.ssl.SSLEngineInputRecord.bytesInCompletePacket(Unknown Source)
at java.base/sun.security.ssl.SSLEngineInputRecord.bytesInCompletePacket(Unknown Source)
at java.base/sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
at java.base/javax.net.ssl.SSLEngine.unwrap(Unknown Source)
at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:509)
at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:368)
at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:291)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:173)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1329)
at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1260)
at java.base/java.lang.Thread.run(Unknown Source)
Yes, I'm accessing the TLS port; see "Additional context" in my initial post, where I get a response with openssl s_client.
I have a guess. Would you mind trying something for me? Can you try again, but:
I have the feeling it might be the parsing of this that gets "funny".
Let me know if you can try this. It would be very helpful.
-- Pere
I've tried it without bootstrap.servers in the file and by passing only one broker; still the same issue.
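The attempt described above can be sketched as follows. This is only an illustration: strip_bootstrap_servers and the output file name are hypothetical, while the --brokers and --clientConfig flags are the ones already used earlier in this thread.

```shell
# Hypothetical helper: copy a properties file, dropping any bootstrap.servers line.
# $1 = source properties file, $2 = destination file.
strip_bootstrap_servers() {
  sed '/^[[:space:]]*bootstrap\.servers[[:space:]]*=/d' "$1" > "$2"
}

# Usage (not executed here):
#   strip_bootstrap_servers /tmp/ktb.properties /tmp/ktb-nobootstrap.properties
#   julie-ops-cli.sh --brokers 100.127.129.155:9093 \
#     --clientConfig /tmp/ktb-nobootstrap.properties --topology /tmp/ansible.u353lgc6
```

Working on a copy keeps the original properties file intact, so the same file can still be used with kafka-topics for comparison.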
I've used the latest version of the julie docker image and also tried an old version of kafka-topology-builder. Both were running on the same K8s cluster, against the same Kafka cluster with the exact same topology files, properties files and credentials. The former failed, the latter succeeded...
I've also tried it on different kafka clusters, still the same error.
I'm running out of ideas... Are there any means to increase the verbosity of julie?
I think so; this is how I solve TLS connection issues that involve external clients, such as HTTPS ones.
Message from Fobhep @.***>, Tue, 11 May 2021 at 16:48:
Could this possibly be "solved" by exporting the truststore (and maybe the keystore) via JULIE_OPS_OPTIONS?
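A minimal sketch of that idea, assuming julie-ops-cli.sh hands JULIE_OPS_OPTIONS to the JVM (not confirmed in this thread). The javax.net.ssl.* names are standard JSSE system properties that set the JVM-wide default truststore; the helper name jvm_truststore_opts is hypothetical, and the path in the usage comment is the one from the properties file posted above.

```shell
# Hypothetical helper: prints JVM flags that set the JVM-wide default truststore.
# $1 = truststore path, $2 = password, $3 = store type (defaults to PKCS12).
# Note: values containing spaces are not handled by this sketch.
jvm_truststore_opts() {
  printf '%s %s %s\n' \
    "-Djavax.net.ssl.trustStore=$1" \
    "-Djavax.net.ssl.trustStorePassword=$2" \
    "-Djavax.net.ssl.trustStoreType=${3:-PKCS12}"
}

# Usage (not executed here):
#   export JULIE_OPS_OPTIONS="$(jvm_truststore_opts /etc/kafka/secrets/kafka.truststore.jks XXX)"
#   julie-ops-cli.sh --clientConfig /tmp/ktb.properties --topology /tmp/ansible.u353lgc6
```

These flags only change the JVM defaults; settings such as ssl.truststore.location in the client properties file are still passed to the AdminClient as before.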
-- Pere Urbon-Bayes
I seem to only be able to replicate this on the Alpine Linux version of purbon/kafka-topology-builder created from the nightly build. This may have something to do with your comment:
With latest, I changed the way the JVM is packaged in the Docker image to make it smaller, so it might be missing something that Java requires.
The strangest part is that I can rerun the same command multiple times in a row without changing anything, and around 60% of the time it throws the following error, while the rest of the time it works as expected. Maybe it's some kind of race condition?
Exception in thread "main" java.io.IOException: Problem during the health-check operation
...
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
...
Caused by: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
...
at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1289)
at java.base/java.lang.Thread.run(Unknown Source)
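To put a number on an intermittent failure like the one above, the same command can be repeated and the non-zero exits counted. This is a small sketch only; count_failures is a hypothetical helper, and it assumes the CLI exits with a non-zero status when the health check throws.

```shell
# Hypothetical helper: run a command N times and report how many runs failed.
# $1 = number of runs, remaining arguments = the command to execute.
count_failures() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$runs runs failed"
}

# Usage (not executed here):
#   count_failures 20 julie-ops-cli.sh --clientConfig ./JulieOps/test-environment.properties \
#     --overridingClientConfig ./JulieOps/test-environment-override.properties \
#     --topology ./JulieOps/output.yml
```

Running the same loop against both the Alpine-based image and another image would show whether the failure rate really differs between them.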
I'm using Confluent Cloud, and my config is like so:
test-environment.properties:
bootstrap.servers=***
schema.registry.url=***
test-environment-override.properties:
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="***" password="***";
basic.auth.credentials.source=USER_INFO
schema.registry.basic.auth.user.info=***:***
allow.delete.topics=true
topology.features.experimental=true
allow.delete.principals=false
allow.delete.bindings=true
topology.service.accounts.managed.prefixes.0=Service
topology.translation.principal.enabled=true
ccloud.environment=***
topology.state.cluster.enabled=true
output.yml:
---
# Source: kafka-topology/templates/topology.yaml
context: env
instance: test-environment
projects:
- name: ***
topics:
- name: "***"
producers:
- principal: "Service:***"
consumers:
- principal: "Service:***"
group: "****"
schemas:
value.schema.file: schemas/***.avsc
value.format: "AVRO"
value.compatibility: "FULL_TRANSITIVE"
config:
replication.factor: "3"
num.partitions: "1"
confluent.key.schema.validation: true
confluent.value.schema.validation: true
command:
julie-ops-cli.sh --clientConfig ./JulieOps/test-environment.properties --overridingClientConfig ./JulieOps/test-environment-override.properties --topology ./JulieOps/output.yml
Describe the bug
I've created a Kafka admin client properties file. This file works perfectly with kafka-topics and the other Kafka CLI tools. If I pass julie the very same properties file, the SSL handshake fails.
To Reproduce
Steps to reproduce the behavior:
kafka-topics:
Expected behavior
julie successfully connects to the Kafka cluster.
Runtime (please complete the following information):
Additional context
Connecting to the brokers with openssl s_client works. The handshake fails as the operating system doesn't trust the self-signed CA. This shouldn't be an issue for julie, as julie should use the truststore specified in the admin config file, which contains the CA.
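The openssl check mentioned above can also be pointed at the self-signed CA, so that certificate verification is tested against the same CA the truststore contains. A minimal sketch: tls_probe and the OPENSSL override are hypothetical names for illustration, and ca.pem stands for a PEM export of that CA, which is not part of this report. The s_client options -connect and -CAfile are standard.

```shell
# Hypothetical helper: open a TLS connection to a broker and print the handshake result.
# $1 = host:port, $2 = optional PEM file with the CA to verify against.
# OPENSSL is an illustrative override for the openssl binary.
tls_probe() {
  endpoint=$1
  cafile=${2:-}
  if [ -n "$cafile" ]; then
    "${OPENSSL:-openssl}" s_client -connect "$endpoint" -CAfile "$cafile" </dev/null
  else
    "${OPENSSL:-openssl}" s_client -connect "$endpoint" </dev/null
  fi
}

# Usage (not executed here):
#   tls_probe 100.127.129.155:9093 ca.pem
```

Redirecting stdin from /dev/null makes s_client exit after the handshake instead of waiting for input, which is convenient when checking several brokers in a row.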