strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Enabling Kafka using Cluster1 CA cert- SSL handshake failed error] ... #3281

Closed vperi1730 closed 4 years ago

vperi1730 commented 4 years ago

Hi Team,

I have a requirement involving two clusters, Cluster1 and Cluster2. While launching Kafka inside Cluster2, I want to configure the Cluster1 CA certificate in the YAML file so that the SSL handshake uses Cluster1's ca.crt for both the internal and external bootstrap services running on ports 9093 and 9094.

My approach in the Kafka CR is shown below. Is this the correct way to configure it? If not, please suggest the correct way.

listeners:
      external:
        authentication:
          type: tls
          certificates:
          - |
            -----BEGIN CERTIFICATE-----
            //This has Cluster1 ca crt content
            -----END CERTIFICATE-----
        overrides:
          bootstrap:
            address: kafka.dns.acl.com
        tls: true
        type: loadbalancer
      plain: {}
      tls:
        authentication:
          type: tls
          certificates:
          - |
            -----BEGIN CERTIFICATE-----
            //This has Cluster1 ca.crt
            -----END CERTIFICATE-----

After this, I used the same ca.crt as the truststore and my KafkaUser as the keystore to run the console producer.

./bin/kafka-console-producer.sh --broker-list mm-backup-cluster-kafka-bootstrap:9093 --topic mm-src-cluster.mm2-topic \
> --producer-property security.protocol=SSL \
> --producer-property ssl.truststore.type=PKCS12 \
> --producer-property ssl.keystore.type=PKCS12 \
> --producer-property ssl.truststore.password=123456 \
> --producer-property ssl.keystore.password=123456 \
> --producer-property ssl.truststore.location=/tmp/certs/src.cluster.truststore.p12 \
> --producer-property ssl.keystore.location=/tmp/certs/producer.keystore.p12
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
>[2020-07-03 11:08:58,678] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.246.8:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2020-07-03 11:08:58,839] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.246.8:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)

Need help here.

scholzj commented 4 years ago

So, first of all ... are you looking at sharing the Clients CA used for TLS client authentication? Or the Cluster CA which is used by the brokers? Also, do you want the CA to be generated by Strimzi on one cluster and used on the other as well? Or do you just want to provide your own certificates signed by your own CA?

vperi1730 commented 4 years ago

Ideally, I am looking for Q2 and Q3.

scholzj commented 4 years ago

So ... with Q2 and Q3 I assume you mean Cluster CA + generated by Strimzi? There are probably two options:

1) You can first deploy cluster A. Once it is deployed, you can take the Cluster CA secrets and copy them to cluster B, which you deploy configured with a custom CA as described here: https://strimzi.io/docs/operators/latest/full/using.html#installing-your-own-ca-certificates-str In this case, you will be responsible for handling the renewals etc. in cluster B yourself when the certificate is renewed in cluster A by Strimzi.

2) You can first deploy cluster A. After it is deployed, you can take its CA, manually (e.g. using OpenSSL) generate new certificates and use them as custom listener certificates (see https://strimzi.io/docs/operators/latest/full/using.html#kafka-listener-certificates-str). You would deploy the Cluster B with its own Strimzi managed CA, and just set the custom certificates for the listeners. You would still need to handle renewals in Cluster B, but not for the whole CA but just for the listeners.

I think what you should consider as a better option is to actually deploy both clusters with their own CAs. And then generate your own listener certificates (either with your own CA, or for example signed by Let's Encrypt etc.) and set them on both sides as custom listener certificates (https://strimzi.io/docs/operators/latest/full/using.html#kafka-listener-certificates-str). That would be probably what I would do.

vperi1730 commented 4 years ago

Thank you so much for the valuable options. We will try them out.

Another question: what would be your approach for replicating Kafka users from the source cluster to the target cluster, given that MM2 does not do this automatically?

scholzj commented 4 years ago

I think I would try to follow some GitOps approach to have them in a single source of truth for example in some Git repository and mirror them into the clusters. But the question is what exactly do you want to mirror - users? ACLs? Do you need to mirror the exact passwords for example? (or certificates?)

You could also use, for example, OAuth authentication and authorization, which keeps the users in an independent, centrally managed service used by both clusters.

vperi1730 commented 4 years ago

Our requirement is to mirror only the users and their ACLs into the target cluster.

scholzj commented 4 years ago

So, for mirroring the users and ACLs, I would just store the YAMLs in the Git repo and have some script to periodically apply them to make sure they are in sync (or use one of the existing GitOps tools). This should work fine for the user and ACLs. That said this will create different passwords / client certificates on each cluster.
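
As a rough illustration of the "script to periodically apply them" idea, the sketch below applies one directory of KafkaUser manifests to several namespaces. The function name `sync_users`, the `KUBECTL` override variable and the `./users` directory are hypothetical; the namespaces are the ones that appear elsewhere in this thread. It assumes the manifests do not hard-code `metadata.namespace`.

```shell
#!/usr/bin/env bash
# Sketch only: apply the same KafkaUser manifests to every target namespace.
# KUBECTL can be overridden (for example with "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# sync_users <manifest-dir> <namespace>...
sync_users() {
  local manifest_dir="$1"
  shift
  local ns
  for ns in "$@"; do
    # "apply" is idempotent, so running this repeatedly keeps the
    # KafkaUser resources (and the ACLs they declare) in sync with Git.
    "$KUBECTL" apply -n "$ns" -f "$manifest_dir"
  done
}

# Example invocation, e.g. from cron or a CI job after "git pull":
#   sync_users ./users kafka-mirror-src kafka-mirror
```

One caveat with this design: a KafkaUser carries a `strimzi.io/cluster` label naming its Kafka cluster, so if the clusters have different names the manifests would need per-cluster variants or some templating step before they are applied.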

vperi1730 commented 4 years ago

OK, I will look into this approach. Also, I am looking into how we get the root CA. Does it have to be created manually with OpenSSL, or can we get it from our current cluster with some command?

I am referring to the following content from the documentation.

If you want to use a cluster or clients CA which is not a Root CA, you have to include the whole chain in the certificate file. The chain should be in the following order:

The cluster or clients CA

One or more intermediate CAs

The root CA

All CAs in the chain should be configured as a CA in the X509v3 Basic Constraints.
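
To make the chain order concrete, here is a minimal sketch that builds a throwaway root, intermediate and cluster CA with OpenSSL and concatenates them in the order listed above. Every file name and subject is invented for the demo, and it assumes OpenSSL 1.1.1 or newer with a default configuration file present; it is not how Strimzi generates its own CAs.

```shell
#!/usr/bin/env bash
# Sketch only: demo chain root -> intermediate -> cluster CA in a temp dir.
set -e
workdir="$(mktemp -d)"
cd "$workdir"

# Extensions that mark a certificate as a CA (X509v3 Basic Constraints).
printf 'basicConstraints=critical,CA:TRUE\nkeyUsage=critical,keyCertSign,cRLSign\n' > ca.ext

# Root CA: self-signed from its own CSR.
openssl req -new -newkey rsa:2048 -nodes -keyout root.key -out root.csr \
  -subj "/CN=demo-root-ca"
openssl x509 -req -in root.csr -signkey root.key -out root.crt \
  -days 30 -extfile ca.ext

# Intermediate CA: signed by the root.
openssl req -new -newkey rsa:2048 -nodes -keyout inter.key -out inter.csr \
  -subj "/CN=demo-intermediate-ca"
openssl x509 -req -in inter.csr -CA root.crt -CAkey root.key -CAcreateserial \
  -out inter.crt -days 30 -extfile ca.ext

# Cluster CA: signed by the intermediate.
openssl req -new -newkey rsa:2048 -nodes -keyout cluster-ca.key -out cluster-ca.csr \
  -subj "/CN=cluster-ca/O=io.strimzi"
openssl x509 -req -in cluster-ca.csr -CA inter.crt -CAkey inter.key -CAcreateserial \
  -out cluster-ca.crt -days 30 -extfile ca.ext

# Chain file in the documented order: cluster CA, intermediate(s), root.
cat cluster-ca.crt inter.crt root.crt > ca.crt

# "openssl x509" reads the first certificate in the file, which should be the cluster CA.
first_subject="$(openssl x509 -in ca.crt -noout -subject)"
echo "$first_subject"

# Check that the cluster CA really chains up to the root.
verify_result="$(openssl verify -CAfile root.crt -untrusted inter.crt cluster-ca.crt)"
echo "$verify_result"
```

The resulting `ca.crt` is the kind of file the quoted documentation describes; the matching private key for the secret would be `cluster-ca.key`.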

scholzj commented 4 years ago

TBH, I'm not sure what exactly you mean. OpenSSL can be used to issue and manage new CAs, so it is one option. But there are obviously many such applications or services.

vperi1730 commented 4 years ago

Hi,

We are trying the following approach you have suggested.

You can first deploy cluster A. After it is deployed, you can take its CA, manually (e.g. using OpenSSL) generate new certificates and use them as custom listener certificates (see https://strimzi.io/docs/operators/latest/full/using.html#kafka-listener-certificates-str). You would deploy the Cluster B with its own Strimzi managed CA, and just set the custom certificates for the listeners. You would still need to handle renewals in Cluster B, but not for the whole CA but just for the listeners.

When you say "you can take its CA", are you referring to the Cluster CA, the Clients CA, or both? I have tried something like the following:

kubectl get secrets -n kafka-mirror mm-src-cluster-cluster-ca-cert -o jsonpath='{.data.ca\.crt}' | base64 -id > src-cluster-ca.crt

kubectl get secrets -n kafka-mirror mm-src-cluster-cluster-ca -o jsonpath='{.data.ca\.key}' | base64 -id > src-cluster-ca.key

kubectl create secret generic mm2-src-cluster-secret -n  kafka-mirror --from-file=src-cluster-ca.key --from-file=src-cluster-ca.crt

configuration:
  brokerCertChainAndKey:
    certificate: src-cluster-ca.crt
    key: src-cluster-ca.key
    secretName: mm2-src-cluster-secret

With the above change, I was able to deploy the Kafka CR successfully. However, while MM2 is coming up, it throws the following error:

2020-07-07 10:03:20,234 INFO [AdminClient clientId=adminclient-1] Failed authentication with mm-backup-cluster-kafka-bootstrap.kafka-mirror.svc/172.30.240.142 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [kafka-admin-client-thread | adminclient-1]
2020-07-07 10:03:20,236 WARN [AdminClient clientId=adminclient-1] Metadata update failed due to authentication error (org.apache.kafka.clients.admin.internals.AdminMetadataManager) [kafka-admin-client-thread | adminclient-1]
org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
        at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1521)
        at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:528)
        at sun.security.ssl.SSLEngineImpl.writeAppRecord(SSLEngineImpl.java:1197)
        at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:1165)
        at javax.net.ssl.SSLEngine.wrap(SSLEngine.java:469)
        at org.apache.kafka.common.network.SslTransportLayer.handshakeWrap(SslTransportLayer.java:448)
        at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:313)
        at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:265)
        at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:170)
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:547)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:483)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:540)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1196)
        at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1709)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:318)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
        at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
        at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
        at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
        at sun.security.ssl.Handshaker$1.run(Handshaker.java:970)
        at sun.security.ssl.Handshaker$1.run(Handshaker.java:967)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1459)
        at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:402)
        at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:484)
        at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:340)
        ... 7 more
Caused by: java.security.cert.CertificateException: No name matching mm-backup-cluster-kafka-bootstrap.kafka-mirror.svc found
        at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:231)
        at sun.security.util.HostnameChecker.match(HostnameChecker.java:96)
        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:462)
        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:428)
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:261)
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
        at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1626)
        ... 16 more

The internal bootstrap service is already up and running:
service/mm-backup-cluster-kafka-bootstrap            ClusterIP      172.30.240.142   <none>           9091/TCP,9092/TCP,9093/TCP,9404/TCP

Is it because, under the tls listener in kafka.yaml, I have added the configuration with brokerCertChainAndKey, which points only to the Cluster CA certificate and key?

Sorry for the long explanation, I hope you get it :)

2) Is this the correct way to configure

tls: 
           configuration:
             brokerCertChainAndKey:
               secretName: mm2-src-cluster-secret
               certificate: src-cluster-ca.crt
               key: src-cluster-ca.key
             brokerCertChainAndKey:
               secretName: mm2-src-client-secret
               certificate: src-client-ca.crt
               key: src-client-ca.key
           authentication:
             type: tls

Here is the error

2020-07-07 11:48:13,858 ERROR [Consumer clientId=consumer-mirrormaker2-cluster-2, groupId=mirrormaker2-cluster] Connection to node 1 (mm-backup-cluster-kafka-1.mm-backup-cluster-kafka-brokers.kafka-mirror.svc/10.124.24.196:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient) [KafkaBasedLog Work Thread - mirrormaker2-cluster-status]
2020-07-07 11:48:13,858 ERROR Error polling: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed (org.apache.kafka.connect.util.KafkaBasedLog) [KafkaBasedLog Work Thread - mirrormaker2-cluster-status]
2020-07-07 11:48:13,904 INFO [Producer clientId=producer-1] Failed authentication with mm-backup-cluster-kafka-0.mm-backup-cluster-kafka-brokers.kafka-mirror.svc/10.124.16.78 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [kafka-producer-network-thread | producer-1]
2020-07-07 11:48:13,904 ERROR [Producer clientId=producer-1] Connection to node 0 (mm-backup-cluster-kafka-0.mm-backup-cluster-kafka-brokers.kafka-mirror.svc/10.124.16.78:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient) [kafka-producer-network-thread | producer-1]
2020-07-07 11:48:14,010 INFO [Producer clientId=producer-1] Failed authentication with mm-backup-cluster-kafka-2.mm-backup-cluster-kafka-brokers.kafka-mirror.svc/10.124.18.17 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [kafka-producer-network-thread | producer-1]
2020-07-07 11:48:14,010 ERROR [Producer clientId=producer-1] Connection to node 2 (mm-backup-cluster-kafka-2.mm-backup-cluster-kafka-brokers.kafka-mirror.svc/10.124.18.17:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient) [kafka-producer-network-thread | producer-1]
2020-07-07 11:48:14,037 INFO [Worker clientId=connect-1, groupId=mirrormaker2-cluster] Herder stopped (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [Thread-10]
2020-07-07 11:48:14,038 INFO Kafka Connect stopped (org.apache.kafka.connect.runtime.Connect) [Thread-10]

Please suggest.

scholzj commented 4 years ago

I think you skipped the important step: "manually (e.g. using OpenSSL) generate new certificates". You need to use the CA and OpenSSL to generate a new signed certificate for the listener which contains all the valid hostnames. Without that, you get the error you got and would need to disable hostname verification.

Is this the correct way to configure

No - the brokerCertChainAndKey block can be there only once. It is only about the broker certificate and has nothing to do with the Clients CA.
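
For the manual signing step, a minimal sketch with OpenSSL might look like the following. It generates a throwaway CA as a stand-in for the real Cluster CA (in practice `src-cluster-ca.crt` and `src-cluster-ca.key` would be the files extracted with kubectl earlier in this thread), then signs a listener certificate whose subject alternative names cover the hostnames seen in the error logs. The SAN list is an assumption and probably not exhaustive, and the file names `listener.key`, `listener.crt` and `listener.ext` are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch only: sign a listener certificate with a CA and embed the hostnames.
set -e
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in CA so the script runs on its own; replace with the real Cluster CA files.
printf 'basicConstraints=critical,CA:TRUE\n' > ca.ext
openssl req -new -newkey rsa:2048 -nodes -keyout src-cluster-ca.key -out ca.csr \
  -subj "/CN=cluster-ca v0/O=io.strimzi"
openssl x509 -req -in ca.csr -signkey src-cluster-ca.key -out src-cluster-ca.crt \
  -days 30 -extfile ca.ext

# Hostnames the clients connect to (taken from the logs in this thread).
cat > listener.ext <<'EOF'
basicConstraints = CA:FALSE
subjectAltName = @alt_names

[alt_names]
DNS.1 = mm-backup-cluster-kafka-bootstrap
DNS.2 = mm-backup-cluster-kafka-bootstrap.kafka-mirror.svc
DNS.3 = mm-backup-cluster-kafka-0.mm-backup-cluster-kafka-brokers.kafka-mirror.svc
DNS.4 = mm-backup-cluster-kafka-1.mm-backup-cluster-kafka-brokers.kafka-mirror.svc
DNS.5 = mm-backup-cluster-kafka-2.mm-backup-cluster-kafka-brokers.kafka-mirror.svc
EOF

# New key and CSR for the listener, then sign the CSR with the CA.
openssl req -new -newkey rsa:2048 -nodes -keyout listener.key -out listener.csr \
  -subj "/CN=mm-backup-cluster-kafka-bootstrap/O=io.strimzi"
openssl x509 -req -in listener.csr -CA src-cluster-ca.crt -CAkey src-cluster-ca.key \
  -CAcreateserial -out listener.crt -days 365 -sha256 -extfile listener.ext

# Sanity checks: the certificate chains to the CA and carries the SANs.
verify_result="$(openssl verify -CAfile src-cluster-ca.crt listener.crt)"
echo "$verify_result"
san_text="$(openssl x509 -in listener.crt -noout -text | grep -A1 'Subject Alternative Name')"
echo "$san_text"

# The secret referenced by brokerCertChainAndKey would then hold listener.crt and
# listener.key (not the CA's own certificate and key), for example:
#   kubectl create secret generic <secret-name> -n kafka-mirror \
#     --from-file=listener.key --from-file=listener.crt
```

With a certificate like this, clients that trust the Cluster1 CA can verify the brokers in the second cluster without disabling hostname verification, provided every hostname they connect to is in the SAN list.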

vperi1730 commented 4 years ago

OK. Could you please share some examples of creating the certificates manually? I am not completely sure about this.

2) Also, when you say "using the CA", is that the Cluster CA, the Clients CA, or both?

scholzj commented 4 years ago

This is just about the Cluster CA. Only the original option 1 could be applied to both CAs. The rest is just cluster certificates and the Cluster CA. I do not have any examples; I would need to figure it out from scratch, sorry. But there are normally a lot of guides out there, so you just need to google for them.

vperi1730 commented 4 years ago

Hi Scholzj,

Here is the final attempt I made with Option 1.

I have verified the secrets in Cluster2 against Cluster1 and they are identical (both Cluster and Clients CA). As the next step, I created a truststore in Cluster2 and a keystore for an existing KafkaUser and tried the internal bootstrap on 9093. Authentication failed with "SSL handshake failed":

[kafka@mm-backup-cluster-kafka-0 kafka]$ ./bin/kafka-console-producer.sh --broker-list mm-backup-cluster-kafka-bootstrap:9093 --topic mm-src-cluster.mm2-topic \
> --producer-property security.protocol=SSL \
> --producer-property ssl.truststore.type=PKCS12 \
> --producer-property ssl.keystore.type=PKCS12 \
> --producer-property ssl.truststore.password=123456 \
> --producer-property ssl.keystore.password=123456 \
> --producer-property ssl.truststore.location=/tmp/certs/cluster.truststore.p12 \
> --producer-property ssl.keystore.location=/tmp/certs/producer.keystore.p12
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
>[2020-07-07 17:44:57,940] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.240.142:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2020-07-07 17:44:58,052] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.240.142:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2020-07-07 17:44:58,465] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.240.142:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2020-07-07 17:44:59,631] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster-kafka-bootstrap/172.30.240.142:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)

Am I missing anything? Need help here.

scholzj commented 4 years ago

You should pass -Djavax.net.debug=ssl to the client to give us more details about the TLS error. You will probably need to pass it through KAFKA_OPTS or something similar. That might give us a better idea of what the problem is.
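
A minimal sketch of passing the flag: the Kafka command line scripts hand the `KAFKA_OPTS` environment variable to the JVM, so exporting it in the same shell before re-running the producer should be enough. Appending to any existing value is just a precaution.

```shell
# Enable JSSE handshake debugging for Kafka CLI tools started from this shell.
# Any options already present in KAFKA_OPTS are kept.
export KAFKA_OPTS="${KAFKA_OPTS:-} -Djavax.net.debug=ssl"
echo "KAFKA_OPTS=$KAFKA_OPTS"

# Then re-run the same kafka-console-producer.sh command as before and capture
# the (very verbose) output, for example by redirecting stdout and stderr to a file.
```

The debug output lists both the certificates loaded from the truststore ("adding as trusted cert") and the certificate chain presented by the broker, which is what makes a mismatch visible.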

vperi1730 commented 4 years ago

mm2-ca-signature.log

I am attaching the log. Please check.

scholzj commented 4 years ago

This is the certificate the cluster is presenting:

[
  Version: V1
  Subject: CN=cluster-ca v0, O=io.strimzi
  Signature Algorithm: SHA256withRSA, OID = 1.2.840.113549.1.1.11

  Key:  Sun RSA public key, 2048 bits
  modulus: 25406301712410923036675338676757543195919330417596353703913256585004372099185158325117683495451411177090770129887844715205227580065475653509678997312219048621758486950003698422873345884970826731120046642388290812124023903606076120808972886578085029813196653178667862497188205853017309696660124402742770025593522312009200177480812407161212917687124422838813246366651439971768609428489197058300420820128933521895652485400184366491146139884750038340342563828707111664730357374382347210666621038165961971427671413400643038365383297901372443633959578223945691821033378600476207490785806733851949616755128037371166626126887
  public exponent: 65537
  Validity: [From: Tue Jul 07 14:51:14 UTC 2020,
               To: Wed Jul 07 14:51:14 UTC 2021]
  Issuer: CN=cluster-ca v0, O=io.strimzi
  SerialNumber: [    ef6c62a8 80e57131]

]

Whereas this is the one you have in the truststore:

adding as trusted cert:
  Subject: CN=cluster-ca v0, O=io.strimzi
  Issuer:  CN=cluster-ca v0, O=io.strimzi
  Algorithm: RSA; Serial number: 0x9bf70fcb11361cd0
  Valid from Thu May 28 05:45:47 UTC 2020 until Fri May 28 05:45:47 UTC 2021

The serial numbers and validity dates are different, so these are not the same certificate. I assume you either have the wrong truststore or you incorrectly copied and set up the custom CA on the second cluster. You have to proceed exactly as in the docs - including the exact keys in the secret, the exact labels, disabling the generation of the CA in the Kafka CR etc.
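
One way to check this without reading debug logs is to compare certificate fingerprints. The sketch below generates two throwaway self-signed certificates with the same subject, mimicking the situation here where both certificates are named "cluster-ca v0", and shows that the fingerprint still tells them apart. The file names and the `fingerprint` helper are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch only: same subject does not mean same certificate; compare fingerprints.
set -e
workdir="$(mktemp -d)"
cd "$workdir"

# Two independent self-signed certificates with an identical subject.
for name in presented trusted; do
  openssl req -x509 -newkey rsa:2048 -nodes -keyout "$name.key" -out "$name.crt" \
    -subj "/CN=cluster-ca v0/O=io.strimzi" -days 365 2>/dev/null
done

# Print the SHA-256 fingerprint of a PEM certificate file.
fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha256
}

# Subject, serial number and validity dates, for a human-readable comparison.
openssl x509 -in presented.crt -noout -subject -serial -dates
openssl x509 -in trusted.crt -noout -subject -serial -dates

if [ "$(fingerprint presented.crt)" = "$(fingerprint trusted.crt)" ]; then
  result="same certificate"
else
  result="different certificates"
fi
echo "$result"

# With real inputs, trusted.crt would be the ca.crt extracted from the
# <cluster>-cluster-ca-cert secret, and the presented certificates could be
# inspected with: openssl s_client -connect <bootstrap-host>:9093 -showcerts
```

If the CA in the truststore and the CA that signed the broker certificates have different fingerprints, the handshake will fail even though the subject names match.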

vperi1730 commented 4 years ago

Hi Scholzj,

Just to double-check, can you please look at the following steps I ran and see whether you can catch any wrong step?

//Copied the secrets from Src to Backup cluster
kubectl get secret mm-src-cluster-clients-ca --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror -f -

  kubectl get secret mm-src-cluster-clients-ca-cert --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror -f -

   kubectl get secret mm-src-cluster-cluster-ca --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror -f -

   kubectl get secret mm-src-cluster-cluster-ca-cert --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror -f -

//Getting the crt and key.
1) kubectl get secrets -n kafka-mirror mm-src-cluster-cluster-ca-cert -o jsonpath='{.data.ca\.crt}' | base64 -id > src-cluster-ca.crt

1 a)  kubectl get secrets -n kafka-mirror mm-src-cluster-cluster-ca -o jsonpath='{.data.ca\.key}' | base64 -id > src-cluster-ca.key

2)  kubectl get secrets -n kafka-mirror mm-src-cluster-clients-ca-cert -o jsonpath='{.data.ca\.crt}' | base64 -id > src-client-ca.crt

2 a)  kubectl get secrets -n kafka-mirror mm-src-cluster-clients-ca -o jsonpath='{.data.ca\.key}' | base64 -id > src-client-ca.key

//Secrets Creation.
kubectl create secret generic mm-backup-cluster-cluster-ca-cert -n kafka-mirror --from-file=ca.crt=src-cluster-ca.crt

kubectl create secret generic mm-backup-cluster-cluster-ca -n kafka-mirror --from-file=ca.key=src-cluster-ca.key

kubectl create secret generic mm-backup-cluster-clients-ca-cert -n kafka-mirror --from-file=ca.crt=src-client-ca.crt

kubectl create secret generic mm-backup-cluster-cluster-ca-cert -n kafka-mirror --from-file=ca.key=src-client-ca.key

//Labelling the secrets for Cluster and Clients CA.
kubectl label secret mm-backup-cluster-clients-ca  -n kafka-mirror strimzi.io/kind=Kafka strimzi.io/cluster=mm-backup-cluster
kubectl label secret mm-backup-cluster-clients-ca-cert -n kafka-mirror strimzi.io/kind=Kafka strimzi.io/cluster=mm-backup-cluster
kubectl label secret mm-backup-cluster-cluster-ca  -n kafka-mirror strimzi.io/kind=Kafka strimzi.io/cluster=mm-backup-cluster
kubectl label secret mm-backup-cluster-cluster-ca-cert  -n kafka-mirror strimzi.io/kind=Kafka strimzi.io/cluster=mm-backup-cluster

kind: Kafka
metadata:
  name: mm-backup-cluster
spec:
  clusterCa:
    generateCertificateAuthority: false
  clientsCa:
    generateCertificateAuthority: false

vperi1730 commented 4 years ago

Also, after creating the secrets and labels and setting the flags to false in the Kafka CR, I noticed that ZooKeeper is down.

Any clue where it went wrong?

status:
  conditions:
  - lastTransitionTime: 2020-07-08T05:42:39+0000
    message: Failed to connect to Zookeeper mm-backup-cluster-zookeeper-0.mm-backup-cluster-zookeeper-nodes.kafka-mirror.svc:2181,mm-backup-cluster-zookeeper-1.mm-backup-cluster-zookeeper-nodes.kafka-mirror.svc:2181,mm-backup-cluster-zookeeper-2.mm-backup-cluster-zookeeper-nodes.kafka-mirror.svc:2181.
      Connection was not ready in 300000 ms.
    reason: ZookeeperScalingException
    status: "True"
    type: NotReady
  observedGeneration: 6

scholzj commented 4 years ago

Are you doing it on existing cluster? Or is it a new freshly setup cluster?

vperi1730 commented 4 years ago

It is an existing cluster. I mean, I have Cluster1 and Cluster2, and I am following this procedure in Cluster2.

So I am currently re-deploying the cluster with some minor changes, shown below. The reason is that we need the truststore to be identical across both clusters, but not the Clients CA or its certificate. So I am leaving the Clients CA generation to Strimzi, but for the Cluster CA I will follow the procedure again for creating and labeling the secrets.

Does this make sense, Scholzj?

kind: Kafka
metadata:
  name: mm-backup-cluster
spec:
  clusterCa:
    generateCertificateAuthority: false
  clientsCa:
    generateCertificateAuthority: true

scholzj commented 4 years ago

I'm not sure anyone ever tried switching an existing cluster to a custom CA. So I think cluster 2, where you reuse the certificate, should ideally be freshly set up.

vperi1730 commented 4 years ago

OK, yes, I had the same doubt, so I am setting up cluster 2 from scratch. I will check and get back shortly.

vperi1730 commented 4 years ago

Yes, it looks like it has worked. I have re-deployed a new cluster 2 with the Cluster CA copied from Cluster1 and the Clients CA generated by Strimzi. In this case, the truststores of both clusters are identical, and when I passed the truststore to my producer and consumer shell scripts along with the user keystore, I was able to send and receive messages successfully on a particular topic.

[kafka@mm-backup-cluster-kafka-0 kafka]$ ./bin/kafka-console-consumer.sh --bootstrap-server mm-backup-cluster-kafka-bootstrap:9093 --topic mm-src-cluster.mm2-topic \
> --consumer-property security.protocol=SSL \
> --consumer-property ssl.truststore.type=PKCS12 \
> --consumer-property ssl.keystore.type=PKCS12 \
> --consumer-property ssl.truststore.password=123456 \
> --consumer-property ssl.keystore.password=123456 \
> --consumer-property group.id=mm-backup-consumer-grp \
> --consumer-property ssl.truststore.location=/tmp/certs/cluster.truststore.p12 \
> --consumer-property ssl.keystore.location=/tmp/certs/mm02.producer.keystore.p12  --from-beginning
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
ok
welcome2

Here cluster.truststore.p12 is pointing to the Cluster1 CA which I have copied as part of the procedure into Cluster 2.

Any comments from your end Scholzj to make sure I have not missed anything?

scholzj commented 4 years ago

Looks good to me. Glad it finally worked. Just remember to set a calendar reminder that you will need to update the CA in one year when it expires.

vperi1730 commented 4 years ago

Sure, thank you so much for the support. I will get back if I face any other issues while testing.

For now, closing the ticket.

vperi1730 commented 4 years ago

Along similar lines: when we export a KafkaUser and its secret from Cluster1 to Cluster2 as shown below, do we need to take care of any labeling? I ask because, after the export, I created a keystore and it failed with an SSL handshake error.

kubectl get secret mm-consumer-user --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror-tgt -f -

kubectl get kafkauser mm-consumer-user --namespace=kafka-mirror-src --export -o yaml |\
   kubectl apply --namespace=kafka-mirror-tgt  -f -

./bin/kafka-console-producer.sh --broker-list mm-backup-cluster-kafka-bootstrap:9093 --topic mm-src-cluster.mm2-july8-topic. \
--producer-property security.protocol=SSL \
--producer-property ssl.truststore.type=PKCS12 \
--producer-property ssl.keystore.type=PKCS12 \
--producer-property ssl.truststore.password=123456 \
--producer-property ssl.keystore.password=123456 \
--producer-property ssl.truststore.location=/tmp/certs/cluster.truststore.p12 \
--producer-property ssl.keystore.location=/tmp/certs/mmconuser.producer.keystore.p12

Here mmconuser.producer.keystore.p12 is the user exported with the above kubectl commands.

scholzj commented 4 years ago

That does not work like this for TLS certificates - only for the SASL SCRAM-SHA-512 authentication. The broker trusts the Client CA - not the client certificates. So when you copy the user secret like this, it will not make the same keystore work on both clusters. To be able to do this, you would need to copy and reuse the Clients CA in the same way you reused the Cluster CA.

Sorry if any of my comments misled you to think this would work.

vperi1730 commented 4 years ago

OK, that makes sense to me. So do I need to set up Cluster2 again with the type changed from tls to scram-sha-512, or can I just update the cluster with this change and test with a username and password?

scholzj commented 4 years ago

You definitely don't need to set up the cluster again to change from TLS to SCRAM. You can just change it in the Kafka CR. But using SCRAM obviously impacts all your clients etc.

You would need a new setup only if you decide to copy the Clients CA from the first cluster to keep using TLS with the same certificates.

vperi1730 commented 4 years ago

Understood. Let me build a new setup called Cluster3 and try there; I don't want to disturb Cluster2.

vperi1730 commented 4 years ago

I did the above exercise of creating a new setup, where I set both parameters to false in the Kafka CR and provided the needed secrets up front. With all those changes, I created the truststore and keystore, and it worked.

I hope this is the correct approach to take.

[kafka@mm-backup-cluster3-kafka-0 kafka]$ ./bin/kafka-console-producer.sh --broker-list mm-backup-cluster3-kafka-bootstrap:9093 --topic mm2-july8-topic \
> --producer-property security.protocol=SSL \
> --producer-property ssl.truststore.type=PKCS12 \
> --producer-property ssl.keystore.type=PKCS12 \
> --producer-property ssl.truststore.password=123456 \
> --producer-property ssl.keystore.password=123456 \
> --producer-property ssl.truststore.location=/tmp/certs/cluster.truststore.p12 \
> --producer-property ssl.keystore.location=/tmp/certs/mmconusertgt1.producer.keystore.p12
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
>so
>so1
>so2
>so3
>so4
>so5
so>6
>so7
>so9
[kafka@mm-backup-cluster3-kafka-0 kafka]$ ./bin/kafka-console-consumer.sh --bootstrap-server mm-backup-cluster3-kafka-bootstrap:9093 --topic mm2-july8-topic \
> --consumer-property security.protocol=SSL \
> --consumer-property ssl.truststore.type=PKCS12 \
> --consumer-property ssl.keystore.type=PKCS12 \
> --consumer-property ssl.truststore.password=123456 \
> --consumer-property ssl.keystore.password=123456 \
> --consumer-property group.id=mm-backup-consumer-grp \
> --consumer-property ssl.truststore.location=/tmp/certs/cluster.truststore.p12 \
> --consumer-property ssl.keystore.location=/tmp/certs/mmconusertgt1.producer.keystore.p12  --from-beginning
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
so
so3
so7
so1
so5
so6
so9
so2
so4

scholzj commented 4 years ago

Looks and sounds good to me.

vperi1730 commented 4 years ago

OK, awesome. I started testing with the SCRAM-SHA-512 auth mechanism. One of the changes I made in the Kafka CR was to add the following under plain, and I updated the cluster which has CA generation set to false for both clusterCa and clientsCa.

listeners:
         plain: 
           authentication:
             type: scram-sha-512

After that, I exported one of the SCRAM-SHA-512 users from Cluster1 to Cluster2, decoded its password, and placed it inside the producer.properties file as below.

security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="producer-sha-user" \
  password="decodedpwd";

The decoded password comes from:

echo "encoded password of the user from the secret" | base64 --decode

I am seeing an invalid credentials error. Need help.

[kafka@mm-backup-cluster3-kafka-0 kafka]$ ./bin/kafka-console-producer.sh --broker-list mm-backup-cluster3-kafka-bootstrap:9092 --producer.config /tmp/producer.properties --topic mm2-july8-topic
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N

[2020-07-08 13:40:27,830] ERROR [Producer clientId=console-producer] Connection to node -1 (mm-backup-cluster3-kafka-bootstrap/172.30.167.133:9092) failed authentication due to: Authentication failed during authentication due to invalid credentials with SASL mechanism SCRAM-SHA-512 (org.apache.kafka.clients.NetworkClient)

scholzj commented 4 years ago

For password credentials, the password is actually stored in ZooKeeper. So you will need to first create the user on one cluster, wait for the secret, copy the secret, and then create the KafkaUser in the other cluster to make sure the secrets on both clusters are fully in sync.

vperi1730 commented 4 years ago

Interesting approach. Let me document all these ways to make it easy. I will try this, Scholzj, thank you.

vperi1730 commented 4 years ago

It worked. I have documented all of these for my reference.

scholzj commented 4 years ago

Great!