confluentinc / confluent-kafka-python

Confluent's Kafka Python Client
http://docs.confluent.io/current/clients/confluent-kafka-python

SSL Handshake failed with 2.0.0+ version but works <2.0.0 #1731

Closed naoko closed 2 months ago

naoko commented 2 months ago

Description

Getting this error with confluent-kafka>=2.0.0, but it works with confluent-kafka<2.0.0:

SSL handshake failed: error:0A000086:SSL routines::certificate verify failed: broker certificate could not be verified, verify that ssl.ca.location is correctly configured or root CA certificates are installed (brew install openssl) (after 92ms in state SSL_HANDSHAKE, 126 identical error(s) suppressed)

I ran openssl s_client -connect kafka-url:9096 -showcerts, which works fine and outputs:

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 5517 bytes and written 509 bytes
Verification: OK
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES256-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-SHA384
    Session-ID: 41B7AB125271425D018112F95E302B67D3079BF7C4508F089FACFDAF0CEC2077
    Session-ID-ctx: 
    Master-Key: AFB1BF9B842443201B94A1EDE8BE8CE54F2E0EFFDD870B9CEDA11AA2B96465F0B8ADFD83716DC9E2D08858BCD4F1CF40
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1713558577
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: yes
---

I've added "ssl.ca.location": certifi.where() to the configuration, but to no avail. What changed between 2.0.0 and prior versions in terms of the SSL connection?

Thank you


pranavrth commented 2 months ago

What changed between 2.0.0 and prior versions in terms of the SSL connection?

The client moved to OpenSSL 3.0.x in v2.0.0. Old legacy ciphers are deprecated by default in OpenSSL 3. If you are using one of the old ciphers, it is highly recommended that you migrate to stronger ciphers. But if you still want to use an old cipher, set 'ssl.providers': 'legacy,default'. Please refer to the librdkafka v2.0.0 release notes for more information.
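A minimal sketch of what that workaround might look like in a client configuration (the broker address is a placeholder; `ssl.providers` requires librdkafka >= 2.0.0):

```python
# Hypothetical client config illustrating the workaround above:
# re-enable OpenSSL 3.x's "legacy" provider alongside the default one
# so deprecated ciphers keep working.
conf = {
    "bootstrap.servers": "kafka-url:9096",  # placeholder broker address
    "security.protocol": "SSL",
    # Load both the legacy and default OpenSSL providers:
    "ssl.providers": "legacy,default",
}
print(conf["ssl.providers"])
```

This dict can then be passed to, e.g., confluent_kafka.Producer(conf).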

NaokoReeves-BO commented 2 months ago

@pranavrth thank you very much for the information I had overlooked.

However, I am still trying to figure out what needs to be changed.

When I change it to 'ssl.providers': 'legacy,default', the error becomes Failed to load OpenSSL provider "legacy", as I don't have a "legacy" provider:

openssl list -providers
Providers:
  default
    name: OpenSSL Default Provider
    version: 3.3.0
    status: active

and running openssl s_client -connect ${kafka_ssl_url} is successful:

No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 5517 bytes and written 509 bytes
Verification: OK
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES256-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2

So that means the server side works with the OpenSSL default provider 3.3.0, correct?

pranavrth commented 2 months ago

So that means the server side works with the OpenSSL default provider 3.3.0, correct?

I think yes, but the Python client uses a statically linked OpenSSL, which is v3.0.11 right now:

%7|1713562931.990|OPENSSL|rdkafka#producer-1| [thrd:app]: Using statically linked OpenSSL version OpenSSL 3.0.11 19 Sep 2023 (0x300000b0, librdkafka built with 0x300000b0)

I was going through the change logs from 3.0.x to 3.3.x and didn't find any major change to the default provider, so it should work with the default provider of v3.0.11 as well.

Can you check the default CA location for the OpenSSL 3.3.0 you have installed and use that path in the ssl.ca.location field? openssl version -a might be helpful here.

NaokoReeves-BO commented 2 months ago
openssl version -a
OpenSSL 3.3.0 9 Apr 2024 (Library: OpenSSL 3.3.0 9 Apr 2024)
built on: Tue Apr  9 12:12:22 2024 UTC
platform: darwin64-arm64-cc
options:  bn(64,64)
compiler: clang -fPIC -arch arm64 -O3 -Wall -DL_ENDIAN -DOPENSSL_PIC -D_REENTRANT -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
OPENSSLDIR: "/opt/homebrew/etc/openssl@3"

and this matches:

import ssl
ssl.get_default_verify_paths().openssl_cafile
'/opt/homebrew/etc/openssl@3/cert.pem'

So I set "ssl.ca.location": "/opt/homebrew/etc/openssl@3/cert.pem" in the config.
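The resulting AdminClient configuration might look something like this (a sketch only; the broker URL and SASL credentials are placeholders, and the CA path is resolved from the local OpenSSL install as above):

```python
import ssl

# Resolve the CA bundle the local OpenSSL build trusts by default,
# e.g. /opt/homebrew/etc/openssl@3/cert.pem on this machine.
ca_file = ssl.get_default_verify_paths().openssl_cafile

# Sketch of the client config; broker URL and credentials are placeholders.
# Pass this dict to confluent_kafka.admin.AdminClient(conf) to create topics.
conf = {
    "bootstrap.servers": "msk-url:9096",   # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "SCRAM-SHA-512",
    "sasl.username": "user",               # placeholder
    "sasl.password": "secret",             # placeholder
    "ssl.ca.location": ca_file,
}
print(conf["ssl.ca.location"])
```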

I also verified that this cert is valid:

openssl x509 -in /opt/homebrew/etc/openssl@3/cert.pem -text -noout | grep Validity -A 2

        Validity
            Not Before: Aug  1 18:04:51 2019 GMT
            Not After : Aug  3 18:04:51 2029 GMT

Then I connected to our Kafka server and the handshake was successful:

openssl s_client -connect ${url} -CAfile /opt/homebrew/etc/openssl@3/cert.pem

No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 5517 bytes and written 509 bytes
Verification: OK

but an attempt to connect via AdminClient to create a topic resulted in:

%7|1714002782.291|SASL|rdkafka#producer-1| [thrd:app]: Selected provider SCRAM (builtin) for SASL mechanism SCRAM-SHA-512
%7|1714002782.291|OPENSSL|rdkafka#producer-1| [thrd:app]: Using statically linked OpenSSL version OpenSSL 3.0.11 19 Sep 2023 (0x300000b0, librdkafka built with 0x300000b0)
%7|1714002782.292|SSL|rdkafka#producer-1| [thrd:app]: Loading CA certificate(s) from file /opt/homebrew/etc/openssl@3/cert.pem
%7|1714002782.305|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1714002782.305|BROKER|rdkafka#producer-1| [thrd:app]: sasl_ssl://${msk_url}:9096/bootstrap: Added new broker with NodeId -1
%7|1714002782.305|BRKMAIN|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Enter main broker thread
%7|1714002782.305|CONNECT|rdkafka#producer-1| [thrd:app]: sasl_ssl://${msk_url}:9096/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1714002782.305|INIT|rdkafka#producer-1| [thrd:app]: librdkafka v2.3.0 (0x20300ff) rdkafka#producer-1 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer,http,oidc, STRIP STATIC_LINKING GCC GXX PKGCONFIG OSXLD LIBDL PLUGINS ZLIB SSL SASL_CYRUS ZSTD CURL HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER OAUTHBEARER_OIDC, debug 0x282)
%7|1714002782.305|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Received CONNECT op
%7|1714002782.305|STATE|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1714002782.305|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: broker in state TRY_CONNECT connecting
%7|1714002782.305|STATE|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1714002782.305|CONNECT|rdkafka#producer-1| [thrd:main]: Not selecting any broker for cluster connection: still suppressed for 49ms: lookup controller
%7|1714002782.421|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Connecting to ipv4#10.0.17.71:9096 (sasl_ssl) with socket 12
%7|1714002782.533|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Connected to ipv4#10.0.17.71:9096
%7|1714002782.533|STATE|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1714002782.533|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: lookup controller
%7|1714002782.533|ENDPOINT|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: Enabled endpoint identification using hostname ${msk_url}
%7|1714002782.644|FAIL|rdkafka#producer-1| [thrd:sasl_ssl://${msk_url}]: sasl_ssl://${msk_url}:9096/bootstrap: SSL handshake failed: ssl/statem/statem_clnt.c:1890:tls_post_process_server_certificate error:0A000086:SSL routines::certificate verify failed: broker certificate could not be verified, verify that ssl.ca.location is correctly configured or root CA certificates are installed (brew install openssl) (after 110ms in state SSL_HANDSHAKE) (_SSL)

Any other ideas as to where I can look?

pranavrth commented 2 months ago

Can you try disabling hostname verification? Use "ssl.endpoint.identification.algorithm": "none"

NaokoReeves-BO commented 2 months ago

@pranavrth Thank you so much! Disabling hostname verification did it. Perhaps OpenSSL 3.x does stricter checking on hostname matching? We will check the certs to make sure they are OpenSSL 3.x compatible. Thank you again for all your patience and wisdom 🙏

pranavrth commented 2 months ago

Thank you for confirming.

Just to inform you: the Python client (and all librdkafka-based clients) has hostname verification enabled by default since v2.0.0. Hostname verification was disabled by default before v2.0.0.