confluentinc / confluent-kafka-python

Confluent's Kafka Python Client
http://docs.confluent.io/current/clients/confluent-kafka-python

SSL communication failure and segmentation fault in version 2.0.2 and above #1690

Open Vikash08Mishra opened 6 months ago

Vikash08Mishra commented 6 months ago

Description

I am facing an issue when trying to communicate with Kafka over SSL via the Admin Client. Configuration: {'bootstrap.servers': 'X.X.X.X:X', 'security.protocol': 'ssl', 'ssl.ca.location': 'ca-cert-path'}
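For anyone trying to reproduce this, the configuration above could be wired into the Admin Client roughly as follows. This is only a sketch: the helper names `build_ssl_config` and `make_admin_client` are hypothetical, the broker address and CA path are the redacted placeholders from the report, and the optional `debug` value is an assumption rather than something stated in the report.

```python
from typing import Dict, Optional


def build_ssl_config(bootstrap: str, ca_path: str,
                     debug: Optional[str] = None) -> Dict[str, str]:
    """Return a librdkafka-style config dict for a TLS connection.

    The three base keys are the ones quoted in the report; `debug` is an
    optional extra (assumption) to get verbose client logs.
    """
    conf = {
        "bootstrap.servers": bootstrap,
        "security.protocol": "ssl",
        "ssl.ca.location": ca_path,
    }
    if debug:
        conf["debug"] = debug
    return conf


def make_admin_client(conf: Dict[str, str]):
    """Create the Admin Client; this is the call the report says fails."""
    # Imported lazily so the config can be built and inspected even on a
    # machine where confluent-kafka-python is not installed.
    from confluent_kafka.admin import AdminClient
    return AdminClient(conf)
```

A typical call would be `make_admin_client(build_ssl_config("X.X.X.X:X", "ca-cert-path", debug="broker,admin"))`, followed by `list_topics(timeout=10)` on the returned client to force a connection attempt.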

confluent-kafka-python version 1.9.2 works perfectly, but the same setup breaks when I upgrade to any higher version: 2.0.2, 2.1.1, 2.2.0 or 2.3.0. It is worth noting that confluent-kafka-dotnet versions 2.1.1, 2.2.0 and 2.3.0 work perfectly with the exact same configuration and certificate.

%3|1702380036.110|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE)

Debug logs: the client reports that the broker did not provide a certificate, yet the same setup works with confluent-kafka-python 1.9.2 and with each confluent-kafka-dotnet version mentioned above. I have replaced the actual broker IPs with the keyword broker_ip in the debug logs below.


%7|1702383547.987|BROKER|rdkafka#producer-1| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Added new broker with NodeId -1
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1702383547.987|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1702383547.987|BRKMAIN|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Enter main broker thread
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Received CONNECT op
%7|1702383547.987|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: broker in state TRY_CONNECT connecting
%7|1702383547.987|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1702383547.987|INIT|rdkafka#producer-1| [thrd:app]: librdkafka v2.3.0 (0x20300ff) rdkafka#producer-1 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer,http,oidc, STRIP STATIC_LINKING GCC GXX PKGCONFIG INSTALL GNULD LIBDL PLUGINS ZLIB SSL ZSTD CURL HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER OAUTHBEARER_OIDC CRC32C_HW, debug 0x4002)
%7|1702383547.988|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connecting to ipv4#broker-ip:9096 (ssl) with socket 15
%7|1702383547.988|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connected to ipv4#broker-ip:9096
%7|1702383547.988|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1702383547.994|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE) (_SSL)
%3|1702383547.994|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE)
%7|1702383547.994|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state SSL_HANDSHAKE -> DOWN
%7|1702383547.994|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state DOWN -> INIT
%7|1702383548.012|BROKER|rdkafka#producer-2| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Added new broker with NodeId -1
%7|1702383548.012|BRKMAIN|rdkafka#producer-2| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1702383548.012|BRKMAIN|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Enter main broker thread
%7|1702383548.012|CONNECT|rdkafka#producer-2| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Received CONNECT op
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: broker in state TRY_CONNECT connecting
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1702383548.013|INIT|rdkafka#producer-2| [thrd:app]: librdkafka v2.3.0 (0x20300ff) rdkafka#producer-2 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer,http,oidc, STRIP STATIC_LINKING GCC GXX PKGCONFIG INSTALL GNULD LIBDL PLUGINS ZLIB SSL ZSTD CURL HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER OAUTHBEARER_OIDC CRC32C_HW, debug 0x4002)
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connecting to ipv4#broker_ip:9096 (ssl) with socket 19
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connected to ipv4#broker_ip:9096
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1702383548.014|CREATETOPICS|rdkafka#producer-2| [thrd:main]: CREATETOPICS worker called in state initializing: Success
%7|1702383548.014|ADMIN|rdkafka#producer-2| [thrd:main]: CREATETOPICS: looking up controller
%7|1702383548.014|CONNECT|rdkafka#producer-2| [thrd:main]: Not selecting any broker for cluster connection: still suppressed for 48ms: lookup controller
%7|1702383548.018|FAIL|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 5ms in state SSL_HANDSHAKE) (_SSL)
%3|1702383548.018|FAIL|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 5ms in state SSL_HANDSHAKE)
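
To pick the real errors out of a debug dump like the one above, the log lines can be split on their `%level|timestamp|facility|client| [thrd:...]: message` layout. The sketch below is inferred purely from the lines pasted in this issue, not from a documented log grammar, and the names `parse_line` and `errors` are hypothetical. It assumes each record sits on a single line (wrapped lines must be joined first) and treats the leading number as a syslog severity, where 3 is an error and 7 is debug.

```python
import re
from typing import Dict, Iterable, List, Optional

# Layout inferred from the pasted logs:
#   %<level>|<epoch.millis>|<FACILITY>|<client-name>| [thrd:<thread>]: <message>
LOG_RE = re.compile(
    r"^%(?P<level>\d)\|(?P<ts>\d+\.\d+)\|(?P<facility>[A-Z0-9_]+)\|"
    r"(?P<client>[^|]+)\|\s*\[thrd:(?P<thread>.*?)\]: (?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    rec: Dict[str, object] = dict(m.groupdict())
    rec["level"] = int(m.group("level"))
    rec["ts"] = float(m.group("ts"))
    return rec


def errors(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return parsed records whose severity is error (3) or more severe."""
    parsed = (parse_line(line) for line in lines)
    return [rec for rec in parsed if rec is not None and rec["level"] <= 3]
```

Run over the dump above, `errors(...)` would keep only the two `%3|...|FAIL|` records, one per client instance.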

I do not suspect the OpenSSL issue mentioned in https://github.com/confluentinc/confluent-kafka-python/issues/1521, because the cipher in use is TLS_AES_256_GCM_SHA384, so I don't think this is a weak-cipher issue. Also, confluent-kafka-dotnet versions 2.1.1, 2.2.0 and 2.3.0 use the same librdkafka, which has OpenSSL >3.0, and it works fine there with the same certificate. Just to rule it out, I tried setting ssl.providers=default,legacy, but then I hit a segmentation fault with every confluent-kafka-python version >=2.0.2:

Python error: Segmentation fault
Current thread 0x00007fdd9b69df00 (most recent call first):
  File "/opt/bitnami/python/lib/python3.8/site-packages/confluent_kafka/admin/__init__.py", line 122 in __init__
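
Because a segmentation fault kills the whole interpreter, one way to test several configurations in a row is to run each attempt in a child process and look at its exit status. The following is a sketch under assumptions: `run_isolated` is a hypothetical helper, and the check relies on POSIX behaviour, where a child killed by a signal reports a negative return code. The `-X faulthandler` option makes the child print a traceback like the one above.

```python
import signal
import subprocess
import sys
from typing import Tuple


def run_isolated(code: str, timeout: float = 30.0) -> Tuple[int, bool]:
    """Run `code` in a fresh interpreter and report whether it segfaulted.

    Returns (returncode, segfaulted). On POSIX, a process terminated by
    SIGSEGV has returncode == -signal.SIGSEGV.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    segfaulted = proc.returncode == -signal.SIGSEGV
    return proc.returncode, segfaulted
```

The `code` string would be the failing snippet itself, for example the Admin Client construction with `ssl.providers` set to `default,legacy`; the parent process survives either way and can move on to the next variant.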

I saw the related issue https://github.com/confluentinc/confluent-kafka-python/issues/1547, where it is mentioned that this is fixed, but I still see the same segmentation fault.

Any help is highly appreciated.

How to reproduce

The issue exists only in confluent-kafka-python >=2.0.2. The same setup works fine with confluent-kafka-python 1.9.2 and with confluent-kafka-dotnet versions >=2.1.1.
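
Since the behaviour depends on both the client version and (per later comments) the Python version, a small environment report can make reproductions easier to compare. This is a sketch; `environment_report` is a hypothetical helper. Note that `ssl.OPENSSL_VERSION` is the OpenSSL that Python's own `ssl` module was built against, which is not necessarily the OpenSSL that librdkafka uses.

```python
import platform
import ssl
from typing import Dict, Optional


def environment_report() -> Dict[str, Optional[str]]:
    """Collect interpreter and client versions relevant to this issue."""
    report: Dict[str, Optional[str]] = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        # OpenSSL used by Python's ssl module, NOT necessarily by librdkafka.
        "python_ssl_openssl": ssl.OPENSSL_VERSION,
        "confluent_kafka": None,
        "librdkafka": None,
    }
    try:
        import confluent_kafka
    except ImportError:
        return report  # client not installed; leave both entries as None
    # version() and libversion() each return a (string, int) tuple.
    report["confluent_kafka"] = confluent_kafka.version()[0]
    report["librdkafka"] = confluent_kafka.libversion()[0]
    return report
```

Printing the returned dict alongside the client configuration gives the version details needed to compare a failing and a working environment.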


arushi315 commented 5 months ago

Hi @pranavrth
Will you be able to provide some guidance on this?

WaxWell-Bison commented 5 months ago

I'm experiencing a similar issue on Python 3.8.13 when using 'security.protocol': 'SSL' in the producer configuration.

import confluent_kafka

producer = confluent_kafka.Producer({'security.protocol': 'SSL'})
print(producer)

No issue with Python 3.9.13, though.

arushi315 commented 5 months ago

Thanks @WaxWell-Bison for the suggestion. It works for us as well once we update the Python version; however, it only works when I use Python >= 3.12.0.

pranavrth commented 1 month ago

Is this issue still happening?