Open ThomasDangleterre opened 2 years ago
Hi, we've just run into the same problem. We are still analyzing whether it's proxy-related.
Did you resolve the issue? Do you have any hint for us?
Thanks Leif
Hello,
We observed that the issue was gone after a restart. Since /health does not meet our need to detect that Kafka message production is failing, we created a custom image with kafkacat and a script that produces messages (on a dedicated topic, always on partition 0) through the proxy and consumes them.
The script below is used in the livenessProbe of the kafka proxy's deployment, so an error triggers a restart.
#!/bin/sh
# Exit immediately if any command fails
set -e

# The timestamp makes each probe message unique
timestamp=$(date '+%s')
echo "Sending message to $EXTERNAL_IP:$PORT $LIVENESS_TOPIC"

# Produce the probe message on partition 0 of the liveness topic
echo "$HOSTNAME $timestamp" | kafkacat -P -b "$EXTERNAL_IP":"$PORT" -t "$LIVENESS_TOPIC" -p 0 \
    -X security.protocol=SSL \
    -X ssl.key.location=service.key \
    -X ssl.certificate.location=service.cert \
    -X ssl.ca.location=ca.pem

# Loop until the message we just produced is read back
while true; do
    # Consume the last message of the topic and exit
    payload=$(kafkacat -C -b "$HOST":"$PORT" -t "$LIVENESS_TOPIC" -o -1 -e \
        -X security.protocol=SSL \
        -X ssl.key.location=service.key \
        -X ssl.certificate.location=service.cert \
        -X ssl.ca.location=ca.pem)
    if [ "$payload" = "$HOSTNAME $timestamp" ]; then
        break
    fi
done
exit 0
We still get some transient errors that trigger restarts, but overall the connection is stable now. For example:
% ERROR: Local: All broker connections are down: 3/3 brokers are down : terminating
%3|1661178021.406|FAIL|rdkafka#producer-1| [thrd:ssl://10.48.28.151:12660/32]: ssl://10.48.28.151:12658/32: No further error information available (after 0ms in state SSL_HANDSHAKE)
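The consume loop in the script above has no upper bound and stops at the first kafkacat failure because of set -e. A minimal sketch of a bounded retry is shown here, assuming it is acceptable to treat a failed consume as a miss and try again. The names wait_for_payload and consume_last are hypothetical helpers, not part of kafkacat or kafka-proxy, and the attempt count and delay are values to tune.

```shell
#!/bin/sh
# Sketch only: wait_for_payload and consume_last are hypothetical names.

# Stand-in for the consume step of the script above: print the last
# message of the liveness topic, then exit.
consume_last() {
    kafkacat -C -b "$HOST":"$PORT" -t "$LIVENESS_TOPIC" -o -1 -e \
        -X security.protocol=SSL \
        -X ssl.key.location=service.key \
        -X ssl.certificate.location=service.cert \
        -X ssl.ca.location=ca.pem
}

# wait_for_payload EXPECTED MAX_ATTEMPTS DELAY_SECONDS
# Returns 0 as soon as consume_last prints EXPECTED,
# or 1 once MAX_ATTEMPTS tries have all missed.
wait_for_payload() {
    expected=$1
    max_attempts=$2
    delay=$3
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # A failed consume counts as a miss instead of aborting under set -e
        payload=$(consume_last) || payload=""
        if [ "$payload" = "$expected" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

With these definitions, the `while true` loop could be replaced by a call such as `wait_for_payload "$HOSTNAME $timestamp" 5 1`. Under set -e, a return value of 1 ends the script with a non-zero status, so the probe still fails when the message never arrives.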
Hello,
I have an issue while using kafka-proxy:
This error repeats itself without pushing data to the topic.
Here is my configuration:
We dynamically fetch our broker IPs and fail when we see 'i/o timeout' in the logs, in order to improve resilience.
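How the failure on 'i/o timeout' is wired up is not shown here. As an illustration only, one way to do it in shell is to filter the proxy's log stream and stop at the first matching line. The function name fail_on_io_timeout is hypothetical, and the sketch assumes the logs are available on standard input.

```shell
#!/bin/sh
# Sketch only: fail_on_io_timeout is a hypothetical helper.
# It copies log lines from stdin to stdout and returns 1 at the
# first line that contains "i/o timeout"; it returns 0 at end of input.
fail_on_io_timeout() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
        case $line in
            *"i/o timeout"*) return 1 ;;
        esac
    done
    return 0
}
```

A wrapper could pipe the proxy's output through this function and exit when it returns non-zero, leaving the restart to the orchestrator.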
Does anyone know why our producer can't produce data to the topic?