ClickHouse / clickhouse-kafka-connect

ClickHouse Kafka Connector
Apache License 2.0

Add logging for cases when Kafka authentication for Consumer / Producer is wrong #379

Closed vladislav-bakanov closed 2 months ago

vladislav-bakanov commented 5 months ago

Hi! While deploying Confluent Kafka Connect with this plugin on k8s, I ran into a problem with very unclear logging.

Stack:

During the deployment I got an error that appears to be about connectivity to ClickHouse:

(org.apache.kafka.connect.runtime.distributed.DistributedHerder) [KafkaBasedLog Work Thread - kafka-connect.configs]
java.util.concurrent.ExecutionException: com.clickhouse.client.ClickHouseException: Connect to https://my-cloud-endpoint.europe-west4.gcp.clickhouse.cloud:8443 [my-cloud-endpoint.europe-west4.gcp.clickhouse.cloud/IP] failed: Read timed out, server ClickHouseNode [uri=https://my-cloud-endpoint.europe-west4.gcp.clickhouse.cloud:8443/sync_test, options={sslmode=STRICT}]@418320778

I thought the problem was connectivity to ClickHouse, but it was actually caused by the Kafka Producer and Kafka Consumer settings.

The connector couldn't connect as a Kafka Consumer / Kafka Producer, and once I fixed that it was able to connect to the ClickHouse instance.

Could you please add proper logging for this case? It would have saved me at least several days of my life... Thank you in advance ❤️

Paultagoras commented 5 months ago

Hmm, that's interesting - I'll have to see if we can set that, we don't control the wider Connect framework though (where those settings are resolved)

vladislav-bakanov commented 5 months ago

If you need any additional context/help - don't hesitate to reach out to me

Paultagoras commented 5 months ago

Hi @vladislav-bakanov ! So oddly enough, when I mess around with the consumer/producer configs (but leave the connector property configured) it gives me the errors I'd expect. Do you happen to still have the logs from around that time? Or would it be possible to reproduce the issue and capture them?

vladislav-bakanov commented 5 months ago

@Paultagoras, I'd say the real problem is on the Kafka Connect side, because it gives the user no information about the consumer and producer configuration.

There are several environment variables that should be specified for Kafka Connect:

- CONNECT_SECURITY_PROTOCOL
- CONNECT_SASL_JAAS_CONFIG
- CONNECT_PRODUCER_SECURITY_PROTOCOL
- CONNECT_PRODUCER_SASL_MECHANISM
- CONNECT_PRODUCER_SASL_JAAS_CONFIG
- CONNECT_CONSUMER_SECURITY_PROTOCOL
- CONNECT_CONSUMER_SASL_MECHANISM
- CONNECT_CONSUMER_SASL_JAAS_CONFIG
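
For context, the Confluent Docker images are generally understood to turn `CONNECT_`-prefixed environment variables into worker properties by dropping the prefix, lowercasing the rest, and replacing underscores with dots. In the resulting worker config, the `producer.` and `consumer.` prefixes apply to the clients used by connector tasks. A minimal Python sketch of that naming convention follows; the function name is hypothetical, and the real image conversion handles more cases (such as double underscores) than this simplification does:

```python
def env_to_worker_property(name: str) -> str:
    """Map a CONNECT_-prefixed environment variable to a worker property name.

    Assumption: only the simple convention is modelled here
    (drop the prefix, lowercase, replace "_" with ".").
    """
    prefix = "CONNECT_"
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    return name[len(prefix):].lower().replace("_", ".")


if __name__ == "__main__":
    for var in (
        "CONNECT_SECURITY_PROTOCOL",
        "CONNECT_PRODUCER_SASL_JAAS_CONFIG",
        "CONNECT_CONSUMER_SASL_MECHANISM",
    ):
        print(var, "->", env_to_worker_property(var))
```

Under this convention, `CONNECT_SECURITY_PROTOCOL` only sets the worker-level `security.protocol`, while the consumer and producer need their own prefixed variables.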

At first I configured only:

- CONNECT_SECURITY_PROTOCOL
- CONNECT_SASL_JAAS_CONFIG
- CONNECT_PRODUCER_SECURITY_PROTOCOL

I then started to receive the error message mentioned above. After I specified the remaining connection properties (the ones specific to the Consumer and Producer), it started to work properly.
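
Since neither Kafka Connect nor the connector flagged the missing settings, a small pre-flight check run before starting the worker might have caught this earlier. The Python sketch below is not part of the connector or of Kafka Connect. The function name `missing_auth_settings` is hypothetical, and the list of variables is simply the one from this comment, assumed to be what a SASL-secured cluster like this one needs:

```python
import os
from typing import List, Mapping

# Assumption: these are the variables listed in this thread for a
# SASL-secured setup; other deployments may need a different set.
REQUIRED_AUTH_VARS = [
    "CONNECT_SECURITY_PROTOCOL",
    "CONNECT_SASL_JAAS_CONFIG",
    "CONNECT_PRODUCER_SECURITY_PROTOCOL",
    "CONNECT_PRODUCER_SASL_MECHANISM",
    "CONNECT_PRODUCER_SASL_JAAS_CONFIG",
    "CONNECT_CONSUMER_SECURITY_PROTOCOL",
    "CONNECT_CONSUMER_SASL_MECHANISM",
    "CONNECT_CONSUMER_SASL_JAAS_CONFIG",
]


def missing_auth_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_AUTH_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_auth_settings(os.environ)
    if missing:
        raise SystemExit("Missing Kafka auth settings: " + ", ".join(missing))
    print("All listed Kafka auth settings are present.")
```

With only the three variables from the initial setup defined, the check reports the remaining five producer and consumer settings as missing.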

What I meant by this long description: could there be a way to detect wrong connection properties in the Kafka Connect setup and have the ClickHouse driver raise an exception in that case, rather than the Kafka Connect service?

> Do you happen to still have the logs from around that time?

Unfortunately not, the logs have been erased, but I'll do my best to reproduce it again a little later.

Paultagoras commented 3 months ago

@vladislav-bakanov Unfortunately if this is related to the Kafka Connect configuration, it's out of our control - and in cloud environments (like Confluent Cloud or MSK) those environment values are actually obscured from us.

One thing I'd like to explore though - how did those properties prevent the connector from reaching ClickHouse? That ping failing looks like a valid error (though maybe that's what you mean, that Kafka Connect prevents it from working if it's improperly configured).

Paultagoras commented 2 months ago

Unfortunately this isn't something we can really log - it's part of the overall Kafka Connect framework (and whichever environment it lives in) so I'm going to close this. If that changes, I can reopen.