saranyaeu2987 closed this issue 3 years ago
I think there are some Prometheus metrics for connector and task startup failures. I guess these can be used?
I wonder if a connector with failed tasks should also have some condition noting it and not be just ready. WDYT @tombentley?
@scholzj it might make it slightly easier to detect failed tasks if there were a single condition to look out for, rather than having to iterate over a list. But I guess it would also need to fit nicely with https://github.com/strimzi/proposals/blob/master/007-restarting-kafka-connect-connectors-and-tasks.md, so it's clear whether a connector has failed but is being restarted (so no immediate action is necessary) vs. failed with either auto restart disabled or the max number of restarts exceeded (i.e. do something).
@tombentley @scholzj Why are the failed tasks not being restarted or removed? I am seeing data loss with failed tasks.
I also see failure due to "org.apache.kafka.connect.errors.ConnectException: Task already exists? Whats
Restarting failed tasks is currently something you have to do manually. There is a proposal for the future, but it is not implemented yet.
I do not know what the error means - I never saw it. I would not expect this to cause any message loss. The tasks are not running, but why should they have any messages lost?
@scholzj
What command should I use to restart the task manually?
You would need to use the REST API for it: http://kafka.apache.org/documentation/#connect_rest ... you can for example exec into the Connect pod and talk to localhost:8083.
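As a rough illustration of the REST calls involved, here is a minimal Python sketch, assuming the Connect REST API is reachable at localhost:8083 (for example from inside the Connect pod, or through a port-forward). It reads the connector status and posts a restart for each task reported as FAILED. The function names and the connector name `my-sink` are hypothetical, not something from this thread.

```python
import json
import urllib.request
from urllib.parse import quote


def failed_task_ids(status):
    """Return the ids of tasks reported as FAILED in a connector status document."""
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


def restart_failed_tasks(connector, base_url="http://localhost:8083"):
    """Restart every FAILED task of one connector and return the restarted ids.

    Assumption: base_url points at a Kafka Connect worker's REST endpoint.
    """
    name = quote(connector, safe="")
    # GET /connectors/{name}/status lists the state of the connector and its tasks.
    with urllib.request.urlopen(f"{base_url}/connectors/{name}/status") as resp:
        status = json.load(resp)
    restarted = []
    for task_id in failed_task_ids(status):
        # POST /connectors/{name}/tasks/{id}/restart restarts a single task.
        req = urllib.request.Request(
            f"{base_url}/connectors/{name}/tasks/{task_id}/restart", method="POST"
        )
        urllib.request.urlopen(req).close()
        restarted.append(task_id)
    return restarted
```

Calling `restart_failed_tasks("my-sink")` would then restart whatever tasks of that (hypothetical) connector are currently failed. A task that fails again right away most likely has an underlying error that a restart will not fix.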
The message counts in the topic and the destination do not match. I thought it was because of the failed tasks. Can there be any other reasons? Any guidance on finding the reason for the data loss?
My expectation would be that if the task is not running, the messages might be delayed or not forwarded. But I think that if they are really lost, then either you need to increase the retention on the topic or the connector has a bug.
@scholzj
Different question
kubectl get kc
NAME                   DESIRED REPLICAS
emd-kafka-cluster      1
pharma-kafka-cluster   3
I see the messages in the topic, but not in the destination. So would it be an issue with the connector?
I do not even know which connector you are talking about. But it can be that it just isn't sending the messages right now and will do so once it is running again, which is not the same as losing them. You would need to check the offsets to see whether the messages were already consumed or not.
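To make the offset check concrete, here is a small sketch of the arithmetic, assuming a sink connector. By default, sink connectors commit their offsets under a consumer group named `connect-<connector name>`, so the per-partition numbers could be read with `kafka-consumer-groups.sh --describe --group ...`. The function name and the sample numbers are hypothetical.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return messages not yet consumed, per partition.

    end_offsets: partition -> log end offset of the topic partition.
    committed_offsets: partition -> offset committed by the connector's group.
    Assumption: a partition with no committed offset counts from offset 0.
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


# Partition 0 is fully consumed; partition 1 still has 30 messages waiting.
print(consumer_lag({0: 100, 1: 50}, {0: 100, 1: 20}))
```

A non-zero lag means the messages are still in the topic waiting to be delivered. A lag of zero with records missing in the destination would point at the connector instead.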
Can I have 2 different Kafka Connect clusters, with different KafkaConnectors (kctr) running in their respective KafkaConnect? (something like below)
Not sure I follow ... you can of course have two Connect clusters, each connected to a different Kafka cluster. But you cannot have one Connect cluster which is connected to multiple Kafka clusters at the same time and has different connectors for each of them (unless you have a connector which has its own client and connects to Kafka on both sides, of course, which is how Mirror Maker 2 works).
I mean 2 Connect clusters connecting to the same Kafka cluster, but listening to different topics and pushing data to different destinations. Does Strimzi allow it?
Oh yes, you can have as many Connect clusters connecting to the same Kafka cluster as you want. But you have to keep this in mind: https://strimzi.io/docs/operators/latest/full/using.html#con-kafka-connect-multiple-instances-deployment-configuration-kafka-connect
Each Connect cluster needs to have its own internal topics and its own group. So each needs to have different values for these options:
group.id: connect-cluster
offset.storage.topic: connect-cluster-offsets
config.storage.topic: connect-cluster-configs
status.storage.topic: connect-cluster-status
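One way to keep these values distinct is to derive them from the name of each Connect cluster. The sketch below is only an illustration of that idea. The helper names are hypothetical, and the naming pattern simply mirrors the values listed above.

```python
def connect_worker_settings(cluster_name):
    """Derive the four per-cluster options from a Connect cluster name."""
    return {
        "group.id": cluster_name,
        "offset.storage.topic": f"{cluster_name}-offsets",
        "config.storage.topic": f"{cluster_name}-configs",
        "status.storage.topic": f"{cluster_name}-status",
    }


def shared_settings(a, b):
    """Return the option names whose values are identical in both configurations."""
    return sorted(key for key in a if a[key] == b.get(key))


emd = connect_worker_settings("emd-kafka-cluster")
pharma = connect_worker_settings("pharma-kafka-cluster")

# An empty list means the two Connect clusters do not collide on any option.
print(shared_settings(emd, pharma))
```

The resulting values would go into the configuration of each KafkaConnect resource. If `shared_settings` returns any option name, the two Connect clusters would end up sharing a group or an internal topic.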
@saranyaeu2987 Anything more we can help with here?
I have multiple KafkaConnectors running.
Some of the tasks in the KafkaConnectors (kctr) have failed with a NullPointerException.
My questions are