chenchik opened 7 years ago
Hi, were you able to figure this out?
I think the problem here is related to setting `advertised.host.name` and/or the port in Connect.
The advertised address will need to be resolvable from outside that container, and newly joining workers will need to be able to resolve that name.
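For reference, a minimal sketch of the relevant distributed-worker settings (the hostnames here are placeholders for whatever service name your pods can actually resolve; `rest.advertised.host.name` and `rest.advertised.port` are the keys in `connect-distributed.properties`):

```properties
# connect-distributed.properties (sketch -- hostnames are placeholders)

# Port this worker's REST interface binds to:
rest.port=8083

# Address the other workers use to reach this one. It must resolve
# from inside every container/pod in the group:
rest.advertised.host.name=connect-worker-0.connect.svc.cluster.local
rest.advertised.port=8083
```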
I have been experimenting with scaling workers and seeing how it affects performance and CPU usage per container/pod in OpenShift.
I'm stuck on a particular issue right now, though: I am unable to add another HDFS connector to an already existing group of workers. One worker always ends up with a log that looks like an infinite series of unknown host exceptions:
If I remove and re-add a connector on just one worker, everything works fine. Likewise, if I start out with one worker, add a connector, and then scale the workers up to the number of `max.tasks`, they all work great together. But when I have a group of workers (2 or more) already working together and I try to add a connector, these unknown host exceptions pop up on one of the workers and prevent the entire group from getting anything done.

I'm using an updated version of the HDFS connector, in which an equality bug was recently fixed on their GitHub. I took the jar file from the Jenkins build:
Here is the issue I'm referencing:
https://github.com/confluentinc/kafka-connect-hdfs/issues/132
Jenkins build:
https://jenkins.confluent.io/job/kafka-connect-hdfs-pr/99/io.confluent$kafka-connect-hdfs/
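Since the errors are unknown-host exceptions, one quick sanity check is whether each worker's advertised hostname actually resolves from inside every other worker's pod. A minimal sketch (run inside each container, substituting your advertised hostnames for the placeholder):

```python
import socket

def resolvable(host: str) -> bool:
    """Return True if `host` resolves to an IP address from this container."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

# "localhost" should resolve anywhere; each worker's advertised
# hostname must resolve the same way from every other worker's pod.
print(resolvable("localhost"))
# e.g. resolvable("connect-worker-0.connect.svc.cluster.local")
# -- a hypothetical advertised name; use your own here.
```

If this returns `False` for a peer's advertised name, the group membership protocol can't reach that worker, which matches the symptoms above.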
This is probably also related to the fact that when I try to make anything except a GET request to a group of pods/containers, most of the time I get this kind of response:
Both of these issues were also occurring all the time before I upgraded my HDFS connector. How can I prevent this unknown host exception from happening?