uber / uReplicator

Improvement of Apache Kafka Mirrormaker
Apache License 2.0

Controller is always Unhealthy #288

Closed · dungnt081191 closed this issue 5 years ago

dungnt081191 commented 5 years ago

Hi everyone, I'm not sure how you normally operate the Controller, but I have a case concerning the Controller's health.

Everything is fine while the servers are up and every node in the Kubernetes cluster is healthy. But when a node goes down, the trouble starts. The Worker pod restarts with a graceful shutdown, which is correct, so the Workers are not the problem. The Controller, however, does not recover after the Kubernetes node hosting its pod goes down: it stops working entirely, its log shows no exceptions, yet every message in the source cluster starts lagging and the Workers stop replicating. I have to restart the Controller manually with: kubectl delete pod controller-pod-name. After that, the Controller comes back and the Workers resume their job.

After this incident I wondered: why not check the health of the Controller? So I tried curl localhost:9000/health. The endpoint responds, but the result always shows: Unhealthy

How can I restart the Controller when it reports this? How can I make the Controller healthy?
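A Kubernetes liveness probe is one way to automate the manual kubectl delete pod workaround: if the /health endpoint on port 9000 reports Unhealthy, the kubelet restarts the controller container. This is only a sketch; it assumes a healthy controller returns a body that does not contain the word "Unhealthy", and the probe command and timings need to be adapted to your deployment.

```yaml
# Liveness probe sketch for the controller pod spec (assumptions: the
# controller serves /health on port 9000, and an unhealthy controller
# returns a body containing the word "Unhealthy").
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      # Fails if curl cannot reach the endpoint or the body says Unhealthy,
      # which makes the kubelet restart the container.
      - "curl -sf localhost:9000/health | grep -qiv unhealthy"
  initialDelaySeconds: 120   # give the controller time to start up
  periodSeconds: 30
  failureThreshold: 3
```

Note that this only automates the same restart that kubectl delete pod did; the comments below address why the endpoint reports Unhealthy in the first place.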

Technoboy- commented 5 years ago

The current logic for the controller health check is that the offsetMonitor must be working, so you have to pass srcKafkaZkPath and groupId as parameters when starting the controller.
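For illustration, a controller container spec fragment with those two parameters passed at startup might look like the sketch below. The parameter names come from the comment above; the image name, values, and exact flag spelling are assumptions to check against the controller startup script in this repo.

```yaml
# Illustrative fragment of the controller container spec; verify flag
# spelling and any other required flags against the controller's own
# startup script before using this.
containers:
  - name: ureplicator-controller
    image: ureplicator-controller:latest     # assumption: your own image
    args:
      - "-port"
      - "9000"                               # port the /health check above hits
      - "-srcKafkaZkPath"
      - "src-zk.example.com:2181/kafka"      # assumption: source cluster ZK path
      - "-groupId"
      - "ureplicator-offset-monitor"         # assumption: consumer group used by the offsetMonitor
      # ...plus the Helix ZooKeeper address, cluster name, and whatever other
      # flags your deployment already passes
```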

yangy0000 commented 5 years ago

fixed in https://github.com/uber/uReplicator/pull/291