Hi everyone,
I don't know how others run the Controller, but I have a case about Controller health.
Everything is fine as long as the servers are up, the Kubernetes cluster is healthy, and every node is healthy.
But when a node goes down and Kubernetes is down, things go dark. The Worker pod restarts with a graceful shutdown, which is great and correct, so there is no need to worry about the Worker.
The Controller is a different story: after the Kubernetes node that hosted the Controller pod went down, the Controller did not work at all. The Controller log showed no exceptions, but every message in the source was lagging and the Workers no longer replicated anything.
I had to restart the Controller manually with:
kubectl delete pod controller-pod-name
After that, the Controller went back to work and the Workers did their job again.
After this incident I wondered: why not check the health of the Controller?
So I tried:
curl localhost:9000/health
That works, but the result always shows: Unhealthy
How can I restart the Controller after this message? And how can I make the Controller healthy?
The current implementation of the Controller health check is: the Controller is healthy only if the offsetMonitor is working. So you have to configure srcKafkaZkPath and groupId as parameters when starting the Controller.
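For the restart part of the question, here is a minimal sketch of a watchdog that automates the manual fix: it polls the health endpoint and runs the same kubectl delete pod command after several failed checks in a row. It is only a sketch under assumptions, not part of the project: the function names, the threshold of 3 failures, and the 30 second interval are all made up for illustration, and it assumes that an unhealthy response body contains the text "Unhealthy" and that an unreachable endpoint should also count as a failure. It only makes sense once srcKafkaZkPath and groupId are configured, otherwise the endpoint always reports Unhealthy and the pod would be restarted over and over.

```python
import subprocess
import time
import urllib.request


def fetch_health(url, timeout=5.0):
    """Return the health endpoint's body, or "" if it cannot be read."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return ""


def is_healthy(body):
    # Assumption: an unhealthy response contains "Unhealthy".
    # An empty body (endpoint unreachable) is treated as unhealthy too.
    text = body.strip()
    return bool(text) and "unhealthy" not in text.lower()


def restart_pod(pod_name):
    # Same command as the manual fix from the question.
    subprocess.run(["kubectl", "delete", "pod", pod_name], check=True)


def watch(url, pod_name, failures_needed=3, interval=30.0,
          fetch=fetch_health, restart=restart_pod, max_checks=None):
    """Poll the endpoint; restart the pod after consecutive failed checks.

    Returns the number of restarts triggered. With max_checks=None it
    runs forever; fetch and restart can be swapped out for a dry run.
    """
    failures = 0
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if is_healthy(fetch(url)):
            failures = 0
        else:
            failures += 1
            if failures >= failures_needed:
                restart(pod_name)
                restarts += 1
                failures = 0  # give the new pod time to come up
        if max_checks is None or checks < max_checks:
            time.sleep(interval)
    return restarts
```

To use it, call watch("http://localhost:9000/health", "controller-pod-name") from somewhere that can reach the Controller and has kubectl access. Requiring several failures in a row avoids restarting the Controller on a single slow response. A Kubernetes liveness probe may be the more natural place for this check, but an HTTP probe only looks at the status code, and I don't know what status code /health returns when it says Unhealthy.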