Ideally this should be handled on the provider side, so that it can detect when the IP operator recovers.
Until then, we can see whether a livenessProbe/readinessProbe could be leveraged, so that the pod restarts when the IP operator has not been ready or functioning (from the provider's point of view) for longer than 10 minutes or so.
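A minimal sketch of the "restart after about 10 minutes of being unhealthy" decision, written as something an exec-style liveness check could call. How "healthy" is determined is not specified in this thread, so it is passed in as a plain boolean; the function name, the state file, and the 600-second default are assumptions for illustration, not part of the provider or operator code.

```python
from pathlib import Path

MAX_UNHEALTHY_SECONDS = 600  # assumption: the "10 minutes or so" mentioned above


def liveness_exit_code(healthy, now, state_file, max_unhealthy=MAX_UNHEALTHY_SECONDS):
    """Return 0 (keep running) or 1 (fail the probe so the pod is restarted).

    `healthy` is the result of whatever check the operator exposes (hypothetical).
    `state_file` stores the timestamp of the last healthy observation.
    """
    state_file = Path(state_file)
    if healthy or not state_file.exists():
        # Healthy, or first run: record the timestamp and (re)start the grace window.
        state_file.write_text(str(now))
        return 0
    last_ok = float(state_file.read_text())
    # Fail only once the unhealthy period exceeds the allowed window.
    return 1 if now - last_ok > max_unhealthy else 0
```

A probe wrapper would call this with `time.time()` and pass the result to `sys.exit()`. Keeping the timestamp in a file means the grace window survives between separate probe invocations.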
On my test provider:
The RPC node is in the catching up: true state (as expected), and the provider pod is waiting for the RPC node to reach the catching up: false state. Meanwhile, the IP operator pod is waiting for the provider pod.
When the RPC node catches up with the top of the chain, the provider pod starts; however, the IP operator pod does not recover.
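For reference, the catching-up condition described here can be read from the RPC node's `/status` response, assuming a Tendermint-style RPC where the flag lives at `result.sync_info.catching_up`. The helper name is hypothetical and the sample bodies in the usage note are trimmed to the one field used.

```python
import json


def rpc_catching_up(status_body: str) -> bool:
    """Return True while the node reports that it is still catching up."""
    doc = json.loads(status_body)
    # JSON-RPC responses wrap the payload in "result"; tolerate a bare payload too.
    payload = doc.get("result", doc)
    return bool(payload["sync_info"]["catching_up"])
```

For example, `rpc_catching_up('{"result": {"sync_info": {"catching_up": true}}}')` is `True`, and the same body with `false` gives `False`.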
Manually restarting the IP operator pod works.
Perhaps implement a probe of some sort to check the status of the provider pod before starting the IP operator pod.
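One way such a check could look is a small wait loop that runs before the IP operator starts (for instance in an entrypoint or init step). This is only a sketch: `is_ready` stands in for whatever provider status check is available, and the function name, timeout, and interval are assumptions.

```python
import time


def wait_until_ready(is_ready, timeout=600.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll `is_ready()` until it returns True or `timeout` seconds elapse.

    Returns True if the provider became ready, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

On a `False` result, the caller could exit non-zero so that the pod is restarted and tries again, rather than waiting indefinitely.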
cheers,
Shimpa