Open eviln1 opened 1 month ago
Thank you for raising this issue. It makes sense for the controller to fail faster during its initialization phase if required resources are not installed in the cluster, so that the Helm chart's atomic: true behavior works as intended.
We will look into implementing this fail-fast logic; it should not be hard to add.
Thank you again for your input, and we appreciate your patience as we work on improving the controller.
Our team got hit by https://github.com/aws/aws-application-networking-k8s/issues/658 today. The proposal in https://github.com/aws/aws-application-networking-k8s/issues/659 would help a lot.
Additionally, I think that the controller should fail fast(er).
We use helm to install the controller, with the atomic: true option set; the rationale is that if the pods can't become ready, helm rolls back to the previous release. Currently, the controller will become ready, but fail after a couple of minutes and go into CrashLoopBackOff. Having the controller check for prerequisites before becoming ready would prevent this behavior.
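To illustrate the idea, here is a minimal sketch in Go of a prerequisite check that runs before the controller reports ready. It is not the controller's actual code: the `prereqChecker` interface, the `staticChecker` stand-in, and the resource names `example-crd-a` and `example-crd-b` are all hypothetical placeholders. A real implementation would presumably query the cluster through a Kubernetes client instead.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// prereqChecker reports whether a named cluster resource is installed.
// Hypothetical interface: a real controller would back this with a
// Kubernetes API or discovery client.
type prereqChecker interface {
	Installed(name string) (bool, error)
}

// staticChecker is an in-memory stand-in for the cluster, used only so
// the sketch runs without a cluster.
type staticChecker map[string]bool

func (s staticChecker) Installed(name string) (bool, error) {
	return s[name], nil
}

// checkPrerequisites returns an error naming every required resource
// that is missing, or nil if all of them are present.
func checkPrerequisites(c prereqChecker, required []string) error {
	var missing []string
	for _, name := range required {
		ok, err := c.Installed(name)
		if err != nil {
			return fmt.Errorf("checking %s: %w", name, err)
		}
		if !ok {
			missing = append(missing, name)
		}
	}
	if len(missing) > 0 {
		return errors.New("missing required resources: " + strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// Placeholder names; the real list of required resources is an assumption.
	required := []string{"example-crd-a", "example-crd-b"}
	cluster := staticChecker{"example-crd-a": true, "example-crd-b": true}

	// Run the check before any readiness endpoint starts answering, so a
	// failure here keeps the pod from ever becoming ready.
	if err := checkPrerequisites(cluster, required); err != nil {
		fmt.Fprintln(os.Stderr, "startup aborted:", err)
		os.Exit(1)
	}
	fmt.Println("prerequisites satisfied; safe to report ready")
}
```

With a check like this placed ahead of the readiness signal, a missing resource would make the pod exit before it is ever marked ready, which is the condition helm needs to see in order to roll back an atomic install.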