Maybe change this to a non-zero value: https://github.com/open-policy-agent/kube-mgmt/blob/8.0.1/pkg/configmap/configmap.go#L151
Then, in a new else branch here https://github.com/open-policy-agent/kube-mgmt/blob/8.0.1/pkg/configmap/configmap.go#L175-L182 (the ConfigMap's resource version doesn't change when OnUpdate is fired by NewInformer on a resync), add a check that retrieves the policies from OPA and, if the result is empty, calls https://github.com/open-policy-agent/kube-mgmt/blob/8.0.1/pkg/configmap/configmap.go#L202
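A minimal sketch of that idea, assuming a non-zero resync period; `Sync`, `load`, and `hasPoliciesInOPA` are placeholder names for illustration, not kube-mgmt's actual types or functions:

```go
package configmap

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

// Sync is a stand-in for kube-mgmt's ConfigMap sync type; the two function
// fields are hypothetical helpers used only to illustrate the idea.
type Sync struct {
	load             func(cm *v1.ConfigMap)      // pushes the ConfigMap's policies/data into OPA
	hasPoliciesInOPA func(cm *v1.ConfigMap) bool // e.g. checks GET /v1/policies for the ConfigMap's entries
}

func (s *Sync) run(source cache.ListerWatcher) cache.Controller {
	_, controller := cache.NewInformer(
		source,
		&v1.ConfigMap{},
		60*time.Second, // non-zero resync period instead of 0, so UpdateFunc fires periodically
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
				s.load(obj.(*v1.ConfigMap))
			},
			UpdateFunc: func(oldObj, newObj interface{}) {
				oldCM, newCM := oldObj.(*v1.ConfigMap), newObj.(*v1.ConfigMap)
				if oldCM.ResourceVersion != newCM.ResourceVersion {
					s.load(newCM) // real change: reload as before
					return
				}
				// Periodic resync with an unchanged ConfigMap: if the OPA
				// container was restarted, its policies are gone, so reload.
				if !s.hasPoliciesInOPA(newCM) {
					s.load(newCM)
				}
			},
			DeleteFunc: func(obj interface{}) {
				// unchanged: remove the ConfigMap's policies/data from OPA
			},
		},
	)
	return controller
}
```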
We see a similar issue too, and I did notice that the previous version we ran had a 60-second resync that is gone in 8.0.0: https://github.com/open-policy-agent/kube-mgmt/compare/v0.12.1...8.0.0#diff-6aa7780e80409d3ad0fb397be31e6f2d64ab520750d4317267f7138ebcee6606L146
I think these 2 might be the same problem: https://github.com/open-policy-agent/kube-mgmt/issues/194
I think it's broken even with 1 replica, because when a rollout happens it brings up a new pod and the listener triggers on the existing pod.
Scenario:
1. Current deployment: pod 1 is healthy.
2. New release: pod 2 comes up, there is a failure, and the annotation is updated.
3. The listener on the existing pod 1 triggers; it's already fine, so it marks it as OK.
4. Pod 2 triggers again, thinks it's OK, and doesn't load the rule.
Folks, if anyone is willing to work on this, I have some ideas on how to approach the issue.
I realized that caches need to be reloaded in addition to policies, so it is more complicated than I thought.
Maybe adding a liveness probe container to the pod could work: kube-mgmt exposes a health endpoint that the liveness container uses, and if the OPA container has no policies, the liveness container reports failure and kube-mgmt gets restarted.
Similar to this: https://github.com/kubernetes-csi/livenessprobe
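Roughly, such a check could look like the sketch below. This is only an illustration of the idea, not an existing kube-mgmt feature; it assumes OPA is reachable on localhost:8181 and that the probe endpoint is served on port 8282.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

func main() {
	// /healthz would be wired up as the pod's liveness probe target.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		// Ask OPA which policies it currently has loaded.
		resp, err := http.Get("http://localhost:8181/v1/policies")
		if err != nil {
			http.Error(w, "opa unreachable", http.StatusServiceUnavailable)
			return
		}
		defer resp.Body.Close()

		var body struct {
			Result []json.RawMessage `json:"result"`
		}
		if err := json.NewDecoder(resp.Body).Decode(&body); err != nil || len(body.Result) == 0 {
			// No policies loaded (e.g. OPA was restarted): fail the probe
			// so the kubelet restarts the container/pod.
			http.Error(w, "no policies loaded", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8282", nil))
}
```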
I have the following configuration
kube-mgmt loads the configmaps from the opa namespace during initial pod startup, but if I kill the OPA container (for instance by logging into the minikube node and running `pkill -f "opa run"`, or if the liveness probe fails for any reason), then kube-mgmt no longer pushes the configmaps into the OPA container. I have to restart the pod (or kill the kube-mgmt container) or make some dummy changes to the configmaps.
As a result, the OPA container returns 404 and the client gets an error.
Is there any known workaround for this? Maybe some health check for kube-mgmt to verify that OPA has rules loaded? Or is there a way to make kube-mgmt periodically push the configmaps into the OPA container's API?
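For reference, the manual equivalent of what I'd like kube-mgmt to repeat periodically is roughly a PUT of the policy into OPA's REST API. A sketch of that, as a stopgap; the OPA address, policy id, and file name are just examples:

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Policy body taken from the ConfigMap (file name is an example).
	rego, err := os.ReadFile("example.rego")
	if err != nil {
		log.Fatal(err)
	}

	// PUT the module into OPA's policy API (assumes OPA on localhost:8181).
	req, err := http.NewRequest(http.MethodPut,
		"http://localhost:8181/v1/policies/opa/example", bytes.NewReader(rego))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "text/plain")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("OPA responded with", resp.Status)
}
```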