Open eshepelyuk opened 1 year ago
We hit this issue: when the OPA container restarts, the kube-mgmt container is not aware of it, so the policies are not loaded again.
Solution that worked for us:
When the policies are not properly loaded into OPA, an HTTP POST request sent to the OPA pod returns the response below:
Request URL: https://127.0.0.1:8443
Method: POST
Response code: 404
Response:
{
  "code": "undefined_document",
  "message": "document missing: data.system.main"
}
When the policies are loaded properly, the same POST request succeeds with the response below:
Request URL: https://127.0.0.1:8443
Method: POST
Response code: 200
Response:
{"apiVersion":"admission.k8s.io/v1beta1","kind":"AdmissionReview","response":{"allowed":true}}
The liveness probe configuration below works for us: it keeps checking whether the response code for the HTTPS request is 200 and, if it is not, restarts the container, thereby loading the policies again.
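livenessProbe:
  exec:
    command:
      - sh
      - -c
      - |
        # Succeed only when the POST is answered with HTTP 200.
        # -O /dev/null discards the body so no file is written on each probe.
        rc=$(wget --server-response --no-check-certificate --post-data '{}' -O /dev/null https://127.0.0.1:8443 2>&1 | awk '/^ *HTTP\//{code=$2} END{print code}')
        [ "$rc" = "200" ]
  failureThreshold: 1
  initialDelaySeconds: 60
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 30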
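The status code extraction that the probe relies on can be tried outside the cluster. The sketch below assumes the header lines that `wget --server-response` prints are indented and start with `HTTP/`; `http_status` is a hypothetical helper name, and it reads captured wget output from stdin instead of making a request.

```shell
#!/bin/sh
# http_status is a hypothetical helper: it prints the status code of the last
# "HTTP/..." status line found on stdin, or an empty line if there is none.
http_status() {
  awk '/^ *HTTP\//{code=$2} END{print code}'
}

# Hypothetical usage against a live endpoint (not run here):
#   wget --server-response --no-check-certificate --post-data '{}' \
#        -O /dev/null https://127.0.0.1:8443 2>&1 | http_status
```

Taking the last status line rather than the first means a redirect chain is judged by its final response.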
Let me know if this is a good approach. If the solution is fine, I can contribute and check in this change.
Hello @saranyareddy24, the intended approach is described at the head of the issue. Your approach is a special case that depends on your current Helm chart setup; it does not cover all possible setup options.
Let me know if this is fine.
ConfigMap that creates start.rego:
apiVersion: v1
kind: ConfigMap
metadata:
  name: policy-start
  labels:
    openpolicyagent.org/policy: rego
data:
  start.rego: |
    # If kube-mgmt is not able to access this policy it will consider
    # that OPA has restarted and it will try to reload the policies by restarting.
    package test
    description := "Policy that loads on start of OPA"
Liveness check for fetching start.rego:
livenessProbe:
  failureThreshold: 5
  httpGet:
    path: /v1/policies/default/policy-start/start.rego
    port: 8181
    scheme: HTTPS
  initialDelaySeconds: 60
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 10
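The probe path above appears to be built from the namespace, the ConfigMap name, and the key inside the ConfigMap. That reading is an assumption drawn from this single example (`default`, `policy-start`, `start.rego`), and `policy_path` is a hypothetical helper name. Under that assumption, the path for another setup could be sketched as:

```shell
#!/bin/sh
# policy_path is a hypothetical helper. It assumes the policy ID has the form
# <namespace>/<ConfigMap name>/<key>, as in the probe path shown above.
policy_path() {
  # $1 = namespace, $2 = ConfigMap name, $3 = key inside the ConfigMap
  printf '/v1/policies/%s/%s/%s\n' "$1" "$2" "$3"
}

policy_path default policy-start start.rego
```

A chart that deploys into a different namespace would need the first argument changed accordingly, which is one way this probe depends on the setup.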
Tested locally; the configuration works.
Hello @saranyareddy24
I do not understand the purpose of the presented ConfigMap. Please describe how it is going to work.
Hello @saranyareddy24
I've also updated the issue description. Hopefully the intention is clearer now.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes
Relates #189 Relates #206