open-policy-agent / kube-mgmt

Sidecar for managing OPA instances in Kubernetes.
Apache License 2.0

bug: crashloopbackoff for "unknown" watches in 8.5.x #260

Open kevdowney opened 2 months ago

The kube-mgmt container crashes on the 8.5.x releases (seen with kube-mgmt:8.5.8):

kubectl get pods
NAME                   READY   STATUS             RESTARTS      AGE
opa-79fcc56cdd-4q5l7   2/3     CrashLoopBackOff   6 (8s ago)    9m48s
opa-79fcc56cdd-hd96z   2/3     CrashLoopBackOff   5 (95s ago)   8m45s
kubectl logs opa-79fcc56cdd-nj79h -c kube-mgmt
time="2024-08-26T17:48:12Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[]"
E0826 17:48:12.424657       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
time="2024-08-26T17:48:12Z" level=info msg="Initial informer sync for v1/serviceaccounts completed, took 100.110118ms"
time="2024-08-26T17:48:12Z" level=info msg="Syncing v1/serviceaccounts."
time="2024-08-26T17:48:12Z" level=info msg="Initial informer sync for v1/namespaces completed, took 100.104679ms"
time="2024-08-26T17:48:12Z" level=info msg="Syncing v1/namespaces."
time="2024-08-26T17:48:12Z" level=info msg="Initial informer sync for v1/nodes completed, took 100.047158ms"
time="2024-08-26T17:48:12Z" level=info msg="Syncing v1/nodes."
time="2024-08-26T17:48:12Z" level=info msg="Loaded 133 resources of kind v1/serviceaccounts into OPA. Took 39.533653ms"
time="2024-08-26T17:48:12Z" level=info msg="Loaded 36 resources of kind v1/namespaces into OPA. Took 40.737732ms"
time="2024-08-26T17:48:12Z" level=info msg="Loaded 16 resources of kind v1/nodes into OPA. Took 47.807383ms"
E0826 17:48:13.397307       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 17:48:16.332353       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 17:48:21.992081       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 17:48:30.780212       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 17:48:55.304518       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown

The kube-mgmt container runs with these args:

      - args:
        - --replicate-cluster=v1/namespaces
        - --replicate-cluster=v1/nodes
        - --replicate-cluster=v1/serviceaccounts
        - --policies=opa
        - --require-policy-label=true
        - --opa-url=https://127.0.0.1:8443/v1
        - --opa-allow-insecure=true

We see the same kind of errors in 8.4.0, but the container does not crash:

kubectl logs opa-c7886947f-kbg6b -c kube-mgmt
time="2024-08-26T16:30:44Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[opa]"
E0826 16:30:44.465353       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/admission.rego, err=code invalid_parameter: error(s) occurred while compiling module(s)"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/containers.rego, err=<nil>"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/helpers.rego, err=<nil>"
time="2024-08-26T16:30:44Z" level=info msg="Initial informer sync for v1/namespaces completed, took 101.038623ms"
time="2024-08-26T16:30:44Z" level=info msg="Syncing v1/namespaces."
time="2024-08-26T16:30:44Z" level=info msg="Initial informer sync for v1/serviceaccounts completed, took 101.080936ms"
time="2024-08-26T16:30:44Z" level=info msg="Syncing v1/serviceaccounts."
time="2024-08-26T16:30:44Z" level=info msg="Initial informer sync for v1/nodes completed, took 100.889071ms"
time="2024-08-26T16:30:44Z" level=info msg="Syncing v1/nodes."
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/image.rego, err=<nil>"
time="2024-08-26T16:30:44Z" level=info msg="Loaded 36 resources of kind v1/namespaces into OPA. Took 32.966228ms"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/istio.rego, err=<nil>"
time="2024-08-26T16:30:44Z" level=info msg="Loaded 17 resources of kind v1/nodes into OPA. Took 65.188539ms"
time="2024-08-26T16:30:44Z" level=info msg="Loaded 133 resources of kind v1/serviceaccounts into OPA. Took 119.451253ms"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/main.rego, err=code invalid_parameter: error(s) occurred while compiling module(s)"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies/privileges.rego, err=<nil>"
time="2024-08-26T16:30:44Z" level=info msg="Added policy opa/policies-preprod-artifactory-override/allow_artifactory_preprod.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/admission.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/containers.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/helpers.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/image.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/istio.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/main.rego, err=<nil>"
time="2024-08-26T16:30:45Z" level=info msg="Added policy opa/policies/privileges.rego, err=<nil>"
E0826 16:30:45.569381       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:30:48.419411       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:30:53.343317       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:31:03.178708       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:31:17.614971       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:32:00.389407       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown
E0826 16:32:45.024632       1 reflector.go:138] k8s.io/client-go@v0.23.8/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown

OPA RBAC (service account opa-sa in namespace opa):

kubectl auth can-i --as=system:serviceaccount:opa:opa-sa --list
Warning: the list may be incomplete: webhook authorizer does not support user rule resolution
Resources                                         Non-Resource URLs                      Resource Names   Verbs
workflowtaskresults.argoproj.io                   []                                     []               [create delete deletecollection get list patch update watch]
selfsubjectreviews.authentication.k8s.io          []                                     []               [create]
selfsubjectaccessreviews.authorization.k8s.io     []                                     []               [create]
selfsubjectrulesreviews.authorization.k8s.io      []                                     []               [create]
nodes                                             []                                     []               [get list patch watch]
configmaps                                        []                                     []               [get list watch update patch]
bindings                                          []                                     []               [get list watch]
endpoints                                         []                                     []               [get list watch]
events                                            []                                     []               [get list watch]
limitranges                                       []                                     []               [get list watch]
namespaces/status                                 []                                     []               [get list watch]
namespaces                                        []                                     []               [get list watch]
persistentvolumeclaims/status                     []                                     []               [get list watch]
persistentvolumeclaims                            []                                     []               [get list watch]
pods/log                                          []                                     []               [get list watch]
pods/status                                       []                                     []               [get list watch]
pods                                              []                                     []               [get list watch]
replicationcontrollers/scale                      []                                     []               [get list watch]
replicationcontrollers/status                     []                                     []               [get list watch]
replicationcontrollers                            []                                     []               [get list watch]
resourcequotas/status                             []                                     []               [get list watch]
resourcequotas                                    []                                     []               [get list watch]
serviceaccounts                                   []                                     []               [get list watch]
services/status                                   []                                     []               [get list watch]
services                                          []                                     []               [get list watch]
controllerrevisions.apps                          []                                     []               [get list watch]
daemonsets.apps/status                            []                                     []               [get list watch]
daemonsets.apps                                   []                                     []               [get list watch]
deployments.apps/scale                            []                                     []               [get list watch]
deployments.apps/status                           []                                     []               [get list watch]
deployments.apps                                  []                                     []               [get list watch]
replicasets.apps/scale                            []                                     []               [get list watch]
replicasets.apps/status                           []                                     []               [get list watch]
replicasets.apps                                  []                                     []               [get list watch]
statefulsets.apps/scale                           []                                     []               [get list watch]
statefulsets.apps/status                          []                                     []               [get list watch]
statefulsets.apps                                 []                                     []               [get list watch]
clusterworkflowtemplates.argoproj.io/finalizers   []                                     []               [get list watch]
clusterworkflowtemplates.argoproj.io              []                                     []               [get list watch]
cronworkflows.argoproj.io/finalizers              []                                     []               [get list watch]
cronworkflows.argoproj.io                         []                                     []               [get list watch]
workflowartifactgctasks.argoproj.io               []                                     []               [get list watch]
workfloweventbindings.argoproj.io/finalizers      []                                     []               [get list watch]
workfloweventbindings.argoproj.io                 []                                     []               [get list watch]
workflows.argoproj.io/finalizers                  []                                     []               [get list watch]
workflows.argoproj.io                             []                                     []               [get list watch]
workflowtasksets.argoproj.io/finalizers           []                                     []               [get list watch]
workflowtasksets.argoproj.io                      []                                     []               [get list watch]
workflowtemplates.argoproj.io/finalizers          []                                     []               [get list watch]
workflowtemplates.argoproj.io                     []                                     []               [get list watch]
horizontalpodautoscalers.autoscaling/status       []                                     []               [get list watch]
horizontalpodautoscalers.autoscaling              []                                     []               [get list watch]
cronjobs.batch/status                             []                                     []               [get list watch]
cronjobs.batch                                    []                                     []               [get list watch]
jobs.batch/status                                 []                                     []               [get list watch]
jobs.batch                                        []                                     []               [get list watch]
endpointslices.discovery.k8s.io                   []                                     []               [get list watch]
daemonsets.extensions/status                      []                                     []               [get list watch]
daemonsets.extensions                             []                                     []               [get list watch]
deployments.extensions/scale                      []                                     []               [get list watch]
deployments.extensions/status                     []                                     []               [get list watch]
deployments.extensions                            []                                     []               [get list watch]
ingresses.extensions/status                       []                                     []               [get list watch]
ingresses.extensions                              []                                     []               [get list watch]
networkpolicies.extensions                        []                                     []               [get list watch]
replicasets.extensions/scale                      []                                     []               [get list watch]
replicasets.extensions/status                     []                                     []               [get list watch]
replicasets.extensions                            []                                     []               [get list watch]
replicationcontrollers.extensions/scale           []                                     []               [get list watch]
nodes.metrics.k8s.io                              []                                     []               [get list watch]
pods.metrics.k8s.io                               []                                     []               [get list watch]
ingresses.networking.k8s.io/status                []                                     []               [get list watch]
ingresses.networking.k8s.io                       []                                     []               [get list watch]
networkpolicies.networking.k8s.io                 []                                     []               [get list watch]
poddisruptionbudgets.policy/status                []                                     []               [get list watch]
poddisruptionbudgets.policy                       []                                     []               [get list watch]
                                                  [/.well-known/openid-configuration/]   []               [get]
                                                  [/.well-known/openid-configuration]    []               [get]
                                                  [/api/*]                               []               [get]
                                                  [/api]                                 []               [get]
                                                  [/apis/*]                              []               [get]
                                                  [/apis]                                []               [get]
                                                  [/healthz]                             []               [get]
                                                  [/healthz]                             []               [get]
                                                  [/livez]                               []               [get]
                                                  [/livez]                               []               [get]
                                                  [/openapi/*]                           []               [get]
                                                  [/openapi]                             []               [get]
                                                  [/openid/v1/jwks/]                     []               [get]
                                                  [/openid/v1/jwks]                      []               [get]
                                                  [/readyz]                              []               [get]
                                                  [/readyz]                              []               [get]
                                                  [/version/]                            []               [get]
                                                  [/version/]                            []               [get]
                                                  [/version]                             []               [get]
                                                  [/version]                             []               [get]

I'm not sure which resource this failing watch is for:

E0826 17:48:12.424657       1 reflector.go:138] k8s.io/client-go@v0.23.17/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: unknown

Also, note this difference in the startup log between 8.5.x and earlier releases (8.4.x and below):

8.5.x

time="2024-08-26T17:48:12Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[]"

8.4.x and below

time="2024-08-26T16:30:44Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[opa]"
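
To make the comparison repeatable across pods and versions, here is a minimal sketch that pulls the `namespaces=[...]` value out of the startup line. The helper name `extract_namespaces` is hypothetical and not part of kube-mgmt; the two sample lines are copied from the logs above. Whether an empty list means "all namespaces" or "none" is an assumption that has not been confirmed here.

```shell
#!/bin/sh
# extract_namespaces: hypothetical helper, not part of kube-mgmt.
# Reads log lines on stdin and prints the text between "namespaces=[" and "]"
# for each "ConfigMap processor connected" line (an empty line if the list is empty).
extract_namespaces() {
  sed -n 's/.*connected to K8s: namespaces=\[\([^]]*\)].*/\1/p'
}

# Sample startup lines copied verbatim from the 8.5.8 and 8.4.0 logs in this issue.
new='time="2024-08-26T17:48:12Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[]"'
old='time="2024-08-26T16:30:44Z" level=info msg="Policy/data ConfigMap processor connected to K8s: namespaces=[opa]"'

# Print the extracted value for each version, wrapped in brackets for readability.
printf '8.5.x: [%s]\n' "$(printf '%s\n' "$new" | extract_namespaces)"
printf '8.4.x: [%s]\n' "$(printf '%s\n' "$old" | extract_namespaces)"
```

Against a live pod, the same helper can be fed from `kubectl logs <pod> -c kube-mgmt | extract_namespaces`, where `<pod>` is a placeholder for the pod name.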