alenhodzic85 opened 2 years ago
That is really odd. There's just not much going on there; it's just making sure the config is valid. If you edit the validating webhook:
kubectl edit validatingwebhookconfiguration openunison-workflow-validation-orchestra
and change every timeoutSeconds: 5
--> timeoutSeconds: 30
does the problem keep happening?
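For reference, one non-interactive way to make that edit (a sketch; it assumes a blanket substitution of every timeoutSeconds: 5 in this configuration is what's wanted):

```shell
# Sketch: raise every timeoutSeconds from 5 to 30 in the validating webhook config.
# Assumes a blanket substitution is safe for this configuration.
kubectl get validatingwebhookconfiguration \
  openunison-workflow-validation-orchestra -o yaml \
  | sed 's/timeoutSeconds: 5/timeoutSeconds: 30/g' \
  | kubectl apply -f -
```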
Still failing:
Error: cannot patch "azuread-load-groups" with kind AuthenticationChain: Internal error occurred: failed calling webhook "authchains-openunison.tremolo.io": Post "https://openunison-orchestra.openunison.svc:443/k8s/webhooks/v1/authchains?timeout=30s": context deadline exceeded
│
│ with module.infra-services.helm_release.openunison-orchestra-login-azuread,
│ on infra-services/openunison.tf line 79, in resource "helm_release" "openunison-orchestra-login-azuread":
│ 79: resource "helm_release" "openunison-orchestra-login-azuread" {
Odd. In the openunison-orchestra logs, do you see /k8s/webhooks/v1/authchains?timeout=30s
in the logs? Also, how many replicas are there for the openunison-orchestra pod? Can you try increasing it?
Also, are there any network policies in the openunison namespace?
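Checking and bumping the replica count could look like this (a sketch; the Deployment name openunison-orchestra is assumed from the pod name in this thread):

```shell
# Check the current replica count (Deployment name assumed from the pod name)
kubectl get deployment openunison-orchestra -n openunison

# Scale it up to two replicas
kubectl scale deployment openunison-orchestra -n openunison --replicas=2

# And list any network policies in the namespace
kubectl get networkpolicies -n openunison
```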
Log from openunison-orchestra:
2022-04-29T15:27:50+02:00 [2022-04-29 13:27:50,146][Thread-22] WARN OpenShiftTarget - Unexpected result calling 'https://172.20.0.1:443/apis/openunison.tremolo.io/v1/namespaces/openunison/oidc-sessions/x6cc8b7ac-d064-4290-bfe8-3194cc80ea03x' - 404 / {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"oidc-sessions.openunison.tremolo.io \"x6cc8b7ac-d064-4290-bfe8-3194cc80ea03x\" not found","reason":"NotFound","details":{"name":"x6cc8b7ac-d064-4290-bfe8-3194cc80ea03x","group":"openunison.tremolo.io","kind":"oidc-sessions"},"code":404}
This is a red herring; it's OpenUnison looking to clean up sessions. Are there network policies in the openunison namespace?
kubectl get networkpolicies -n openunison
NAME                            POD-SELECTOR                        AGE
allow-from-apiserver            application=openunison-orchestra    21h
allow-from-ingress              application=openunison-orchestra    21h
allow-from-prometheus           application=openunison-orchestra    21h
default-deny-ingress            <none>                              21h
oidc-proxy-allow-from-ingress   app=kube-oidc-proxy-orchestra       21h
openunison-to-activemq          app=amq-orchestra                   21h
if you disable the networkpolicies in the helm chart, do you get the same issue?
I added this and it is still failing, and the network policies are still there. Can I delete them manually?
network_policies:
  enabled: false
And it looks like a slightly different error in Terraform:
Error: cannot patch "azuread-load-groups" with kind AuthenticationChain: Internal error occurred: failed calling webhook "authchains-openunison.tremolo.io": Post "https://openunison-orchestra.openunison.svc:443/k8s/webhooks/v1/authchains?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Can I delete them manually?
Sure, go ahead and just delete them; they can be restored later.
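Concretely, removing them by hand could look like this (a sketch using the policy names from the kubectl get networkpolicies output above):

```shell
# Delete the network policies reported by `kubectl get networkpolicies -n openunison`;
# the helm chart can recreate them later.
kubectl delete networkpolicy -n openunison \
  allow-from-apiserver \
  allow-from-ingress \
  allow-from-prometheus \
  default-deny-ingress \
  oidc-proxy-allow-from-ingress \
  openunison-to-activemq
```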
Still failing
same error?
Error: cannot patch "azuread-load-groups" with kind AuthenticationChain: Internal error occurred: failed calling webhook "authchains-openunison.tremolo.io": Post "https://openunison-orchestra.openunison.svc:443/k8s/webhooks/v1/authchains?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
What version of k8s, which cni, how did you deploy? (Ie kubeadm)
What version of k8s
K8s AWS EKS: K8s Rev: v1.21.9-eks-0d102a7
which cni
Default EKS CNI: Amazon VPC Container Network Interface (CNI) plugin
how did you deploy
Deployed using EKS terraform module
thanks, i'll work to reproduce
did you enable the network policies in the values.yaml in your initial deployment, or did you enable it afterwards?
I guess they were enabled by default: https://github.com/OpenUnison/helm-charts/blob/f2b2ba7cf91c402591e1f88e563363a28bcd389e/orchestra/values.yaml#L81
I can't find a way to reproduce this issue. When you make your update, are you updating just the orchestra-login-azuread chart? While you're waiting for the chart to time out, can you log in to OpenUnison?
Sorry for the late reply, I was on vacation. Like I mentioned, I am just changing session_inactivity_timeout_seconds
in the values file, which triggers an update for all affected charts.
While you're waiting for the chart to timeout, can you login to openunison?
Yes, nothing is redeployed or stopped.
Hi, any update on this?
Unfortunately I've not been able to reproduce it. The issue appears to be localized to your cluster and I don't know why. It looks like OpenUnison is taking requests. Are there any other webhooks used in the cluster? Do they have any issues?
We're going to be rolling out a new kubectl plugin that automates the rollout of openunison so you don't need to run the helm charts individually. It'll account for potential timing issues.
Yes, we have other webhooks, such as the ones from Vault. Is the new kubectl plugin already released?
What's so odd about this is that the API server is able to talk to the webhook on the initial install, but not afterwards. There has to be some circumstance that's causing it. When the timeout happens, are there any API server logs? Something that indicates a timeout, or that DNS didn't resolve?
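One way to narrow down whether this is DNS or connectivity is to probe the webhook service from inside the cluster (a sketch; curlimages/curl is an assumed debug image, and -k skips TLS verification since the webhook serves an internally signed certificate):

```shell
# Launch a throwaway pod and probe the webhook service directly.
# -k skips TLS verification (internally signed cert);
# -m 30 matches the webhook's 30s timeout.
kubectl run webhook-probe -n openunison --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -vk -m 30 \
  "https://openunison-orchestra.openunison.svc:443/k8s/webhooks/v1/authchains"
```

If the name fails to resolve from the pod, the problem is cluster DNS; if it resolves but the connection hangs, that points at the CNI or a remaining network policy.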
Is the new kubectl plugin already released?
I think we're going to add a flag for additional charts to be run (like the azuread one) to make it simpler. It's a common use case.
Hi, every time I want to update the setup, I get an error with the webhook and need to reinstall the whole setup.
For example, here I changed just
session_inactivity_timeout_seconds
I don't see any unusual logs...