projectcapsule / capsule-proxy

Reverse proxy for Capsule Operator.
https://github.com/projectcapsule/capsule
Apache License 2.0

Proxy not working after 1h #565

Closed ppodevlabs closed 2 weeks ago

ppodevlabs commented 3 weeks ago

Bug description

Running capsule-proxy on AKS 1.30.3 integrated with Microsoft Entra ID. The proxy works properly for one hour; after that, we start getting authentication errors:

error: You must be logged in to the server (Unauthorized)

Restarting the proxy pod solves the issue. After some investigation, it seems that capsule-proxy loses connectivity to the API server once the cluster renews the service account token.

Capsule-proxy continues accepting requests, and the logs show that groups/tenants are being detected, but the proxy seems unable to communicate with the API server.

- name: kube-api-access-bl4vt
  projected:
    defaultMode: 420
    sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
            - key: ca.crt
              path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
            - fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
              path: namespace

Expected behavior

Capsule-proxy continues working after the token is refreshed.

Logs

The proxy logs contain no errors.

capsule-proxy-7598b484d5-n65xw capsule-proxy {"level":"Level(-4)","ts":"2024-11-06T15:42:08.248Z","logger":"proxy","msg":"impersonating for the current request","username":"d2xxxxx-9f06-4ec88f0143e4","groups":["ac753c56-xxxxxx-8779023a437e","91ed29b0-1xxxxxxxxxe-ffbc39d049f4","system:authenticated"],"uri":"/api"}
capsule-proxy-7598b484d5-n65xw capsule-proxy {"level":"Level(-5)","ts":"2024-11-06T15:42:08.248Z","logger":"proxy","msg":"debugging request","uri":"/api?timeout=32s","method":"GET"}

Additional context

ppodevlabs commented 3 weeks ago

Looking at the code, I would say this issue is related to the new service account management introduced in Kubernetes 1.30, where the service account gets a bound token that expires in 3607s.

The controller loads the inClusterConfig on start but never refreshes it, so it stops working after 1h.
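
For illustration, here is a minimal Go sketch of the general remedy for this kind of failure: re-read the projected token file instead of keeping the string loaded at startup. It assumes the proxy attaches the bearer token through an `http.RoundTripper`; `fileTokenTransport` and `observedTokens` are hypothetical names, and this is not necessarily what #569 does.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
	"sync"
	"time"
)

// fileTokenTransport (hypothetical) re-reads the bearer token from disk once
// the cached copy is older than maxAge, so a rotated token is picked up.
type fileTokenTransport struct {
	path   string
	maxAge time.Duration
	base   http.RoundTripper

	mu     sync.Mutex
	token  string
	readAt time.Time
}

func (t *fileTokenTransport) currentToken() (string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.token != "" && time.Since(t.readAt) < t.maxAge {
		return t.token, nil
	}
	b, err := os.ReadFile(t.path)
	if err != nil {
		return "", err
	}
	t.token, t.readAt = strings.TrimSpace(string(b)), time.Now()
	return t.token, nil
}

func (t *fileTokenTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	tok, err := t.currentToken()
	if err != nil {
		return nil, err
	}
	r := req.Clone(req.Context()) // never mutate the caller's request
	r.Header.Set("Authorization", "Bearer "+tok)
	return t.base.RoundTrip(r)
}

// observedTokens simulates a rotation: it writes two tokens in turn and
// returns the Authorization headers a local test server received.
func observedTokens() ([]string, error) {
	dir, err := os.MkdirTemp("", "token")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "token")

	var mu sync.Mutex
	var seen []string
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		seen = append(seen, r.Header.Get("Authorization"))
	}))
	defer srv.Close()

	// maxAge 0 forces a re-read on every request; a real proxy would cache briefly.
	client := &http.Client{Transport: &fileTokenTransport{path: path, base: http.DefaultTransport}}
	for _, tok := range []string{"first-token", "rotated-token"} {
		if err := os.WriteFile(path, []byte(tok+"\n"), 0o600); err != nil {
			return nil, err
		}
		resp, err := client.Get(srv.URL)
		if err != nil {
			return nil, err
		}
		resp.Body.Close()
	}
	mu.Lock()
	defer mu.Unlock()
	return seen, nil
}

func main() {
	seen, err := observedTokens()
	if err != nil {
		panic(err)
	}
	fmt.Println(seen)
}
```

In a real deployment, `path` would point at the mounted token (`/var/run/secrets/kubernetes.io/serviceaccount/token`) and `maxAge` would be well below the token lifetime, for example one minute.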

ppodevlabs commented 2 weeks ago

Fixed by #569