cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

OIDC authentication fails sporadically with Cilium (always fails when kube-proxy is replaced) #34911

Closed: EdwardCooke closed this issue 1 week ago

EdwardCooke commented 2 months ago

Is there an existing issue for this?

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

When creating a new cluster and installing Cilium, OIDC login to the cluster fails.

This error is logged in the kube-apiserver pod:

E0916 02:24:25.874703       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, oidc: verify token: failed to verify signature: fetching keys oidc: get keys failed Get \"https://login.microsoftonline.com/426bbc16-09cf-4a19-80c2-d5b19c6c4b72/discovery/v2.0/keys\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)]"
E0916 02:24:30.099244       1 conn.go:339] Error on socket receive: read tcp 10.2.0.20:6443->10.2.0.11:52086: use of closed network connection
E0916 02:24:37.776954       1 conn.go:339] Error on socket receive: read tcp 10.2.0.20:6443->10.2.0.11:34948: use of closed network connection
E0916 02:24:55.920473       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, oidc: verify token: failed to verify signature: fetching keys context deadline exceeded]"
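
For anyone trying to confirm the same failure, a quick way to watch for these errors on a kubeadm cluster; this sketch assumes the standard component=kube-apiserver label that kubeadm puts on its static pods:

# Tail the apiserver logs on the control-plane nodes and filter for OIDC failures:
kubectl -n kube-system logs -l component=kube-apiserver --prefix --tail=200 \
  | grep "Unable to authenticate the request"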

How can we reproduce the issue?

Create a cluster with kubeadm init and OIDC authentication, then install Cilium through Helm with the following values. (With kubeProxyReplacement: false it is more reliable than when it is enabled, but it still fails.)

kubeProxyReplacement: true
k8sServiceHost: kube1-cp.cookes.io
k8sServicePort: 6443
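
For reference, a minimal sketch of the Helm install, assuming the official Cilium chart repo and that the values above are saved to a file named values.yaml (the file name is just for illustration):

# Add the Cilium chart repo and install with the values shown above.
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
  --namespace kube-system \
  --values values.yaml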

Cilium Version

cilium-cli: v0.16.15 compiled with go1.22.5 on linux/amd64
cilium image (default): v1.16.0
cilium image (stable): v1.16.1
cilium image (running): 1.17.0-dev

Kernel Version

Linux kube1-cp-01 6.8.0-44-generic #44-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 13 13:35:26 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

1.31

Regression

No response

Sysdump

cilium-sysdump-20240916-225601.zip

Relevant log output

E0916 22:50:41.209302       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, oidc: verify token: failed to verify signature: fetching keys context deadline exceeded]"
E0916 22:50:41.209302       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, oidc: verify token: failed to verify signature: fetching keys context deadline exceeded]"
E0916 22:51:18.002679       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, oidc: verify token: failed to verify signature: fetching keys oidc: get keys failed Get \"https://login.microsoftonline.com/426bbc16-09cf-4a19-80c2-d5b19c6c4b72/discovery/v2.0/keys\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)]"

Anything else?

When using Calico, everything works as expected; I tried it to make sure nothing was wrong with my network or cluster configuration.

youngnick commented 2 months ago

Thanks for logging this issue, @EdwardCooke. Can you tell us more about how you're doing the OIDC integration? What tool are you using?

EdwardCooke commented 2 months ago

I'm using the declarative way of doing authentication with a kubeadm cluster; the provider is Azure. What do you mean by what tool am I using? It's a fresh kubeadm cluster with a single control-plane node. Add Cilium, then add additional control-plane nodes. The additional ones fail to authenticate using OIDC with the error logged above. If you restart the Kubernetes API server on the first control plane, it then starts failing as well.
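
(For the "restart the API server" step, a sketch of one way to do it on a kubeadm control-plane node, assuming a containerd or CRI-O runtime with crictl installed; the kubelet recreates the static pod automatically:)

# Stop the kube-apiserver container; the kubelet restarts it from the static pod manifest.
sudo crictl ps --name kube-apiserver -q | xargs -r sudo crictl stop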

EdwardCooke commented 2 months ago

I’ll post the auth file when I get a chance.

youngnick commented 1 month ago

Presumably kubeadm is configuring the apiserver flags for you per https://kubernetes.io/docs/reference/access-authn-authz/authentication/ ? (I wasn't aware this was even a thing until just now).

I suspect this scenario isn't well tested (or at all) by Cilium's CI, so it makes sense it might not be working.

Is there any chance you can generate a sysdump using https://docs.cilium.io/en/stable/operations/troubleshooting/#automatic-log-state-collection ? Ideally, you could use an admin credential to generate that sysdump and add it to this issue, then someone will have the information they need to take a look.

Otherwise, I'd encourage you to check the packet flow from the apiserver to the OIDC provider. Are there any network policies that may be capturing that traffic? Does the apiserver have access to the greater internet (since it will need it to connect directly to the OIDC provider)?
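
For reference, a minimal sketch of both suggestions, assuming cilium-cli is installed with an admin kubeconfig and that you have shell access to a control-plane node (the tenant URL is taken from the logs above):

# Generate a sysdump (writes cilium-sysdump-<timestamp>.zip in the current directory):
cilium sysdump

# From a control-plane node (the apiserver uses host networking), check that the
# OIDC JWKS endpoint is reachable at all:
curl -sS -o /dev/null -w '%{http_code}\n' \
  https://login.microsoftonline.com/426bbc16-09cf-4a19-80c2-d5b19c6c4b72/discovery/v2.0/keys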

EdwardCooke commented 1 month ago

There isn't anything in the way of the traffic. Up until installing Cilium it works fine. It's only after installing Cilium that OIDC stops working. I'll generate that sysdump shortly.

EdwardCooke commented 1 month ago

🔍 Collecting sysdump with cilium-cli version: v0.16.18, args: [sysdump]
🔮 Detected Cilium installation in namespace: "kube-system"
🔮 Detected Cilium operator in namespace: "kube-system"
ℹ️ Using default Cilium Helm release name: "cilium"
ℹ️ Failed to detect Cilium SPIRE installation - using Cilium namespace as Cilium SPIRE namespace: "kube-system"
🔍 Collecting Kubernetes nodes
🔮 Detected Cilium features: map[bpf-lb-external-clusterip:Disabled cidr-match-nodes:Disabled clustermesh-enable-endpoint-sync:Disabled cni-chaining:Disabled:none enable-bgp-control-plane:Disabled enable-envoy-config:Disabled enable-gateway-api:Disabled enable-ipsec:Disabled enable-ipv4-egress-gateway:Disabled enable-local-redirect-policy:Disabled endpoint-routes:Disabled ingress-controller:Disabled ipam:Disabled:cluster-pool ipv4:Enabled ipv6:Disabled mutual-auth-spiffe:Disabled wireguard-encapsulate:Disabled]
🔍 Collecting tracing data from Cilium pods
🔍 Collect Kubernetes nodes
🔍 Collecting Kubernetes events
🔍 Collect Kubernetes version
🔍 Collecting Kubernetes pods
🔍 Collecting Kubernetes namespaces
🔍 Collecting Kubernetes services
🔍 Collecting Kubernetes pods summary
🔍 Collecting Kubernetes endpoints
🔍 Collecting Kubernetes network policies
🔍 Collecting Kubernetes metrics
🔍 Collecting Kubernetes leases
🔍 Collecting Cilium cluster-wide network policies
🔍 Collecting Cilium network policies
🔍 Collecting Cilium Egress Gateway policies
🔍 Collecting Cilium egress NAT policies
🔍 Collecting Cilium local redirect policies
🔍 Collecting Cilium CIDR Groups
🔍 Collecting Cilium endpoint slices
🔍 Collecting Cilium endpoints
🔍 Collecting Cilium nodes
🔍 Collecting Cilium identities
🔍 Collecting Ingresses
🔍 Collecting Cilium Node Configs
🔍 Collecting Cilium BGP Peering Policies
🔍 Collecting IngressClasses
🔍 Collecting Cilium Pod IP Pools
🔍 Collecting Cilium LoadBalancer IP Pools
🔍 Checking if cilium-etcd-secrets exists in kube-system namespace
🔍 Collecting the Cilium configuration
🔍 Collecting the Hubble Relay configuration
🔍 Collecting the Cilium daemonset(s)
🔍 Collecting the Hubble daemonset
🔍 Collecting the Hubble Relay deployment
🔍 Collecting the Hubble UI deployment
🔍 Collecting the Cilium Envoy configuration
🔍 Collecting the Cilium Node Init daemonset
🔍 Collecting the Cilium Envoy daemonset
🔍 Collecting the Hubble generate certs cronjob
W0930 10:42:23.850699 13700 warnings.go:70] cilium.io/v2alpha1 CiliumNodeConfig will be deprecated in cilium v1.16; use cilium.io/v2 CiliumNodeConfig
🔍 Collecting the Hubble cert-manager certificates
🔍 Collecting the Hubble generate certs pod logs
🔍 Collecting the Cilium operator metrics
🔍 Collecting the Cilium operator deployment
🔍 Collecting the clustermesh debug information, metrics and gops stats
⚠️ cronjob "hubble-generate-certs" not found in namespace "kube-system" - this is expected if auto TLS is not enabled or if not using hubble.auto.tls.method=cronjob
⚠️ Deployment "hubble-ui" not found in namespace "kube-system" - this is expected if Hubble UI is not enabled
⚠️ Deployment "hubble-relay" not found in namespace "kube-system" - this is expected if Hubble is not enabled
🔍 Collecting gops stats from Hubble Relay pods
🔍 Collecting profiling data from Cilium pods
🔍 Collecting logs from Cilium pods
⚠️ Daemonset "cilium-node-init" not found in namespace "kube-system" - this is expected if Node Init DaemonSet is not enabled
🔍 Collecting the 'clustermesh-apiserver' deployment
🔍 Collecting the CNI configuration files from Cilium pods
🔍 Collecting the CNI configmap
🔍 Collecting gops stats from Cilium pods
🔍 Collecting gops stats from Cilium-operator pods
🔍 Collecting gops stats from Hubble pods
🔍 Collecting bugtool output from Cilium pods
🔍 Collecting logs from Cilium Envoy pods
🔍 Collecting logs from Cilium Node Init pods
🔍 Collecting logs from Cilium operator pods
🔍 Collecting logs from 'clustermesh-apiserver' pods
⚠️ Deployment "clustermesh-apiserver" not found in namespace "kube-system" - this is expected if 'clustermesh-apiserver' isn't enabled
🔍 Collecting logs from Hubble pods
🔍 Collecting logs from Hubble Relay pods
Secret "cilium-etcd-secrets" not found in namespace "kube-system" - this is expected when using the CRD KVStore
🔍 Collecting logs from Hubble UI pods
I0930 10:42:25.001594 13700 request.go:697] Waited for 1.16111413s due to client-side throttling, not priority and fairness, request: GET:https://kube1-cp.cookes.io:6443/api/v1/namespaces/kube-system/configmaps/hubble-relay-config
🔍 Collecting platform-specific data
🔍 Collecting kvstore data
🔍 Collecting Cilium external workloads
🔍 Collecting Hubble flows from Cilium pods
🔍 Collecting logs from Tetragon pods
🔍 Collecting logs from Tetragon operator pods
🔍 Collecting bugtool output from Tetragon pods
🔍 Collecting Tetragon configmap
🔍 Collecting Tetragon PodInfo custom resources
🔍 Collecting Tetragon tracing policies
🔍 Collecting Tetragon namespaced tracing policies
🔍 Collecting Helm metadata from the release
🔍 Collecting Helm values from the release
I0930 10:43:00.672027 13700 request.go:697] Waited for 1.002747065s due to client-side throttling, not priority and fairness, request: GET:https://kube1-cp.cookes.io:6443/api/v1/namespaces/kube-system/pods/cilium-tlslg/log?container=config&limitBytes=1073741824&sinceTime=2023-10-01T10%3A42%3A23Z&timestamps=true
⚠️ The following tasks failed, the sysdump may be incomplete:
⚠️ [13] Collecting Cilium egress NAT policies: failed to collect Cilium egress NAT policies: the server could not find the requested resource
⚠️ [14] Collecting Cilium Egress Gateway policies: failed to collect Cilium Egress Gateway policies: the server could not find the requested resource (get ciliumegressgatewaypolicies.cilium.io)
⚠️ [16] Collecting Cilium local redirect policies: failed to collect Cilium local redirect policies: the server could not find the requested resource (get ciliumlocalredirectpolicies.cilium.io)
⚠️ [18] Collecting Cilium endpoint slices: failed to collect Cilium endpoint slices: the server could not find the requested resource (get ciliumendpointslices.cilium.io)
⚠️ [24] Collecting Cilium BGP Peering Policies: failed to collect Cilium BGP Peering policies: the server could not find the requested resource (get ciliumbgppeeringpolicies.cilium.io)
⚠️ [34] Collecting the Hubble Relay configuration: failed to collect the Hubble Relay configuration: configmaps "hubble-relay-config" not found
⚠️ [39] Collecting the Hubble cert-manager certificates: failed to collect certificates (v1): the server could not find the requested resource
⚠️ [68] Collecting Tetragon PodInfo custom resources: failed to collect podinfo (v1alpha1): the server could not find the requested resource
⚠️ [69] Collecting Tetragon tracing policies: failed to collect tracingpolicies (v1alpha1): the server could not find the requested resource
⚠️ [70] Collecting Tetragon namespaced tracing policies: failed to collect tracingpoliciesnamespaced (v1alpha1): the server could not find the requested resource
⚠️ Please note that depending on your Cilium version and installation options, this may be expected
🗳 Compiling sysdump
✅ The sysdump has been saved to cilium-sysdump-20240930-104223.zip

cilium-sysdump-20240930-104223.zip

EdwardCooke commented 1 month ago

To do it with kubeadm, you specify the additional API server arguments. I'm using the declarative method, where you put it in a config file and reference that; here's what I use.

Kubeadm cluster config

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "kube1-cp.cookes.io:6443"
controllerManager:
  extraArgs:
    node-cidr-mask-size: "24"
    # CIS 1.3.1
    terminated-pod-gc-threshold: "10"
    # TODO: Remove this so it goes back to the 1 year default, this is to test cert rotation/expiration
    cluster-signing-duration: "0h10m0s"
    # CIS 1.3.2
    profiling: "FALSE"
    # STIG V-242378
    tls-min-version: VersionTLS13
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "10.244.0.0/16"
  dnsDomain: "cluster.local"
apiServer:
  extraArgs:
    # CIS 1.2.16
    # STIG V-242465
    # STIG V-242402
    audit-log-path: /var/log/apiserver/audit.log
    # CIS 1.2.17
    # STIG V-242464
    audit-log-maxage: "30"
    # CIS 1.2.18
    # STIG V-242463
    audit-log-maxbackup: "10"
    # CIS 1.2.19
    # STIG V-242462
    audit-log-maxsize: "100"
    # CIS 1.2.29
    tls-cipher-suites: "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"
    # CIS 3.2.1
    # STIG V-242461
    audit-policy-file: "/etc/kubernetes/config/audit-policy.yaml"
    # CIS 1.2.6, 1.2.7, 1.2.8
    # STIG V-242382
    authorization-mode: Node,RBAC
    # CIS 3.1.1, 3.1.2, 3.1.3
    authentication-config: "/etc/kubernetes/config/kube-api-authn.yaml"
    # CIS 1.2.11
    enable-admission-plugins: AlwaysPullImages,NodeRestriction,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,DefaultIngressClass,DefaultStorageClass,DefaultTolerationSeconds,LimitRanger,MutatingAdmissionWebhook,NamespaceLifecycle,PersistentVolumeClaimResize,PodSecurity,Priority,ResourceQuota,RuntimeClass,ServiceAccount,StorageObjectInUseProtection,TaintNodesByCondition,ValidatingAdmissionPolicy,ValidatingAdmissionWebhook
    # CIS 1.2.27
    encryption-provider-config: "/etc/kubernetes/config/encryption.yaml"
    encryption-provider-config-automatic-reload: "true"
    # CIS 1.2.5
    kubelet-certificate-authority: "/etc/kubernetes/pki/ca.crt"
    # CIS 1.2.15
    profiling: "FALSE"
    # STIG V-254800
    admission-control-config-file: "/etc/kubernetes/config/admission-configuration.yaml"
    # STIG V-242378
    tls-min-version: VersionTLS13
    service-account-issuer: "https://kube1.cookes.io"
  certSANs:
  - "kube1-cp.cookes.io"
  extraVolumes:
    - name: auth
      hostPath: "/etc/kubernetes/config/kube-api-authn.yaml"
      mountPath: "/etc/kubernetes/config/kube-api-authn.yaml"
      readOnly: true
      pathType: File
    - name: encryption-config
      hostPath: "/etc/kubernetes/config/encryption.yaml"
      mountPath: "/etc/kubernetes/config/encryption.yaml"
      readOnly: true
      pathType: File
    - name: audit-policy
      hostPath: "/etc/kubernetes/config/audit-policy.yaml"
      mountPath: "/etc/kubernetes/config/audit-policy.yaml"
      readOnly: true
      pathType: File
    - name: audit-log
      hostPath: /var/log/apiserver
      mountPath: /var/log/apiserver
      readOnly: false
      pathType: DirectoryOrCreate
    - name: admission-configuration
      hostPath: "/etc/kubernetes/config/admission-configuration.yaml"
      mountPath: "/etc/kubernetes/config/admission-configuration.yaml"
      readOnly: true
      pathType: File
  timeoutForControlPlane: 4m0s
scheduler:
  extraArgs:
    authentication-tolerate-lookup-failure: "false"
    # CIS 1.4.1
    profiling: "FALSE"
    # STIG V-242377
    tls-min-version: VersionTLS13
clusterName: "kube1"
etcd:
  local:
    extraArgs:
      # STIG V-242380
      peer-auto-tls: "false"
      # STIG V-242379
      auto-tls: "false"

Init config:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: "external"
    node-ip: "10.2.0.20"
patches:
  directory: /etc/kubernetes/config/patches

And authn config:

apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://login.microsoftonline.com/426bbc16-09cf-4a19-80c2-d5b19c6c4b72/v2.0
    audiences:
    - f6a6e027-18d6-431f-a310-5a1b9b09942d
  claimMappings:
    # username represents an option for the username attribute.
    # This is the only required attribute.
    username:
      claim: "upn"
      prefix: "oidc:"
    groups:
      claim: "roles"
      prefix: "oidc:"
      # Mutually exclusive with groups.claim and groups.prefix.
      # expression is a CEL expression that evaluates to a string or a list of strings.
      # expression: 'claims.roles.split(",")'
    # uid represents an option for the uid attribute.
    uid:
      claim: 'oid'
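
(For completeness, a sketch of how the control plane is brought up with these files, assuming the ClusterConfiguration and InitConfiguration above are concatenated into a single kubeadm config file; the path below is illustrative, and the referenced auth, audit, encryption, and admission files must already exist on the node:)

sudo kubeadm init --config /etc/kubernetes/config/kubeadm.yaml --upload-certs
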
EdwardCooke commented 1 week ago

I finally figured out what happened, and it was a mistake on my part. My network is on the 10.0.0.0/8 subnet, which is Cilium's default cluster-pool pod CIDR. I was under the impression that Cilium would use the cluster pod CIDRs assigned when I did the kubeadm init, which it did not. As soon as I changed my Cilium values to the ones below and rebuilt the cluster (it's fresh and empty), everything worked fine.

ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
    - 10.96.0.0/16
    clusterPoolIPv4MaskSize: 24
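
To confirm the overlap (or that it is gone), one way to compare what Cilium's cluster-pool IPAM is handing out against the kubeadm-assigned node pod CIDRs, assuming the default cilium-config ConfigMap name in kube-system:

# Effective cluster-pool pod CIDR and mask size used by Cilium:
kubectl -n kube-system get configmap cilium-config -o yaml | grep -i cluster-pool

# Pod CIDRs kubeadm assigned to each node (not used by cluster-pool IPAM):
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR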