enix / x509-certificate-exporter

A Prometheus exporter to monitor x509 certificates expiration in Kubernetes clusters or standalone
MIT License

Permission denied when trying to monitor etcd certificates #135

Closed 311Ech0 closed 1 year ago

311Ech0 commented 1 year ago

Hello!

I ran into a problem. Kubernetes was installed via kubespray and consists of three nodes, with etcd running on the master. The etcd certificates are located in /etc/ssl/etcd/ssl/. When I specify this path in watchDirectories: or watchFiles: (with file names, of course), I see errors in the DaemonSet logs:

time="2023-05-03T10:11:28Z" level=info msg="starting x509-certificate-exporter version 3.6.0 (0dd5e2318ae07e33778a72f53cd6e6ac8dbe0931) (2022-10-27T14:48:05)"
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-c5e9b51d88044edc7de3c1b540990d4fbcdf87ef/etc/kubernetes/ssl/ca.crt\""
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-b13de134696fab137f243493eae5493902d9c057/etc/kubernetes/ssl/apiserver.crt\""
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-303016caf2c2dcf143ae81f3bdfff1a4416e971f/etc/kubernetes/ssl/front-proxy-ca.crt\""
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-10eec3d89e58905ed9bee454023b567740b8c61d/etc/kubernetes/ssl/front-proxy-client.crt\""
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-9dff78b77fe2f4b0f34f77796df3fea5983e5d4b/var/lib/kubelet/pki/kubelet-client-current.pem\""
time="2023-05-03T10:11:28Z" level=info msg="1 valid certificate(s) found in \"/mnt/watch/file-88342fb5a8e3b25b77edf9314d09a8071f49ec12/etc/kubernetes/ssl/apiserver-kubelet-client.crt\""
time="2023-05-03T10:11:28Z" level=warning msg="failed to parse \"/mnt/watch/file-2c2c29989a0af2cb490cf35eeead9cf737d85f62/etc/ssl/etcd/ssl/ca.pem\", open /mnt/watch/file-2c2c29989a0af2cb490cf35eeead9cf737d85f62/etc/ssl/etcd/ssl/ca.pem: permission denied"
time="2023-05-03T10:11:28Z" level=warning msg="failed to parse \"/mnt/watch/file-44755413db0f5e28668b702a04eff611bdd662ef/etc/ssl/etcd/ssl/node-dc3tlabkubm01.pem\", open /mnt/watch/file-44755413db0f5e28668b702a04eff611bdd662ef/etc/ssl/etcd/ssl/node-dc3tlabkubm01.pem: permission denied"
time="2023-05-03T10:11:28Z" level=warning msg="failed to parse \"/mnt/watch/file-b8ae9555d965a32a4d807c3fa6e7158946987b37/etc/ssl/etcd/ssl/node-dc3tlabkubm01-key.pem\", open /mnt/watch/file-b8ae9555d965a32a4d807c3fa6e7158946987b37/etc/ssl/etcd/ssl/node-dc3tlabkubm01-key.pem: permission denied"
time="2023-05-03T10:11:29Z" level=info msg="2 valid certificate(s) found in \"/mnt/watch/kube-35a4fcc6ab089d789fecaeed428cc8392148766a/etc/kubernetes/admin.conf\""
time="2023-05-03T10:11:29Z" level=info msg="2 valid certificate(s) found in \"/mnt/watch/kube-ee9e4e203c758bb95ec439d60c16fb4e8854efaf/etc/kubernetes/scheduler.conf\""
time="2023-05-03T10:11:29Z" level=info msg="2 valid certificate(s) found in \"/mnt/watch/kube-41a6b8b147ef309a70619cafcfeb8961b3c29cae/etc/kubernetes/controller-manager.conf\""
time="2023-05-03T10:11:29Z" level=info msg="parsed 12 certificates (3 read failures)"
time="2023-05-03T10:11:29Z" level=info msg="listening on :9793"
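
To list only the files the exporter could not read, the log above can be filtered. This is a minimal sketch that assumes the log format shown here. The function name failed_paths is made up for illustration, and the /mnt/watch/<volume> prefix is stripped to recover the path as it appears on the host.

```shell
# failed_paths: read exporter log lines on stdin and print the host path of
# every file that failed with "permission denied".
# Hypothetical helper; it assumes the log format shown above.
failed_paths() {
  grep 'permission denied' |
    sed -E 's#.*open (/mnt/watch/[^/]+)?(/[^:]*): permission denied.*#\2#'
}

# Usage (POD is the name of an exporter pod):
#   kubectl logs POD | failed_paths
```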

When I check the file permissions, I see that the files belong to the user etcd:

[root@dc3tlabkubm01 ssl]# ll /etc/ssl/etcd/ssl/
total 48
-rwx------. 1 etcd root 1675 Dec  9 16:56 admin-dc3tlabkubm01-key.pem
-rwx------. 1 etcd root 1383 Dec  9 16:56 admin-dc3tlabkubm01.pem
-rwx------. 1 etcd root 1675 Dec  9 16:55 ca-key.pem
-rwx------. 1 etcd root 1090 Dec  9 16:55 ca.pem
-rwx------. 1 etcd root 1675 Dec  9 16:56 member-dc3tlabkubm01-key.pem
-rwx------. 1 etcd root 1383 Dec  9 16:56 member-dc3tlabkubm01.pem
-rwx------. 1 etcd root 1679 Dec  9 16:56 node-dc3tlabkubm01-key.pem
-rwx------. 1 etcd root 1379 Dec  9 16:56 node-dc3tlabkubm01.pem
-rwx------. 1 etcd root 1675 Dec  9 16:56 node-dc3tlabkubn01-key.pem
-rwx------. 1 etcd root 1379 Dec  9 16:56 node-dc3tlabkubn01.pem
-rwx------. 1 etcd root 1679 Dec  9 16:56 node-dc3tlabkubn02-key.pem
-rwx------. 1 etcd root 1379 Dec  9 16:56 node-dc3tlabkubn02.pem

Could you tell me what causes this error? Thank you very much!

npdgm commented 1 year ago

Hi! By default, containers for the DaemonSet accessing node files run as root with read-only access to mounted files. So the exporter should be able to open files in most situations, assuming you don't have any security hardening on your cluster that could interfere with hostPath mounts, or an admission controller that altered the objects deployed by Helm.

Could you please share a dump of a running Pod producing those errors? kubectl get -o yaml pod PODNAME

Also, I wonder whether the path to /etc/ssl/etcd/ssl/ could contain a symlink or an ACL. I'm not familiar with kubespray. Could you please show the output of these commands on the node?

stat /etc/ssl/etcd
stat /etc/ssl/etcd/ssl
getfacl /etc/ssl/etcd
getfacl /etc/ssl/etcd/ssl

And are you running Docker, containerd, or another CRI?

Cheers

311Ech0 commented 1 year ago

Thanks for such a quick response!

values.yaml for the Helm chart:


---

    imagePullSecrets: []

    image:
      registry: docker.io
      repository: enix/x509-certificate-exporter
      tag: ""
      tagSuffix: ""
      pullPolicy: IfNotPresent

    hostPathsExporter:
      debugMode: true
      restartPolicy: Always
      updateStrategy: {}
      resources:
        limits:
          cpu: 100m
          memory: 40Mi
        requests:
          cpu: 10m
          memory: 20Mi

    # DaemonSet for control-plane
      daemonSets:
        cp:
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/master
            operator: Exists
          watchFiles:
          - /etc/kubernetes/ssl/ca.crt
          - /etc/kubernetes/ssl/apiserver.crt
          - /etc/kubernetes/ssl/front-proxy-ca.crt
          - /etc/kubernetes/ssl/front-proxy-client.crt
          - /var/lib/kubelet/pki/kubelet-client-current.pem
          - /etc/kubernetes/ssl/apiserver-kubelet-client.crt
          - /etc/ssl/etcd/ssl/ca.pem
          - /etc/ssl/etcd/ssl/node-dc3tlabkubm01.pem
          - /etc/ssl/etcd/ssl/node-dc3tlabkubm01-key.pem

          watchDirectories:
          - /etc/kubernetes/ssl/etcd/
          # - /etc/kubernetes/ssl/
          # - /var/lib/kubelet/pki/

          watchKubeconfFiles:
          - /etc/kubernetes/admin.conf
          - /etc/kubernetes/scheduler.conf
          - /etc/kubernetes/controller-manager.conf

    # DaemonSet for worker-nodes
        nodes:
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/ingress
            operator: Exists
          watchFiles:
          - /var/lib/kubelet/pki/kubelet-client-current.pem
          - /var/lib/kubelet/pki/kubelet.crt
          - /etc/kubernetes/ssl/ca.crt

    podListenPort: 9793

    hostNetwork: true

    secretsExporter:
      podAnnotations:
        prometheus.io/port: "9793"
        prometheus.io/scrape: "true"

    service:
      create: false
      port: 9793
      annotations: {}
      extraLabels: {}

    prometheusServiceMonitor:
      create: false
      scrapeInterval: 60s
      scrapeTimeout: 30s
      extraLabels: {}
      relabelings: []

    prometheusPodMonitor:
      create: false
      scrapeInterval: 60s
      scrapeTimeout: 30s
      extraLabels: {}
      relabelings: []

    prometheusRules:
      create: false
      alertOnReadErrors: true
      readErrorsSeverity: warning
      alertOnCertificateErrors: true
      certificateErrorsSeverity: warning
      certificateRenewalsSeverity: warning
      certificateExpirationsSeverity: critical
      warningDaysLeft: 30
      criticalDaysLeft: 14
      extraLabels: {}
      alertExtraLabels: {}
      alertExtraAnnotations: {}
      rulePrefix: ""
      disableBuiltinAlertGroup: false
      extraAlertGroups: []

    kubeVersion: ""

$ kubectl get po -o yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-05-03T11:09:05Z"
  generateName: x509-certificate-exporter-cp-
  labels:
    app.kubernetes.io/instance: x509-certificate-exporter
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: x509-certificate-exporter
    app.kubernetes.io/version: 3.6.0
    controller-revision-hash: 5cdd77df67
    helm.sh/chart: x509-certificate-exporter-3.6.0
    pod-template-generation: "45"
  name: x509-certificate-exporter-cp-mrckz
  namespace: cert-monitoring
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: x509-certificate-exporter-cp
    uid: 3a905c2f-9342-426c-ba6e-5cc6bdfd7022
  resourceVersion: "22711520"
  uid: d857dc4d-ad46-4fac-9ef1-b73fd36c9375
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - dc3tlabkubm01
  containers:
  - args:
    - --debug
    - --listen-address=:9793
    - --trim-path-components=3
    - --watch-dir=/mnt/watch/dir-84e1bb30db42c24ff12c6bcb2b6ce37093874467//etc/kubernetes/ssl/etcd
    - --watch-file=/mnt/watch/file-c5e9b51d88044edc7de3c1b540990d4fbcdf87ef//etc/kubernetes/ssl/ca.crt
    - --watch-file=/mnt/watch/file-b13de134696fab137f243493eae5493902d9c057//etc/kubernetes/ssl/apiserver.crt
    - --watch-file=/mnt/watch/file-303016caf2c2dcf143ae81f3bdfff1a4416e971f//etc/kubernetes/ssl/front-proxy-ca.crt
    - --watch-file=/mnt/watch/file-10eec3d89e58905ed9bee454023b567740b8c61d//etc/kubernetes/ssl/front-proxy-client.crt
    - --watch-file=/mnt/watch/file-9dff78b77fe2f4b0f34f77796df3fea5983e5d4b//var/lib/kubelet/pki/kubelet-client-current.pem
    - --watch-file=/mnt/watch/file-88342fb5a8e3b25b77edf9314d09a8071f49ec12//etc/kubernetes/ssl/apiserver-kubelet-client.crt
    - --watch-file=/mnt/watch/file-2c2c29989a0af2cb490cf35eeead9cf737d85f62//etc/ssl/etcd/ssl/ca.pem
    - --watch-file=/mnt/watch/file-44755413db0f5e28668b702a04eff611bdd662ef//etc/ssl/etcd/ssl/node-dc3tlabkubm01.pem
    - --watch-file=/mnt/watch/file-b8ae9555d965a32a4d807c3fa6e7158946987b37//etc/ssl/etcd/ssl/node-dc3tlabkubm01-key.pem
    - --watch-kubeconf=/mnt/watch/kube-35a4fcc6ab089d789fecaeed428cc8392148766a//etc/kubernetes/admin.conf
    - --watch-kubeconf=/mnt/watch/kube-ee9e4e203c758bb95ec439d60c16fb4e8854efaf//etc/kubernetes/scheduler.conf
    - --watch-kubeconf=/mnt/watch/kube-41a6b8b147ef309a70619cafcfeb8961b3c29cae//etc/kubernetes/controller-manager.conf
    - --max-cache-duration=300s
    image: docker.io/enix/x509-certificate-exporter:3.6.0
    imagePullPolicy: IfNotPresent
    name: x509-certificate-exporter
    ports:
    - containerPort: 9793
      hostPort: 9793
      name: metrics
      protocol: TCP
    resources:
      limits:
        cpu: 100m
        memory: 40Mi
      requests:
        cpu: 10m
        memory: 20Mi
    securityContext:
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
      runAsGroup: 0
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /mnt/watch/dir-84e1bb30db42c24ff12c6bcb2b6ce37093874467//etc/kubernetes/ssl/etcd
      name: dir-84e1bb30db42c24ff12c6bcb2b6ce37093874467
      readOnly: true
    - mountPath: /mnt/watch/file-c5e9b51d88044edc7de3c1b540990d4fbcdf87ef//etc/kubernetes/ssl
      name: file-c5e9b51d88044edc7de3c1b540990d4fbcdf87ef
      readOnly: true
    - mountPath: /mnt/watch/file-b13de134696fab137f243493eae5493902d9c057//etc/kubernetes/ssl
      name: file-b13de134696fab137f243493eae5493902d9c057
      readOnly: true
    - mountPath: /mnt/watch/file-303016caf2c2dcf143ae81f3bdfff1a4416e971f//etc/kubernetes/ssl
      name: file-303016caf2c2dcf143ae81f3bdfff1a4416e971f
      readOnly: true
    - mountPath: /mnt/watch/file-10eec3d89e58905ed9bee454023b567740b8c61d//etc/kubernetes/ssl
      name: file-10eec3d89e58905ed9bee454023b567740b8c61d
      readOnly: true
    - mountPath: /mnt/watch/file-9dff78b77fe2f4b0f34f77796df3fea5983e5d4b//var/lib/kubelet/pki
      name: file-9dff78b77fe2f4b0f34f77796df3fea5983e5d4b
      readOnly: true
    - mountPath: /mnt/watch/file-88342fb5a8e3b25b77edf9314d09a8071f49ec12//etc/kubernetes/ssl
      name: file-88342fb5a8e3b25b77edf9314d09a8071f49ec12
      readOnly: true
    - mountPath: /mnt/watch/file-2c2c29989a0af2cb490cf35eeead9cf737d85f62//etc/ssl/etcd/ssl
      name: file-2c2c29989a0af2cb490cf35eeead9cf737d85f62
      readOnly: true
    - mountPath: /mnt/watch/file-44755413db0f5e28668b702a04eff611bdd662ef//etc/ssl/etcd/ssl
      name: file-44755413db0f5e28668b702a04eff611bdd662ef
      readOnly: true
    - mountPath: /mnt/watch/file-b8ae9555d965a32a4d807c3fa6e7158946987b37//etc/ssl/etcd/ssl
      name: file-b8ae9555d965a32a4d807c3fa6e7158946987b37
      readOnly: true
    - mountPath: /mnt/watch/kube-35a4fcc6ab089d789fecaeed428cc8392148766a//etc/kubernetes
      name: kube-35a4fcc6ab089d789fecaeed428cc8392148766a
      readOnly: true
    - mountPath: /mnt/watch/kube-ee9e4e203c758bb95ec439d60c16fb4e8854efaf//etc/kubernetes
      name: kube-ee9e4e203c758bb95ec439d60c16fb4e8854efaf
      readOnly: true
    - mountPath: /mnt/watch/kube-41a6b8b147ef309a70619cafcfeb8961b3c29cae//etc/kubernetes
      name: kube-41a6b8b147ef309a70619cafcfeb8961b3c29cae
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-gt9n5
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: dc3tlabkubm01
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: x509-certificate-exporter-node
  serviceAccountName: x509-certificate-exporter-node
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/pid-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl/etcd
      type: Directory
    name: dir-84e1bb30db42c24ff12c6bcb2b6ce37093874467
  - hostPath:
      path: /etc/kubernetes/ssl
      type: Directory
    name: file-c5e9b51d88044edc7de3c1b540990d4fbcdf87ef
  - hostPath:
      path: /etc/kubernetes/ssl
      type: Directory
    name: file-b13de134696fab137f243493eae5493902d9c057
  - hostPath:
      path: /etc/kubernetes/ssl
      type: Directory
    name: file-303016caf2c2dcf143ae81f3bdfff1a4416e971f
  - hostPath:
      path: /etc/kubernetes/ssl
      type: Directory
    name: file-10eec3d89e58905ed9bee454023b567740b8c61d
  - hostPath:
      path: /var/lib/kubelet/pki
      type: Directory
    name: file-9dff78b77fe2f4b0f34f77796df3fea5983e5d4b
  - hostPath:
      path: /etc/kubernetes/ssl
      type: Directory
    name: file-88342fb5a8e3b25b77edf9314d09a8071f49ec12
  - hostPath:
      path: /etc/ssl/etcd/ssl
      type: Directory
    name: file-2c2c29989a0af2cb490cf35eeead9cf737d85f62
  - hostPath:
      path: /etc/ssl/etcd/ssl
      type: Directory
    name: file-44755413db0f5e28668b702a04eff611bdd662ef
  - hostPath:
      path: /etc/ssl/etcd/ssl
      type: Directory
    name: file-b8ae9555d965a32a4d807c3fa6e7158946987b37
  - hostPath:
      path: /etc/kubernetes
      type: Directory
    name: kube-35a4fcc6ab089d789fecaeed428cc8392148766a
  - hostPath:
      path: /etc/kubernetes
      type: Directory
    name: kube-ee9e4e203c758bb95ec439d60c16fb4e8854efaf
  - hostPath:
      path: /etc/kubernetes
      type: Directory
    name: kube-41a6b8b147ef309a70619cafcfeb8961b3c29cae
  - name: kube-api-access-gt9n5
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T11:09:05Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T11:09:08Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T11:09:08Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T11:09:05Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://8847f8fc9d9e81acc264fce1c9cff2e814d4a8de7591bfa18c5575864153d4a7
    image: docker.io/enix/x509-certificate-exporter:3.6.0
    imageID: docker.io/enix/x509-certificate-exporter@sha256:9b101ec107c86ccf54924cb3862f245e3ffbcea6ebfed34d7d547f3d6e2fd4fa
    lastState: {}
    name: x509-certificate-exporter
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-03T11:09:07Z"
  hostIP: 10.226.68.157
  phase: Running
  podIP: 10.226.68.157
  podIPs:
  - ip: 10.226.68.157
  qosClass: Burstable
  startTime: "2023-05-03T11:09:05Z"

and

[root@dc3tlabkubm01 ~]# stat /etc/ssl/etcd
  File: ‘/etc/ssl/etcd’
  Size: 37              Blocks: 0          IO Block: 4096   directory
Device: fd00h/64768d    Inode: 9702150     Links: 3
Access: (0700/drwx------)  Uid: (  997/    etcd)   Gid: (    0/    root)
Context: unconfined_u:object_r:cert_t:s0
Access: 2023-05-03 14:25:02.141242800 +0600
Modify: 2022-12-09 16:55:29.830361600 +0600
Change: 2022-12-09 16:55:29.830361600 +0600
 Birth: -
[root@dc3tlabkubm01 ~]# stat /etc/ssl/etcd/ssl
  File: ‘/etc/ssl/etcd/ssl’
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: fd00h/64768d    Inode: 17679792    Links: 2
Access: (0700/drwx------)  Uid: (  997/    etcd)   Gid: (    0/    root)
Context: unconfined_u:object_r:cert_t:s0
Access: 2023-05-03 15:44:01.179242800 +0600
Modify: 2023-05-03 15:43:59.828242800 +0600
Change: 2023-05-03 15:43:59.828242800 +0600
 Birth: -
[root@dc3tlabkubm01 ~]# getfacl /etc/ssl/etcd
getfacl: Removing leading '/' from absolute path names
# file: etc/ssl/etcd
# owner: etcd
# group: root
user::rwx
group::---
other::---

[root@dc3tlabkubm01 ~]# getfacl /etc/ssl/etcd/ssl
getfacl: Removing leading '/' from absolute path names
# file: etc/ssl/etcd/ssl
# owner: etcd
# group: root
user::rwx
group::---
other::---

npdgm commented 1 year ago

Thanks! Interesting...

Could you also share the output of these commands, please?

# lsb_release -a
# sestatus
# ls -ladZ / /etc /etc/ssl /etc/ssl/etcd
# ls -laZ /etc/ssl/etcd/ssl

311Ech0 commented 1 year ago

OS CentOS 7 (+docker)

[root@dc3tlabkubm01 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      31
[root@dc3tlabkubm01 ~]# ls -ladZ / /etc /etc/ssl /etc/ssl/etcd
dr-xr-xr-x. root root system_u:object_r:root_t:s0      /
drwxr-xr-x. root root system_u:object_r:etc_t:s0       /etc
drwxr-xr-x. root root system_u:object_r:cert_t:s0      /etc/ssl
drwx------. etcd root unconfined_u:object_r:cert_t:s0  /etc/ssl/etcd
[root@dc3tlabkubm01 ~]# ls -laZ /etc/ssl/etcd/ssl
drwx------. etcd root unconfined_u:object_r:cert_t:s0  .
drwx------. etcd root unconfined_u:object_r:cert_t:s0  ..
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 admin-dc3tlabkubm01-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 admin-dc3tlabkubm01.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 ca-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 ca.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 member-dc3tlabkubm01-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 member-dc3tlabkubm01.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubm01-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubm01.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubn01-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubn01.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubn02-key.pem
-rwx------. etcd root unconfined_u:object_r:user_tmp_t:s0 node-dc3tlabkubn02.pem

npdgm commented 1 year ago

I'm pretty sure this has to be an SELinux problem. While SELinux is not incompatible with Kubernetes, it requires extra care. Unsurprisingly, many installation guides for k8s advise disabling it. We would quickly drift off topic by going down the SELinux troubleshooting rabbit hole, but it would be helpful to at least confirm it's the cause of your issue.

Would you be able to temporarily disable SELinux enforcement on one node and see if the certificate errors go away? Running setenforce 0 should be sufficient. Otherwise, you could investigate AVC audit events and look for denied access to /etc/ssl.
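
The AVC check mentioned above could look roughly like this. It is only a sketch: it assumes auditd and the standard SELinux tools (getenforce, ausearch) are installed on the node, and avc_denials_for is a hypothetical helper name, not part of any SELinux tooling.

```shell
# avc_denials_for PATTERN: read audit records on stdin and keep only the
# SELinux AVC denials that mention PATTERN (for example a file name).
# Hypothetical helper, not part of any SELinux tooling.
avc_denials_for() {
  grep -E 'avc: +denied' | grep -F -- "$1"
}

# Usage on the node, as root:
#   getenforce                                   # Enforcing, Permissive or Disabled
#   ausearch -m AVC -ts recent | avc_denials_for ca.pem
```

Records that end with permissive=1 were only logged; in permissive mode SELinux does not actually block the access.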

If confirmed, then I guess there are going to be three options:

311Ech0 commented 1 year ago

Thanks a lot for your help. Adding securityContext.privileged did the trick.

# DaemonSet for control-plane
  daemonSets:
    cp:
      securityContext:
        privileged: true
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      watchFiles:
      - /etc/kubernetes/ssl/ca.crt
      - /etc/kubernetes/ssl/apiserver.crt
      - /etc/kubernetes/ssl/front-proxy-ca.crt
      - /etc/kubernetes/ssl/front-proxy-client.crt
      - /var/lib/kubelet/pki/kubelet-client-current.pem
      - /etc/kubernetes/ssl/apiserver-kubelet-client.crt
      - /etc/ssl/etcd/ssl/ca.pem
      - /etc/ssl/etcd/ssl/admin-dc3tlabkubm01.pem
      - /etc/ssl/etcd/ssl/node-dc3tlabkubm01.pem
      - /etc/ssl/etcd/ssl/node-dc3tlabkubn02.pem
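
A possible explanation for why this works, not confirmed in this thread: the pod dump above shows the container running as UID 0 with all capabilities dropped. On Linux, root without CAP_DAC_OVERRIDE or CAP_DAC_READ_SEARCH is subject to the ordinary mode bits, so 0700 files and directories owned by etcd stay unreadable. privileged: true typically grants the container the full capability set and also relaxes SELinux confinement, so the thread does not establish which of the two mattered. The sketch below checks a process's effective capability mask for those two bits. has_dac_bypass is a hypothetical helper name.

```shell
# has_dac_bypass CAPEFF_HEX
# Succeeds when the hex capability mask includes CAP_DAC_OVERRIDE (bit 1)
# or CAP_DAC_READ_SEARCH (bit 2), i.e. when the process can ignore file
# mode bits for reading. Hypothetical helper for illustration only.
has_dac_bypass() {
  mask=$((0x$1))
  [ $(( mask & 0x6 )) -ne 0 ]
}

# Usage on the node (PID is the process ID of the exporter):
#   has_dac_bypass "$(awk '/^CapEff:/ {print $2}' /proc/PID/status)" \
#     && echo "can bypass mode bits" || echo "bound by mode bits"
```

If the capabilities are indeed the cause, granting only the missing capability rather than full privileges might be enough, but that was not tested here.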

311Ech0 commented 1 year ago

I'm closing this as solved.

npdgm commented 1 year ago

Good idea, that also works :) Glad you found a fix.