hashicorp / consul-k8s

First-class support for Consul Service Mesh on Kubernetes
https://www.consul.io/docs/k8s
Mozilla Public License 2.0
/consul/data/node-id: permission denied on minikube #2475

Open arcenik opened 1 year ago

arcenik commented 1 year ago

Overview of the Issue

The chart does not work on a VM-based Minikube cluster (tested with both the VirtualBox and KVM drivers).

Reproduction Steps

Start the minikube cluster with

minikube start \
--driver=kvm2 \
--nodes=3 \
--cpus=2 \
--memory=4Gib

Install the chart with these values

---
global:
  logJSON: false
  name: lab
  datacenter: minikube
  metrics:
    enabled: true

server:
  storage: 500Mi

snapshotAgent:
  enabled: true

auditLogs:
  enabled: true

Logs

Logs of server pod-0

$ k -n consul logs pod/lab-server-0
==> failed to setup node ID: failed to write NodeID to disk: open /consul/data/node-id: permission denied

Expected behavior

Consul should start.

Environment details

helm list -n consul
NAME    NAMESPACE   REVISION    UPDATED                                     STATUS      CHART           APP VERSION
consul  consul      1           2023-06-28 11:16:40.662846406 +0200 CEST    deployed    consul-1.1.2    1.15.3     

Additional Context

k8s version

clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-03-15T13:33:12Z"
  compiler: gc
  gitCommit: 9e644106593f3f4aa98f8a84b23db5fa378900bd
  gitTreeState: clean
  gitVersion: v1.26.3
  goVersion: go1.19.7
  major: "1"
  minor: "26"
  platform: linux/amd64
arcenik commented 1 year ago

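The problem is that /consul/data is owned by root:root with mode drwxr-xr-x, so the consul user (uid=100, gid=1000) that the server runs as cannot write to it: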

2023-06-28T14:28:21.706517247Z + id
2023-06-28T14:28:21.706829486Z uid=100(consul) gid=1000(consul) groups=1000(consul)
2023-06-28T14:28:21.706868412Z + ls -l /consul
2023-06-28T14:28:21.707640875Z 
2023-06-28T14:28:21.707642284Z /consul:
2023-06-28T14:28:21.707643698Z total 12
2023-06-28T14:28:21.707645116Z drwxrwsrwx    3 root     consul        4096 Jun 28 14:27 config
2023-06-28T14:28:21.707646616Z drwxr-xr-x    2 root     root          4096 Jun 28 14:27 data
2023-06-28T14:28:21.707648150Z drwxrwsrwx    2 root     consul        4096 Jun 28 14:28 extra-config

2023-06-28T14:37:18.474949103Z + df -h /consul/data
2023-06-28T14:37:18.476382185Z Filesystem                Size      Used Available Use% Mounted on
2023-06-28T14:37:18.476392607Z /dev/vda1                17.0G      1.4G     14.6G   9% /consul/data
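
To make the reasoning behind the listing above explicit, here is a minimal sketch of the classic Unix permission check applied to the values shown (mode, owner, and group of each directory against uid=100, gid=1000). The `can_write` helper is hypothetical and written only for illustration. It ignores the root override, supplementary groups, ACLs, and capabilities, so it is a simplification and not what the kernel does in full.

```shell
#!/bin/sh
# can_write MODE OWNER_UID OWNER_GID USER_UID USER_GID
# Hypothetical helper: prints "yes" if the plain permission bits grant
# write access to the given user, "no" otherwise.
can_write() {
  mode=$1 owner_uid=$2 owner_gid=$3 uid=$4 gid=$5
  # Interpret MODE as octal and drop the setuid/setgid/sticky bits.
  perms=$(( 0$mode & 0777 ))
  if [ "$uid" -eq "$owner_uid" ]; then
    bits=$(( (perms >> 6) & 7 ))   # owner triplet applies
  elif [ "$gid" -eq "$owner_gid" ]; then
    bits=$(( (perms >> 3) & 7 ))   # group triplet applies
  else
    bits=$(( perms & 7 ))          # "other" triplet applies
  fi
  # Bit value 2 within a triplet is the write bit.
  if [ $(( bits & 2 )) -ne 0 ]; then echo yes; else echo no; fi
}

# /consul/data: drwxr-xr-x root:root, process runs as 100:1000
can_write 755 0 0 100 1000        # prints "no"
# /consul/config: drwxrwsrwx root:consul (gid 1000)
can_write 2777 0 1000 100 1000    # prints "yes"
# /consul/data after "chown -R 100:1000"
can_write 755 100 1000 100 1000   # prints "yes"
```

The first call matches the permission denied error in the server log. The third shows why changing the owner to 100:1000 would be enough without altering the mode.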

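A possible fix is to add an init container to charts/consul/templates/server-statefulset.yaml that changes the owner of the data directory before the server starts: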

      initContainers:
        - name: fix-consul-data-owner
          image: busybox
          securityContext:
            runAsNonRoot: false
            runAsUser: 0
          command:
            - "/bin/sh"
            - "-cex"
            - "chown -R 100:1000 /consul/data"
          volumeMounts:
            - name: data-{{ .Release.Namespace | trunc 58 | trimSuffix "-" }}
              mountPath: /consul/data
holstvoogd commented 2 months ago

I ran into this issue while trying to test things with vault.

The change in the PR fixed it, except for a small syntax issue in the values.yml (endif). I made a comment on the PR 👍