cockroachdb / helm-charts

Helm charts for cockroachdb
Apache License 2.0
85 stars 148 forks source link

Change copy-certs initContainer as optional in StatefulSet #408

Open alicek106 opened 3 months ago

alicek106 commented 3 months ago

In this PR (also related to this one), an initContainer was added to the StatefulSet to copy mounted certificate secrets to an empty directory with a permission mode of 0400 (statefulset.yaml#L55-L84). However, the PR didn’t include details or references to the relevant Kubernetes bug that made these changes necessary for deploying the Helm charts. It seems that the file mode of mounted secrets wasn’t being set correctly. But in latest Kubernetes versions, at least 1.25+, this issue appears to have been resolved.

Additionally, the changes introduced in PR #62 require a restart of each CockroachDB node to rotate certificates. At my company, Devsisters, when we used the older chart (before PR #62), rotating secrets was much simpler because the Kubelet would automatically rotate the mounted secrets. All that was needed was sending a "SIGHUP" to the process. However, with the latest chart, rotating certificates provided by cert-manager now necessitates restarting each CockroachDB pod, as the secret is no longer directly mounted to the pod.

This is a cumbersome process that likely affects all users, the copy-certs initContainer should either be made optional or removed.

prafull01 commented 1 month ago

When mounting in Kubernetes, the default permission of 256 (which converts to 0400) does not work properly if securityContext.fsGroup is specified. In this case, the fsGroup overrides the default mode, changing the permissions to 0440, whereas the certificate file requires 0400 to function correctly. You can see relevant bug here.

An alternative is to mount each file (ca.crt, node.crt, and node.key) separately using subPath in the volumeMount. However, this approach has the drawback that any updates to the secret will not be reflected in the pod Refer here. Therefore, after certificate rotation, sending a SIGHUP signal will not suffice, and a rolling restart of the CockroachDB pods will be necessary.