pnovotnak opened this issue 6 years ago
/sig storage
potential fix in https://github.com/kubernetes/kubernetes/pull/57935
hmm, reading further, it looks like this is working as designed for fsGroup, though that design doesn't play nicely with the mode options (the API doc for those mode fields explicitly says you might end up with different options when combined with fsGroup).
different containers can run as different uids, yet still share a single secret volume. the only way for them to all be able to read the data from that volume when running as different uids is to have a common fsGroup set, and have group read permissions applied to the data in that volume. that means expanding the file permissions to include group permissions, even when that was not requested or desired.
the only real way I see to fix this is to change the 1-volume:*-containers relationship and start making per-container (or per-uid) volumes for these types, setting file owner to the resolved uid of the container (which is not always possible to resolve), but that's a major redesign/departure of the volume subsystem
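For illustration, a hedged sketch of the scenario being described (names and uids here are hypothetical): two containers share one secret volume while running as different uids, so only a common `fsGroup` plus group-read bits lets both read the files.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-secret-pod      # hypothetical name
spec:
  securityContext:
    fsGroup: 3000              # common group; files get group-read (0440), wider than 0400
  containers:
  - name: first
    image: busybox:1.30
    command: ["sh", "-c", "ls -lnL /etc/creds && sleep 3600"]
    securityContext:
      runAsUser: 1001          # each container runs as a different uid
    volumeMounts:
    - name: creds
      mountPath: /etc/creds
  - name: second
    image: busybox:1.30
    command: ["sh", "-c", "ls -lnL /etc/creds && sleep 3600"]
    securityContext:
      runAsUser: 1002
    volumeMounts:
    - name: creds
      mountPath: /etc/creds
  volumes:
  - name: creds
    secret:
      secretName: shared-creds # hypothetical secret
```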
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
I’m also running into an issue with ownership / permissions on a Kubernetes secret. I have a container (run by k8s cron, but that’s beside the point) that runs as uid=999 (pod security: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#users-and-groups). I want to use a secret, namely an SSL-certificate private key for client cert auth. Using client certificates requires permissions to be 0600, which means that ownership must be set to 999 as well (since uid=999 can only read a file with permission 0600 if it is the owner). I can’t seem to get it to work. For now I worked around the issue by reading the file and writing the contents to a new file on tmpfs with proper ownership / permissions. But copying the contents of secrets to other locations feels rather contradictory to what we want to achieve (higher security).

Is this the same issue / related? Or should I submit a new one?
This is very similar to an issue we are experiencing post-upgrade to 1.8.11, which changed some of the mount options from RW to RO. We are using a projected volume to group several disparate secrets/configmaps into a logical volume for the underlying container.

We had fiddled with permissions before but couldn't get things "just right", so we have been running a `chown` in our entrypoint and then using `su-exec` to drop the privileges of the process that is actually running. The changes introduced in #58720 made this option no longer viable, as that projected volume is mounted RO now.
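Roughly, the pattern looked like this (a hypothetical sketch; the image name, user, and paths are made up, and it assumes `su-exec` is installed in the image):

```yaml
containers:
- name: app
  image: example/app:latest     # hypothetical image containing su-exec
  command:
  - sh
  - -c
  - |-
    # this worked while the projected volume was writable;
    # it fails now that the projected volume is mounted read-only
    chown -R app:app /projected
    exec su-exec app /usr/local/bin/app
  volumeMounts:
  - name: combined              # the projected volume grouping our secrets/configmaps
    mountPath: /projected
```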
We attempted addressing this by setting:

- `pod.spec.securityContext` attributes, for both `runAsUser` and `fsGroup`
- `pod.spec.containers.securityContext`

Unfortunately we have yet to find a permutation that allows us to do what is probably "the right thing".
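For reference, one of the permutations tried might look roughly like this (a hedged sketch; all values are hypothetical):

```yaml
spec:
  securityContext:              # pod.spec.securityContext
    runAsUser: 1000
    fsGroup: 1000
  containers:
  - name: app
    image: example/app:latest   # hypothetical image
    securityContext:            # pod.spec.containers[*].securityContext
      runAsUser: 1000
```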
Are other people doing essentially the suggestion above and just skirting around the issue by moving files around in their entrypoint?
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale
/remove-lifecycle rotten
@liggitt, @pnovotnak, can you answer @jrnt30's question?

> Are other people doing essentially the suggestion above and just skirting around the issue by moving files around in their entrypoint?
> Are other people doing essentially the suggestion above and just skirting around the issue by moving files around in their entrypoint?
I'll have to defer to @kubernetes/sig-storage-misc for that question
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with `/reopen`.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
/remove-lifecycle rotten
> I’m also running into an issue with ownership / permissions on a Kubernetes secret. I have a container (run by k8s cron, but that’s beside the point) that runs as uid=999 (pod security: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#users-and-groups). I want to use a secret, namely an SSL-certificate private key for client cert auth. Using client certificates requires permissions to be 0600, which means that ownership must be set to 999 as well (since uid=999 can only read a file with permission 0600 if it is the owner). I can’t seem to get it to work. For now I worked around the issue by reading the file and writing the contents to a new file on tmpfs with proper ownership / permissions. But copying the contents of secrets to other locations feels rather contradictory to what we want to achieve (higher security).
>
> Is this the same issue / related? Or should I submit a new one?
Hi @sebasmannem, I have run into the exact same issue, also only in a cron job. Same code works for a regular pod. Did you manage to find a more elegant solution?
@sebasmannem @jrnt30, sorry for the delay. Could you describe the problems in more detail? When you try to use `defaultMode` and/or `fsGroup` in the Pod spec, what is the resulting permission of the files? Thanks!
I have hit this issue too. The following config excerpt results in files with `0440` instead of `0400` as expected:
```yaml
kind: Deployment
...
spec:
  template:
    spec:
      securityContext:
        runAsUser: 472
        fsGroup: 472
      ...
      containers:
      - name: fip
        securityContext:
          runAsUser: 0
        volumeMounts:
        - name: foo
          mountPath: /etc/bip
        ...
      volumes:
      - name: foo
        secret:
          secretName: bar
          defaultMode: 256
```
IMO, `fsGroup` and `defaultMode` conflict in Secrets (and other AtomicWriter volumes). `AtomicWriter.Write` prepares the volume with the right `defaultMode` first, i.e. a file is chmod'ed to `400` (oct), and only after that is `SetVolumeOwnership` called to apply `fsGroup`. `SetVolumeOwnership` then chmods the file to `660` (or `440` for read-only files), ignoring any `defaultMode`.
/reopen
@jsafrane: Reopened this issue.
@jsafrane what do you think the solution should be? Fail if both `fsGroup` and `defaultMode` are set? Or have `defaultMode` override `fsGroup`?
Or do we need to come up with a new API to specify volume ownership that is:
The API documentation of `DefaultMode` fwiw explicitly says it only guarantees file creation with the given `DefaultMode`, and that the specified `DefaultMode` may conflict with `fsGroup`:

```go
// Mode bits to use on created files by default. Must be a value between
// 0 and 0777.
// Directories within the path are not affected by this setting.
// This might be in conflict with other options that affect the file
// mode, like fsGroup, and the result can be other mode bits set.
```

I do not think we should change the meaning of `DefaultMode` just yet.
discussed with @mesaugat, one proposal to fix this is that for volume types like secrets, configMaps etc., we should set up the ownership along with the mode while writing the files (see the code at https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/util/atomic_writer.go#L396), so that there is no need to call the SetVolumeOwnership function after writing the data at all. This way, SetVolumeOwnership cannot overwrite the defaultMode that was set up earlier.
I explored something similar in https://github.com/kubernetes/kubernetes/pull/57935/files and couldn't get it to work with containers running as different uids with an fsGroup
@liggitt thanks for the information. I think we might still honor the fsGroup and defaultMode settings with the following restrictions:

- If a user has different containers that run as different uids, they should not set defaultMode to 0400. This would be considered an invalid setting; could it be validated and disallowed?
We can't currently tighten validation in a way that would invalidate data already allowed to be persisted in etcd. xref https://github.com/kubernetes/kubernetes/issues/64841
Also, we can't detect in the API validation what the natural uid is for containers that didn't specify a uid in the pod spec.
If we change how this is handled, I think it could only be done at runtime.
There should be a way to provide e.g. an ssh key as `0600`.

IMO, if the user specifies `Mode` or `DefaultMode`, we should honor it and not overwrite it when applying `fsGroup`. I know that the `Mode` description mentions "This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.", and I know honoring it may break existing apps, but why would a user write `0600` and expect `0640` instead? To me it looks like a bug in the Pod object.

We can change AtomicWriter to do `chown` and `chmod` using `fsGroup` if `Mode`/`DefaultMode` is not set.
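As a concrete illustration (names and ids here are hypothetical), this spec requests `0600`, but with the current behaviour the group bits from the `fsGroup` pass are added and the file ends up as `0640`, as described above:

```yaml
spec:
  securityContext:
    fsGroup: 2000               # hypothetical gid
  containers:
  - name: app
    image: example/app:latest   # hypothetical image
    volumeMounts:
    - name: ssh-key
      mountPath: /etc/ssh-key
  volumes:
  - name: ssh-key
    secret:
      secretName: my-ssh-key    # hypothetical secret holding id_rsa
      defaultMode: 0600         # requested 0600; observed 0640 once fsGroup is applied
```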
Just noticed another issue https://github.com/kubernetes/kubernetes/issues/34982, maybe also relevant
@jsafrane I've been looking through the code and it appears that all uses of `AtomicWriter.Write` (secrets, configmaps, downward APIs, projections & cephfs) either specify a file mode directly, as with cephfs, or the defaults applied when the user specifies no modes (i.e. staging/src/k8s.io/api/core/v1/types/SecretVolumeSourceDefaultMode) are all `0644`. Also, the behaviour of pkg/volume/SetVolumeOwnership for configmaps, downward APIs & projections is to enforce that all files have at least `0440` permissions.

If we accept that when the user asks for something they should get it, then the behaviour should be:

- If the user specifies `Mode` in the `KeyToPath` struct, then use that.
- Otherwise, if the user specifies `DefaultMode` in the `*VolumeSource` structs, then use that.
- Otherwise, if the user specifies no `DefaultMode`, the value will be automatically set to the default for that source, which is currently `0644` everywhere; `0644` is already wider than what `volume.SetVolumeOwnership` would have enforced.

/cc
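For what it's worth, the `Mode`-over-`DefaultMode` part of that precedence already exists in the API today (subject to the fsGroup widening discussed above); a hypothetical secret volume illustrating it:

```yaml
volumes:
- name: creds
  secret:
    secretName: my-creds       # hypothetical secret with keys id_rsa and known_hosts
    defaultMode: 0644          # applies to keys without an explicit per-item mode
    items:
    - key: id_rsa
      path: id_rsa
      mode: 0600               # per-key Mode (KeyToPath.mode) wins over defaultMode
    - key: known_hosts
      path: known_hosts        # no mode given, falls back to defaultMode
```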
As Istio is moving to SDS by default, which uses the projected volume with the "defaultMode" set, this issue breaks our customers. It would be great to get this resolved.
/cc @incfly
Does somebody track this issue? How can this bug be resolved?
Is there a plan to resolve this ticket? Our organisation is currently using Istio in our Kubernetes cluster. We have cronjobs that are broken because the permissions on our id_rsa file have been changed, and we are no longer able to clone a git repo. As the cronjobs can't run, code changes are blocked from being promoted to our production clusters, which is in turn impacting our customers. I'm sure our company isn't the only consumer of Istio/Kubernetes that is impacted by this problem.
Any updates on this? This continues to break users that are attempting to use features like ProjectedVolumeMounts with sidecar containers: https://github.com/istio/istio/issues/26882
I've now run into this issue in 2 different scenarios that are frustrating:
Is there a timeline on this issue being fixed? It appears it has existed without any resolution since January 2018.
bump. I need to mount ssh key pairs onto a pod running as an elevated-privilege user, to use along with a glusterfs mount.
The workaround I used for this was to run an `initContainer` which mounts an SSH private key in a temp directory, then copies it to an `emptyDir` and `chmod`s it appropriately. The container that then needs the private key mounts the same `emptyDir` volume:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-example-deployment
spec:
  template:
    spec:
      initContainers:
      - name: prep-id-rsa
        image: busybox:1.30
        command:
        - sh
        - -c
        - |-
          cp /tmp/id_rsa /well/known/dir/id_rsa
          chmod 0600 /well/known/dir/id_rsa
        volumeMounts:
        - name: id-rsa
          mountPath: /tmp/id_rsa
          subPath: id_rsa
        - name: empty-dir
          mountPath: /well/known/dir
      containers:
      - name: my-container
        image: alpine:3
        command:
        - sh
        - -c
        - |-
          ls -la /tmp/id_rsa
          ls -la /root/.ssh
        volumeMounts:
        - name: id-rsa
          mountPath: /tmp/id_rsa
          subPath: id_rsa
        - name: empty-dir
          mountPath: /root/.ssh
        resources:
          requests:
            cpu: 1m
            memory: 1Mi
          limits:
            cpu: 1m
            memory: 1Mi
      securityContext:
        runAsUser: 1000
      volumes:
      - name: id-rsa
        secret:
          secretName: my-ssh-private-key
      - name: empty-dir
        emptyDir:
          medium: Memory
```
I think the next step in investigating a solution for this issue is to see if we can generalize the work done for projected service account tokens to other volume types.
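For context, that work refers to `serviceAccountToken` sources in projected volumes, which look roughly like this (the audience and path here are hypothetical):

```yaml
volumes:
- name: token
  projected:
    sources:
    - serviceAccountToken:
        path: token
        expirationSeconds: 3600
        audience: example.example.com   # hypothetical audience
```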
/cc
/cc
For anyone who is following this thread: we are planning to resolve this issue in 1.22 release. Here is the enhancement: https://github.com/kubernetes/enhancements/pull/2606. Feel free to take a look if you are interested. Thanks!
I saw that the pull request above is closed rather than merged; does that mean it is not resolved?
/triage accepted
The problem is not that defaultMode is not respected when runAsUser is set. It is not respected when fsGroup is set, which is documented on one hand. On the other hand, it would be nice for the most specific setting to take precedence.

The issue here is that setting fsGroup means all volumes should have this GID and read permissions for the group. When defaultMode is set, this mode is used. When both are set, it is not obvious which should take precedence.

This was documented in the KEP and in the defaultMode field documentation. But I do think the behaviour might be better if the "closest" setting to the volume takes precedence.
This issue has not been updated in over 1 year, and should be re-triaged.

You can:

- confirm that this issue is still relevant with `/triage accepted` (org members only)
- close this issue with `/close`

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted
/triage accepted
The copy-key workaround is fine if it's short-term. We have a situation where a TLS cert and private key are mounted from Secrets, copied to another directory, and the keys are chmod'ed to 400. However, it becomes an issue when the certs are rotated: since the Secrets are not directly mounted, cert rotation requires additional work to copy them over again. Is anyone aware of this? Would like to see the root cause addressed.
**Is this a BUG REPORT or FEATURE REQUEST?:**

/kind bug

**What happened:**

Setting `Pod.spec.securityContext.runAsUser` causes the group read permission bit to be set on secrets exposed via volumes, even if `Pod.spec.volumes[x].defaultMode` is set to `256`.

See also: https://github.com/openshift/origin/issues/16424

**What you expected to happen:**

Given a `defaultMode` of `256` the file mode should be `0400`, but it is `0440` instead.

**How to reproduce it (as minimally and precisely as possible):**

Create the following objects and observe the logs of the created pod:
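The original manifests are not reproduced here; a minimal sketch that should exhibit the behaviour might look like the following (all names are hypothetical; an explicit `fsGroup` is included since, per the discussion above, it is what triggers the widening, and it may be injected by the platform, e.g. an OpenShift SCC, when only `runAsUser` is set):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: repro-secret           # hypothetical name
stringData:
  key: supersecret
---
apiVersion: v1
kind: Pod
metadata:
  name: repro-pod              # hypothetical name
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 1000              # may be injected by the platform rather than set explicitly
  containers:
  - name: main
    image: busybox:1.30
    command: ["sh", "-c", "ls -lnL /etc/secret"]   # pod logs show mode 0440, not 0400
    volumeMounts:
    - name: secret
      mountPath: /etc/secret
  volumes:
  - name: secret
    secret:
      secretName: repro-secret
      defaultMode: 256         # decimal 256 == octal 0400
```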
**Anything else we need to know?:**

**Environment:**

- Kubernetes version (use `kubectl version`):
- Kernel (e.g. `uname -a`):