Open michaelbannister opened 4 years ago
/cc @mauilion
/assign
Can you define what "kubernetes" is when not KIND? KIND is also kubernetes :+) It might be containerd vs docker as the node runtime, or something with KIND ...
will investigate O(soon)
In my case I tested this against the Kubernetes installed by Docker Desktop on macOS. I might be able to try it on GKE, will get back to you...
GKE v1.14.8-gke.33 with Docker as the container runtime, when running the job defined in job.yaml:
Error: failed to start container "distroless-permissions-test": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied": unknown
repro job.yaml working in kind locally on my linux workstation.
doing some follow up on block device testing issues plaguing k8s CI (hopefully fixed now, and they don't work at all in other local clusters .. :upside_down_face: ) ref: https://github.com/kubernetes-sigs/kind/issues/1248 then back to this one ...
I see some upstream bugs related to this but they appear to be fixed before 1.15
Sorry, just to be sure: when you say working - you mean it's refusing to start the container in the way I described? Or that it is running the container in kind?
Feels like it could be something to do with Linux capabilities? (which I barely understand TBH)
er it is creating the container and the job exits success with kubectl apply -f https://raw.githubusercontent.com/michaelbannister/distroless-permissions-test/master/job.yaml
OK, but if you run the same job on GKE it will fail to run with the error I've shown. Ditto if you just ask Docker to run it as a different user: docker run --rm -it -u 1337:1337 michaelbannister/distroless-permissions-test.
This discrepancy doesn't look right, which is why I've raised the issue.
However, Kind's behaviour changes if you drop all capabilities, as in job-drop-caps.yaml (https://github.com/michaelbannister/distroless-permissions-test/blob/master/job-drop-caps.yaml): it will not run the container. So I wonder if there is something about capabilities (I was trying to read up on effective, ambient, permitted caps etc but I gave up in confusion).
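To make the capabilities question above more concrete, here is a minimal sketch of the directory search-permission check that chdir() is subject to. It is a simplified model, not kernel code: it ignores ACLs, supplementary groups and LSMs, and the directory mode and owner (0700, UID/GID 65532) are hypothetical values chosen for illustration, not read from the image. The one fact it relies on is that CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH bypass directory execute (search) checks.

```python
import stat

# Capabilities that bypass directory search (execute) permission checks.
BYPASS_CAPS = {"CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH"}


def can_chdir(mode, owner_uid, owner_gid, uid, gid, effective_caps=frozenset()):
    """Return True if a process running as (uid, gid) may chdir into a directory.

    Simplified model: ignores ACLs, supplementary groups, and LSMs.
    """
    if BYPASS_CAPS & set(effective_caps):
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IXUSR)
    if gid == owner_gid:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


# Hypothetical directory: mode 0700, owned by 65532:65532.
# The process runs as 1337:1337, the UID used with `docker run -u` in this thread.
without_caps = can_chdir(0o700, 65532, 65532, 1337, 1337)
with_cap = can_chdir(0o700, 65532, 65532, 1337, 1337, {"CAP_DAC_OVERRIDE"})
print(without_caps, with_cap)  # prints: False True
```

If the non-root process still holds one of these capabilities in its effective set, the chdir succeeds; with all capabilities dropped it is denied, which would be consistent with the job-drop-caps.yaml behaviour described above.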
Right, I'm saying that the bug is confirmed.
Or rather, that it's locally reproducible for me, which should make debugging easier.
On GKE with a COS containerd node pool (1.13.12-gke.25), kubectl apply -f https://raw.githubusercontent.com/michaelbannister/distroless-permissions-test/master/job.yaml also results in the pod completing:
bentheelder@cloudshell:~ (bentheelder-kind-dev)$ kubectl get po
NAME                                READY   STATUS      RESTARTS   AGE
distroless-permissions-test-z87qq   0/1     Completed   0          6s
looked at this with @Random-Liu a bit just now, possibly a containerd bug?
If this becomes a major issue we can work back in dockerd support for the nodes, but I think this is some subtle difference involving either a bug in containerd/containerd-cri or docker/dockershim and we should probably get it fixed upstream.
per @Random-Liu appears to be a difference in default capability list (?)
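One way to compare capability sets between runtimes is to decode the Cap* masks from /proc/<pid>/status inside a container on each cluster. The sketch below is only an illustration of that idea: the helper names are made up for this example, and the sample status text is illustrative rather than captured from kind or GKE.

```python
# Capability names indexed by bit number, as defined in <linux/capability.h>.
# Only the first 32 are listed; newer kernels define a few more.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
]


def decode_caps(mask):
    """Turn a capability bitmask into a list of capability names."""
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


def read_cap_sets(status_text):
    """Parse the CapInh/CapPrm/CapEff/CapBnd/CapAmb lines of a status file."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("Cap"):
            sets[key] = decode_caps(int(value.strip(), 16))
    return sets


# Illustrative input only; on a real node, read /proc/<pid>/status instead.
sample = "Name:\tsh\nCapEff:\t0000000000000006\nCapAmb:\t0000000000000000\n"
for name, caps in read_cap_sets(sample).items():
    print(name, caps)
```

Running the same decoding in a non-root container on both clusters and diffing the effective and ambient sets would show whether the runtimes really hand out different capabilities.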
Fix pending in https://github.com/containerd/cri/pull/1397
We'll pull that into kind once it merges in containerd.
Hi @BenTheElder, when I tried to install my application on a KIND cluster, it didn't come up: it fails with permission denied on the '/opt' directory. The application runs as a non-root user.
I'm not sure whether the issue I'm running into is related to this ticket.
Will your pending fix address this issue? Any suggestions would be appreciated.
Hi, can you tell me more about your application setup?
On most hosts /opt is owned by root. I would not expect a non-root user to be able to write to a system directory like this, in which case the issue is with your application deployment, not kind/containerd/....
For example on my workstation:
$ stat /opt
File: /opt
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fd01h/64769d Inode: 2223873 Links: 8
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-01-06 16:10:50.162169909 -0800
Modify: 2019-04-03 11:03:01.281234979 -0700
Change: 2019-04-03 11:03:01.281234979 -0700
Birth: -
(also please use a new support type ticket for this, thanks!)
upstream patch is LGTM but not merged yet. still monitoring
further discussion notes that the default capabilities have changed over time in container runtimes and may change again. while cri-containerd would prefer to match dockerd and does consider this a bug, explicit capabilities should be preferred when capabilities are needed.
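As a rough illustration of "explicit capabilities", the sketch below builds a Job manifest whose container drops ALL capabilities and adds back only what is listed, so the result does not depend on any runtime's defaults. The helper names are hypothetical, runAsUser 1337 is an assumption borrowed from the docker command earlier in the thread (the real job.yaml may differ), and the output is JSON, which kubectl apply -f accepts alongside YAML.

```python
import json


def explicit_caps_security_context(run_as_user, add=()):
    """Container securityContext that drops every capability, then adds a few.

    Kubernetes capability names omit the CAP_ prefix, e.g. "NET_BIND_SERVICE".
    """
    capabilities = {"drop": ["ALL"]}
    if add:
        capabilities["add"] = list(add)
    return {"runAsUser": run_as_user, "capabilities": capabilities}


def job_manifest(name, image, security_context):
    """Minimal batch/v1 Job wrapping a single container."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "securityContext": security_context,
                        }
                    ],
                }
            }
        },
    }


# Assumed values: the image name comes from this thread, the UID is illustrative.
manifest = job_manifest(
    "distroless-permissions-test",
    "michaelbannister/distroless-permissions-test",
    explicit_caps_security_context(1337),
)
print(json.dumps(manifest, indent=2))
```

With every capability dropped, a workload that needs one has to request it through the add list, which keeps behaviour the same across runtimes even if their defaults change again.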
we already upgrade containerd regularly to the latest patches on the latest release branch, so when this merges there we'll pick it up.
sending another poke there and closing this out to track upstream.
Follow-up: the PR to containerd/cri fell through (the OP moved on to different work). Since containerd/cri has merged into containerd/containerd, I've sent a carry in https://github.com/containerd/containerd/pull/4669.
Once that's in this will actually be fixed.
I tried to carry forward the upstream change in https://github.com/containerd/containerd/pull/4669 but it's stuck and I'm running low on bandwidth to keep after this. I think https://github.com/opencontainers/runc/pull/2712 is relevant and may have fixed the issue, but I've not had time to really look.
What happened: I have an image whose WORKDIR is set to a directory that only one user has permissions for, and I run it in a Pod with securityContext.runAsUser set to a different UID. Kind runs the pod just fine, but Kubernetes fails with an error like
failed to create containerd task: OCI runtime create failed: container_linux.go:346: starting container process caused \"chdir to cwd (\\\"/home/nonroot\\\") set in config.json failed: permission denied\": unknown
Kind only fails in the same way as "normal" Kubernetes if the securityContext is also configured to drop all capabilities.
What you expected to happen: Kind should fail to run the container in the same way as Kubernetes.
How to reproduce it (as minimally and precisely as possible): See https://github.com/michaelbannister/distroless-permissions-test for a worked example.
Anything else we need to know?: This came up while working on this Istio PR: https://github.com/istio/istio/pull/20854
Environment:
However this also occurs on the testing infrastructure for the Istio project. I don't know the details for that other than that it uses Kind.