Closed Birdrock closed 3 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/175827405
The labels on this github issue will be updated when the story is started.
Hey @Birdrock,
Thanks for the PR! From my point of view, the suggested approach looks good.
As per your doubt - where the resolution of a non-numeric user should take place - I believe that Eirini is the right place to do that. In Diego CF, CC/Diego sends a request to Garden to run a process as user
, where the user can be either user id (a number) or user name. If it is a user name, then Garden resolves the name to user ID and uses that user ID to execute the process. This flow is transparent to CC and Diego and I believe that this should remain like that in the K8S world as well.
As you pointed out, defaulting the user ID to 1000
if the value in the request is not a parseable number is not really robust. In the CF Diego world, Garden would spin up a dummy pea container for the requested docker image and look up the user name in /etc/passwd
and /etc/group
files on the container root file system. I wonder whether we can implement something similar in K8S - maybe spin a dummy pod and peek on its filesystem... what do you think @cloudfoundry-incubator/eirini?
The PR itself has some rough edges like unhandled errors, misleading test names, etc. but as it is a preliminary PR, commenting on those does not really make sense.
@danail-branekov Glad I'm on the right basic approach. Thank you for the comments.
Spinning up a dummy container to inspect /etc/passwd
and /etc/group
sounds interesting. It is a bit out of my depth currently, but I can give a try at it, if y'all can point me in the right direction. Alternatively, I can clean up the branch a bit, and you can implement the dummy container checking.
Agreed on the rough edges of the PR. I wanted to get something in front of you sooner than later. Again, thank you for commenting on this - I'm looking forward to collaborating further on implementing this.
Hey @Birdrock,
Running a dummy container on K8S has certain things to consider, e.g.
/etc
files on a running container requires root privileges. Unfortunately K8S does not support user namespace isolation, i.e. running a process as a container root on K8S means that the process runs as host root, which is a significant security concern, especially when running untrusted code in the docker imageWe have created a spike story in our backlog to look for feasible implementation, hopefully it would give us some ideas how to proceed.
Hey @danail-branekov,
That all makes sense. I was thinking that an init container might be a place to look, but there is still some wiring that would need to be done as a callback to return the mapped user ID.
As far as needing to run as root - that was one of my concerns as well. My understanding was that Kubernetes specifically requires a numeric user ID because it doesn't want to expose itself to this scenario.
Thank you for the story link - I'm excited to see what y'all find. Running arbitrary Docker images in cf-for-k8s
has been something of a sore spot recently due to RunAsUser
being removed from the SecurityContext
. As great as the cf push
source to running app is, there seems to be a significant use case for pushing pre-built Docker images, especially in the Kubernetes ecosystem.
@gcapizzi @danail-branekov
I see that the spike has been completed. I'll look at the findings, but it appears that the cost/benefit ratio is high on inspecting the image file system. Where does that leave the inspection of the image in staging to be used int the SecurityContext
?
I apologize for the radio silence on my side. Between holidays and moving apartments, I've been out of the loop for a bit.
Hey @Birdrock! Here are my notes on the spike story we ran to verify the feasibility of this:
- Any implementation we could come up with would be more complex than anything we have in Eirini right now.
- Any users who own the image they are pushing can change the image to use a numeric user id.
- Users who do not own the image will still have issues with the large majority of Docker images that do not specify any
USER
and thus fallback to0
(root
), which will be denied byrunAsNonRoot
.- Users who do not own the image can still create a new image based on the main one but overriding the
USER
to be numeric.runAsNonRoot
is a workaround for the lack of user namespaces in Kubernetes. Diego/Garden have user namespaces, which means they can safely run images with user0
/root
(as containerroot
is mapped to an unprivileged user on the host) instead of having to reject them.runAsNonRoot
already prevents us from reaching feature parity with Diego/Garden for most Docker images (which run asroot
/0
). Supporting images with non-numeric users would only solve the problem for (we believe) a small number of use cases, at a very high cost.
I'd stress 5 in particular: we'll never be able to behave like Diego/Garden because they have user namespaces and we don't :(
@gcapizzi Thanks for the update.
I have a couple more thoughts on avenues to investigate.
runAsUser
if the desire specifies a UID. There may be cases where we want to specify a specific user. Any configurable logic or defaulting could occur in the docker staging side to keep the StatefulSet
logic clean./etc/passwd
& /etc/groups
. It may be possible to pull the image tgz, explode it, the explode the layer tgz archives specified in the manifest. This may avoid security concerns. Additionally, if this logic is located in the docker staging side, it would be largely decoupled from the rest of the Eirini code.I'll do some exploration and let you know if something looks promising.
Preliminary PR as per discussion with Giuseppe Capizzi - https://cloudfoundry.slack.com/archives/C8RU3BZ26/p1605550105243100?thread_ts=1605540085.240900&cid=C8RU3BZ26
The functionality here is to extract the user from the image and pass it the CC on
docker_staging
. On the desire side, it is included with the payload and used for thePodSecurityContext
as theRunAsUser
.Due to
RunAsNonRoot
, theRunAsUser
requires a numeric UID. In this implementation, there is a naive check to see if the user is parseable, then assigns a default UID if is not. I'm unsure if this functionality belongs inStatefulSet
generation or whether CC should do this conversion. One one hand, the numeric UID requirement is a result of Kubernetes, so it isn't really a CC concern. On the other hand, CC is the hub of operations, so it should maybe understand the requirements of desires to Eirini.