Alex-Richman opened this issue 4 years ago
Hello!
To run a container image as a non-root user, a customer can do the following:
They can declare the path they want to share as a VOLUME and chown it in their own Dockerfile.
As an example, consider an image that uses node as its base image (i.e. a Node.js environment) and wants /var/log/exported to be owned by the node user and node group. By specifying a VOLUME directive for /var/log/exported, the ownership and permissions set in the image are reflected in the task volumes.
To understand how this can be achieved, let us look at the following Dockerfile.
FROM node:12-slim ## A Node.js base image
RUN mkdir -p /var/log/exported && chown node:node /var/log/exported ## Create the directory and change its ownership from root to node
VOLUME ["/var/log/exported"] ## Specifying a VOLUME directive applies the permissions
Please let us know if it works for you. Thank you
RUN chown node:node /var/log/exported ## Changing permissions from root to node
VOLUME ["/var/log/exported"] ## Specifying a VOLUME directive applies the permission
This worked fine up to Platform Version 1.3.0 but fails with 1.4.0. Now that 1.4.0 will be the new default, we need a new workaround.
I crashed a stack on this issue; something is really needed here. When I hit it, I hadn't chosen version 1.4.0 myself: I left PlatformVersion unset and previously got 1.3.0 by default, but the default has now changed to 1.4.0, which still has this issue.
I ended up using an EFS volume.
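For anyone taking the same route, here is a minimal sketch of the task-level volume configuration, assuming an existing EFS file system (the fs-12345678 ID and the volume name are placeholders):

"volumes": [
    {
        "name": "shared-data",
        "efsVolumeConfiguration": {
            "fileSystemId": "fs-12345678",
            "transitEncryption": "ENABLED"
        }
    }
]

An EFS access point can additionally set a POSIX owner on the volume's root directory, which sidesteps the root-owned mount problem described in this issue.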
Hello,
We have recently updated our documentation with some examples of how to use bind mounts.
If this does not solve your use-case, please reach out to AWS Support.
Thanks Manu
I've looked closely at the Dockerfile my application uses, and compared it to the examples. While there are differences, as far as I can tell they are semantically equivalent, yet do not work with Fargate bind mounts. The process is ultimately unable to interact with the mounted volume.
Has anyone actually been able to make this work?
Hi Matt and Gökhan,
Can you please open a support case, so that we can work with you to understand why this does not work?
Thanks Manu
@manugupt1 I'm afraid not, as technical support isn't covered by my current service level. However, I can share the Dockerfile with you, in case you want to inspect the differences.
Hi Matt, I looked through your Dockerfile and found that the path given to the VOLUME directive is a symlink. Ref: https://github.com/apache/druid/blob/753bce324bdf8c7c5b2b602f89c720749bfa6e22/distribution/docker/Dockerfile#L38
We have updated our documentation to say that the VOLUME directive should map to an absolute path. This can be found at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/bind-mounts.html#bind-mount-considerations
Thanks Manu
@manugupt1 that was really awesome of you. Thanks so much for digging in. I'll bring this up with the dockerfile maintainers.
To me, this is still a very valid request.
We have an ECS service based on 3 public Bitnami containers. We want to exchange volumes between these containers, one for config and one for data. All these containers run as the same non-root user. Yet in Fargate we can't run this setup, because we cannot manipulate the uid/gid of the named volume. In this case, we have no control over the Dockerfiles, so we cannot expose a volume that way.
Hello, Thanks for reaching out. Currently, this can be achieved by using an init container. The task definition requires a non-essential container that runs before all other containers, via the dependsOn clause. This container sets the appropriate permissions for the containers that run under a non-root uid; once it has adjusted the permissions it exits, and the other containers start, allowing you to write into these volumes.
A simple scaffolding of the task def can be as follows:
{
...
"containerDefinitions": [
...
{
"name": "permissions-init",
"image": "busybox:latest", # Can be any image that has chmod / chown
"entryPoint": [
"sh",
"-c"
],
"command": [
"sudo chmod 0777 /example-vol" # one option
"sudo chown 1000:1000 /example-vol" # another option
],
"mountPoints": [
{
"containerPath": "/example-vol",
"sourceVolume": "example-vol"
}
],
"essential": false # Required
},
{
"name": "essential-container",
"image": "busybox:latest",
"dependsOn" : [
"containerName": "permissions-init",
"condition": "SUCCESS" # or COMPLETE, depending on the use-case
],
}
...
],
"volumes": [
{
"name": "example-vol",
}
],
...
}
I slogged through this a bit and ended up with something similar to @manugupt1's example, except converted to pull the image from our internal ECR (which is allowed) instead of Docker Hub (which is firewalled and also not reliable due to rate-limiting), and with logging enabled.
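For reference, a sketch of those two changes applied to the init container; the account ID, region, repository path, and log group below are all placeholders:

{
    "name": "permissions-init",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/mirror/busybox:latest",
    "essential": false,
    "entryPoint": ["sh", "-c"],
    "command": ["chown 1000:1000 /example-vol"],
    "mountPoints": [
        {
            "containerPath": "/example-vol",
            "sourceVolume": "example-vol"
        }
    ],
    "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": "/ecs/permissions-init",
            "awslogs-region": "us-east-1",
            "awslogs-stream-prefix": "init"
        }
    }
}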
Looking at our containers, I think there are two things which would make it a lot easier to enable read-only root filesystems. One would be removing the need for permissions-changing containers by allowing the service to specify those permissions as part of the bind mount configuration:
"volumes": [
{
"sourceVolume": "java-tmp",
"containerPath": "/var/run/search",
"user": "search-service",
"group": "5555",
"mode": "0755"
}
]
The other would be more complicated but it would be really nice if there was a way to specify an overlay mount over an existing mount point. We have a handful of things where something creates new files next to the distributed source at start up (e.g. Python creating pyc / __pycache__ (yes, I know about compileall but don't want to patch hundreds of containers), or a Java program compiling extensions on startup) and it would be handy for those applications if I could mount an overlay on top of them until we can get the developers to redesign them to use separate storage.
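To make the request concrete, this is roughly what such an overlay looks like in plain Linux terms; the paths are hypothetical, and Fargate currently exposes no API for it, which is the point of the ask:

# Writable upper layer and overlayfs work dir, e.g. backed by a task volume
mkdir -p /scratch/upper /scratch/work
# Overlay the otherwise read-only /app: reads fall through to the image's
# files, while new writes (like __pycache__) land in /scratch/upper
mount -t overlay overlay \
    -o lowerdir=/app,upperdir=/scratch/upper,workdir=/scratch/work \
    /app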
I consider this a bug, because it works with ECS/EC2 AND with EFS!
As in, in both ECS/EC2 and with EFS, the mount point is "magically" owned by the non-privileged user the task is started as.
However, in ECS/Fargate, the mount point is owned by root:root, making it useless. ESPECIALLY with the 0755 mode!
EITHER use the containerDefinition.user value for the mount point (won't allow mode to be set, but 2775 is what WE want, which I think is a reasonable [default] value here) OR allow for additional config as exemplified by acdha above. Although, having separate user and group options might be overkill considering that containerDefinition.user already exists. In that example, only mode would be needed. AND possibly additional mount options, perhaps..
Using the VOLUME option when creating the container would require a rebuild of the image (with additional overhead in testing and QA - EVEN THOUGH IT'S ONLY A MINOR CHANGE!!) when switching from EC2 to Fargate - we're in the process of changing ALL (or most?) of our services over to Fargate to try to cut some cost.
I see this issue is "Coming Soon" on the roadmap. Is there any ETA for the fix? Trying to figure out whether we need to identify an alternative solution or wait for the fix.
The workaround isn't too difficult. Set up a VOLUME in the Dockerfile with the path of the directory. Create it, chown it and chmod it first..
You'll get double mounts in the container, but it turns out that the "real" one, the one with the correct user:group ownership and mode is the "last" one.
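A minimal sketch of that workaround, reusing the node base image from earlier in the thread (/var/app/data is a placeholder path):

FROM node:12-slim
# Create the directory with the desired ownership and mode *before*
# declaring it as a volume
RUN mkdir -p /var/app/data \
    && chown node:node /var/app/data \
    && chmod 2775 /var/app/data
# On Fargate 1.4 this yields the double mount described above; the last
# mount wins and keeps the ownership and mode set here
VOLUME ["/var/app/data"]
USER node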
I couldn't wait, so we'll go with the workaround. For now.
Hi, I found a better solution. Once the container runs and fails, check the CloudWatch logs of the execution. You will see what the AWS ECS REST API uses as the group for the mount point; in my case it was 985 or something like that. I added that group in my Dockerfile. After that my user (non-root is required for us) was able to access the mount point.
Did that somehow change the mode of the mount point? It's 0755 on mine (without that group), which means the group doesn't have access to [write to] it anyway.. But does adding the group change the mode as well?
Yes, I got the docker socket mounted and faced the same issue as described here. After adding the group it worked without double mounts.
In my dockerfile:
RUN groupadd -o -g 994 dockersock
RUN usermod -aG dockersock "runner"
"Docker socket mounted".. Is this on ECS/Fargate really? How do you get to the socket, it's serverles architecture..
To be honest it's ECS but not Fargate. On the other hand, just to clarify: "serverless" is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. The Docker mount point was just an example of how you can identify the mount permissions of the ECS container and fix them in your Dockerfile. Have you tried looking in the container logs after adding a command block to list the mounts and grep your mount point to see what permissions it requires?
command = [
"/bin/mount | grep your_mount_point"
]
It is at this point that your container performs the mount, not when you build your Dockerfile :)
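In ECS container-definition terms, that debugging step might look roughly like this (a sketch; the sh -c wrapper is needed because of the pipe):

"entryPoint": ["sh", "-c"],
"command": [
    "mount | grep your_mount_point"
]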
Ok, yeah. That's the thing about this ticket - ECS works correctly, but Fargate does not!
They SHOULD (imo!) work exactly the same, but they don't :(. Please read the WHOLE ticket and you'll see the problem.
The "serverless" is the correct word for this. And Lambda. WE, as users, don't have any access to "servers". That EVERYTHING, including R53 zones, SNS topics, AI etc and everything else, MUST run on a server "somewhere" is besides the point. It's serverless - FOR US!
I'm also tracking this issue, as I've had to resort to workarounds due to its impact. It's surprising to see that such a significant problem still remains unresolved after more than three years.
Hey there, I'm from the Fargate team. Apologies for the confusion on this issue. The below is an example of how to configure bind mounts owned by non-root users on Fargate PV 1.4 without the need for an init container. Please let us know if there are still gaps in what you are trying to achieve when using the below.
$ cat Dockerfile
FROM public.ecr.aws/amazonlinux/amazonlinux:2
RUN yum install -y shadow-utils && yum clean all
RUN useradd node
RUN mkdir -p /var/log/exported && chown node:node /var/log/exported
USER node
RUN touch /var/log/exported/examplefile
VOLUME ["/var/log/exported"]
$ cat taskdef.json
{
"containerDefinitions": [
{
"name": "c1",
"mountPoints": [
{
"containerPath": "/var/log/exported",
"sourceVolume": "myvol"
}
],
"command": [
"sh",
"-c",
"whoami; ls -l /var/log/exported; touch /var/log/exported/c1.txt; for i in 1 2 3 4 5; do ls -l /var/log/exported; sleep 2; done"
],
...
},
{
"name": "c2",
"mountPoints": [
{
"containerPath": "/var/log/exported",
"sourceVolume": "myvol"
}
],
"command": [
"sh",
"-c",
"whoami; ls -l /var/log/exported; touch /var/log/exported/c2.txt; for i in 1 2 3 4 5; do ls -l /var/log/exported; sleep 2; done"
],
...
}
],
"volumes": [
{
"name": "myvol"
}
],
...
}
The logs from container 1
$ whoami
node
$ ls -l /var/log/exported
-rw-r--r-- 1 node node 0 Sep 20 20:06 examplefile
$ touch c1.txt
$ ls -l /var/log/exported
-rw-r--r-- 1 node node 0 Sep 20 21:48 c1.txt
-rw-r--r-- 1 node node 0 Sep 20 20:06 examplefile
# After c2 starts we see the c2.txt file.
$ ls -l /var/log/exported
-rw-r--r-- 1 node node 0 Sep 20 21:48 c1.txt
-rw-r--r-- 1 node node 0 Sep 20 21:48 c2.txt
-rw-r--r-- 1 node node 0 Sep 20 20:06 examplefile
The logs from container 2
$ whoami
node
$ ls -l /var/log/exported
-rw-r--r-- 1 node node 0 Sep 20 20:06 examplefile
-rw-r--r-- 1 node node 0 Sep 20 21:48 c1.txt
$ touch c2.txt
$ ls -l /var/log/exported
-rw-r--r-- 1 node node 0 Sep 20 21:48 c1.txt
-rw-r--r-- 1 node node 0 Sep 20 21:48 c2.txt
-rw-r--r-- 1 node node 0 Sep 20 20:06 examplefile
I just hit an example of that deploying a third-party container on ECS where I don't have an easy way to override their container definition without forking the Dockerfile to add something like your chown step before it changes users. Right now that means I do something like this (from the deployment of the OWASP DependencyTrack app I was just working on):
{
"name": "set-mount-permissions",
"image": "public.ecr.aws/amazonlinux/amazonlinux:2023",
"essential": false,
"mountPoints": [
{
"containerPath": "/data",
"sourceVolume": "dependency-track"
}
],
"command": [
"install -d -o 1000 -g 1000 -m 0775 /data/.dependency-track"
]
},
{
"dependsOn": [
{
"containerName": "set-mount-permissions",
"condition": "SUCCESS"
}
],
"name": "DependencyTrack",
…
It would be convenient if instead I could just do something like this:
"mountPoints": [
{
"containerPath": "/data",
"sourceVolume": "dependency-track",
"readOnly": false,
"permissions": {
"owner": "1000",
"group": "1000",
"mode": "775"
}
}
],
or, even better, have some kind of magic value which could be used to say “the same UID/GID as the container's USER”:
"mountPoints": [
{
"containerPath": "/data",
"sourceVolume": "dependency-track",
"readOnly": false,
"permissions": {
"owner": "USER (or maybe CONTAINER-USER?)",
"group": "USER",
"mode": "775"
}
}
],
It would be great if ECS Fargate made the owner and permissions configurable for bind mount volumes, or at least made the default more useful (e.g. world-writable, similar to Kubernetes emptyDir volumes).
We are using the image volume (Dockerfile VOLUME) workaround to deploy read-only-root-filesystem non-root containers in ECS Fargate 1.4.0 with a writable tmpdir filesystem.
But we'd prefer to keep this configuration in the deployment where it belongs. Baking the image volume into our application images means that we're forced to use that volume if we want to deploy the image in other contexts (e.g. Kubernetes).
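For anyone else using this approach, a minimal sketch (the base image, user, and path are placeholders):

FROM public.ecr.aws/amazonlinux/amazonlinux:2
RUN yum install -y shadow-utils && yum clean all
# Set ownership before declaring the volume so the task volume inherits it
RUN useradd app && mkdir -p /app/tmp && chown app:app /app/tmp
VOLUME ["/app/tmp"]
USER app

With "readonlyRootFilesystem": true in the container definition, /app/tmp remains writable by the app user while the rest of the filesystem stays read-only.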
Problem: Ephemeral storage for Fargate tasks with readonlyRootFilesystem and a non-root user.
On Fargate platform 1.3.0 this was achievable (in an undocumented/unintentional manner) by configuring a docker local volume at the task level and mounting it to /tmp in each service, resulting in a world-writable tmp directory mounted at /tmp/ within the container.
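For reference, a sketch of that 1.3.0-era task-level volume configuration:

"volumes": [
    {
        "name": "tmp",
        "dockerVolumeConfiguration": {
            "scope": "task",
            "driver": "local"
        }
    }
]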
On Fargate platform 1.4.0 docker local volumes are completely unavailable, and the new (officially recommended) way of implementing ephemeral storage for Fargate tasks is using a bind mount [1].
The problem with using a bind mount is that ECS mounts it as writable only by root, so a container running as a non-root user is unable to write any temporary files. Having the container run as root is generally undesirable for security reasons, though practically I expect the impact on ECS is limited since a root-based container escape would just dump an attacker into the ECS host, which is presumably heavily sandboxed.
The ideal solution would be for ECS to support configuring permissions on bind mounts, or better still support tmpfs on Fargate [2][3].
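For context, the EC2 launch type already supports tmpfs via linuxParameters; being able to write something like the following in a Fargate task definition would cover the ephemeral-storage case (size is in MiB; the path, size, and mode are examples):

"linuxParameters": {
    "tmpfs": [
        {
            "containerPath": "/tmp",
            "size": 256,
            "mountOptions": ["mode=1777"]
        }
    ]
}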
[1] https://docs.aws.amazon.com/AmazonECS/latest/developerguide/fargate-task-storage.html
[2] https://github.com/aws/containers-roadmap/issues/736
[3] https://github.com/aws/containers-roadmap/issues/710