aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

Add support for userns in ecs-agent and TaskDefinition #774

Open fischaz opened 4 years ago

fischaz commented 4 years ago

Summary

We want to enable userns-remap in our Docker setup on ECS for security. The datadog agent container requires '--userns=host' when the daemon runs in that mode, which TaskDefinition does not currently support.

Description

UserNS remap is documented in https://success.docker.com/article/introduction-to-user-namespaces-in-docker-engine

Activating the mode is easy and it generally works. But if an ECS service (like the datadog agent) requires --pid=host (to monitor all processes on the EC2 instance) while userns-remap is enabled, the container must also run with --userns=host. Otherwise, Docker fails to start the container with the following error:

level=error msg="Handler for POST /v1.21/containers/create?name=ecs-ecs-int-datadog-agent-70-datadogagent-94c1bb86b1898abfdd01 returned error: cannot share the host's network namespace when user namespaces are enabled"

docker run supports the --userns flag, as documented at https://docs.docker.com/engine/reference/commandline/run/

The flag was mentioned in aws/amazon-ecs-agent#502 but never implemented (probably due to a lack of requests).
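
For illustration, here is a minimal Python sketch of the docker run invocation that a task definition cannot currently express. The function name and the image reference are placeholders made up for this example; only the --pid=host and --userns=host flags come from the Docker CLI itself.

```python
import shlex


def agent_run_command(image, userns_host=True):
    """Build a `docker run` argument list for a host-monitoring agent.

    Hypothetical helper for illustration only; it is not part of ECS,
    the ecs-agent, or the datadog agent.
    """
    # --pid=host lets the agent see every process on the EC2 instance.
    args = ["docker", "run", "--detach", "--pid=host"]
    if userns_host:
        # Opt this one container out of the daemon-wide userns-remap mapping.
        args.append("--userns=host")
    args.append(image)
    return args


# "datadog/agent:latest" is a placeholder image reference (assumption).
print(shlex.join(agent_run_command("datadog/agent:latest")))
```

The request in this issue amounts to having a TaskDefinition setting that makes the ecs-agent pass the equivalent of that extra flag when it creates the container.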

Environment Details

Supporting Log Snippets

Docker daemon logs showing the failures with the datadog agent:

2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.369910602Z" level=info msg="Starting up"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.370051577Z" level=info msg="User namespaces: ID ranges will be mapped to subuid/subgid ranges of: dockremap:dockremap"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.372721442Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.373519096Z" level=info msg="User namespaces: ID ranges will be mapped to subuid/subgid ranges of: dockremap:dockremap"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.376523753Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.376538082Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.376554880Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.376562869Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.377827080Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.377843346Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.377857174Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.377871244Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.407009277Z" level=info msg="Loading containers: start."
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.502508969Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.543803565Z" level=info msg="Loading containers: done."
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.562232469Z" level=info msg="Docker daemon" commit=369ce74a3c graphdriver(s)=overlay2 version=19.03.6
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.566166969Z" level=info msg="Daemon has completed initialization"
2020-02-26T21:18:59Z 030_docker time="2020-02-26T21:18:59.617651310Z" level=info msg="API listen on /var/run/docker.sock"
2020-02-26T21:19:19Z 030_docker time="2020-02-26T21:19:19.620968921Z" level=error msg="Handler for POST /v1.21/containers/create?name=ecs-ecs-int-datadog-agent-70-datadogagent-94d889ac99de97ffc701 returned error: cannot share the host's network namespace when user namespaces are enabled"
2020-02-26T21:20:03Z 030_docker time="2020-02-26T21:20:03.136315849Z" level=error msg="Handler for POST /v1.21/containers/create?name=ecs-ecs-int-datadog-agent-70-datadogagent-94f5c09fdca3e5c0f501 returned error: cannot share the host's network namespace when user namespaces are enabled"
2020-02-26T21:20:42Z 030_docker time="2020-02-26T21:20:42.967553698Z" level=error msg="Handler for POST /v1.21/containers/create?name=ecs-ecs-int-datadog-agent-70-datadogagent-94f1ac98a89395cf3000 returned error: cannot share the host's network namespace when user namespaces are enabled"

fischaz commented 4 years ago

Hello again,

I'm not sure this is the right place to raise it, but as I dig deeper into this: AWS Batch is affected by a different (but similar) issue.

I've managed to configure a Batch AMI with the userns-remap setting on the Docker daemon and submitted a Batch job.

The instance starts, but it fails to run the Docker container, with the following error in the Docker daemon logs:

2020-02-27T04:11:49Z 030_docker time="2020-02-27T04:11:49.111185541Z" level=error msg="Handler for POST /v1.19/containers/create?name=ecs-BatchJobDefinition-ecs-batch-job-rdsrestorer-13-default-82d8aac7eddac2dd8401 returned error: cannot share the host's network namespace when user namespaces are enabled"

I've looked at my Batch job definition:

{
    "jobDefinitionName": "BatchJobDefinition-ecs-batch-job-job1",
    "jobDefinitionArn": "arn:aws:batch:ap-southeast-2:123456789012:job-definition/BatchJobDefinition-ecs-batch-job-job1:70",
    "revision": 70,
    "status": "ACTIVE",
    "type": "container",
    "parameters": {},
    "retryStrategy": {
        "attempts": 1
    },
    "containerProperties": {
        "image": "123456789012.dkr.ecr.ap-southeast-2.amazonaws.com/image:1.2.0",
        "vcpus": 1,
        "memory": 128,
        "command": [],
        "jobRoleArn": "arn:aws:iam::123456789012:role/ecs-batch-job-ContainerRole",
        "volumes": [],
        "environment": [
            {
                "name": "NO_PROXY",
                "value": "169.254.169.254,169.254.170.2"
            },
            {
                "name": "HTTPS_PROXY",
                "value": "webproxy:3128"
            },
            {
                "name": "HTTP_PROXY",
                "value": "webproxy:3128"
            }
        ],
        "mountPoints": [],
        "readonlyRootFilesystem": true,
        "privileged": false,
        "ulimits": [],
        "resourceRequirements": []
    }
}

I don't think I set any network settings here, so I assume the AWS Batch service is the one setting the network mode to host when it creates the ECS TaskDefinition to run in the Batch-managed on-demand ECS cluster.

Overall, host network mode has the same problem as host PID mode: the Docker daemon refuses to create the container unless --userns is also set to host.

https://docs.docker.com/engine/security/userns-remap/#user-namespace-known-limitations

In the case of Batch, host networking is more or less the only mode.
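
To make the limitation concrete, here is a rough Python sketch of the rule as I read it from the known-limitations page linked above. It is not the daemon's actual validation code, and `ContainerSpec` and `userns_remap_conflicts` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ContainerSpec:
    """Hypothetical subset of container settings relevant to userns-remap."""
    pid_mode: str = ""
    network_mode: str = "bridge"
    userns_mode: str = ""
    privileged: bool = False


def userns_remap_conflicts(spec):
    """List the settings a userns-remap daemon would be expected to reject.

    Assumption: mirrors the documented limitations (host PID or network
    namespace, or privileged mode, without --userns=host).
    """
    if spec.userns_mode == "host":
        # The container opted out of remapping, so nothing conflicts.
        return []
    conflicts = []
    if spec.pid_mode == "host":
        conflicts.append("--pid=host")
    if spec.network_mode == "host":
        conflicts.append("--network=host")
    if spec.privileged:
        conflicts.append("--privileged")
    return conflicts


# A Batch-style container: host networking, no userns override.
batch_job = ContainerSpec(network_mode="host")
print(userns_remap_conflicts(batch_job))
```

With `userns_mode="host"` set on the same spec the list comes back empty, which is the escape hatch neither TaskDefinition nor Batch exposes today.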

I guess if an AWS customer fully manages the ECS cluster (for Batch and non-Batch jobs) and configures the EC2 instances to use userns-remap, it would make sense for Batch to be aware of that and pass the userns=host setting to the ECS cluster as well.

In my case, though, I'll simply never enable userns-remap on the Batch EC2 instances. There is no point if Batch always uses the network=host flag and would therefore always need the userns=host flag as well: why enable the remap if nothing is ever remapped?

I thought I'd just share some feedback.

jwfh commented 4 years ago

This would be a very useful feature for a number of applications that need access to root-owned host level files and UNIX sockets for monitoring and security.

@coultn, I'm hoping this issue will get some attention and make it onto the container-roadmap.

gitfool commented 3 years ago

I'm also hitting this issue. There really needs to be an escape hatch for tasks run as specialized daemon sets! 🙏

lukehalley commented 2 years ago

Any update on the above?