docker / docker-bench-security

The Docker Bench for Security is a script that checks for dozens of common best-practices around deploying Docker containers in production.

Version 3 restart policy #355

Open webchi opened 5 years ago

webchi commented 5 years ago

In version 3, Compose has a different restart configuration:

version: "3"
services:
  redis:
    image: redis:alpine
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s

And this configuration is not reflected in docker inspect:

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}:RestartPolicyName={{ .HostConfig.RestartPolicy.Name }} MaximumRetryCount={{ .HostConfig.RestartPolicy.MaximumRetryCount }}'

34f93a53b8c:RestartPolicyName= MaximumRetryCount=0
d1bce816b2cj:RestartPolicyName= MaximumRetryCount=0
410e4cbfa0h:RestartPolicyName= MaximumRetryCount=0
91a65a03d2:RestartPolicyName= MaximumRetryCount=0

So the 5.14 test fails =\

konstruktoid commented 5 years ago

Hi @webchi, this seems related to #319; we can't check whether a container was started with docker-compose, or with which options :/
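(A partial workaround, sketched here rather than taken from the thread: docker-compose labels the containers it creates with com.docker.compose.* labels, so compose-managed containers can at least be detected, even though the v3 deploy options themselves never reach the container.)

# Lists containers carrying docker-compose's project label.
docker ps --all --quiet --filter "label=com.docker.compose.project"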

shawngmc commented 5 years ago

@konstruktoid The information is available, at least when the docker host is in swarm mode (I don't have a non-swarm host handy to check); a combined sketch follows the steps below.

  1. Use "docker stack ls" to get all stacks.
  2. For each stack, use "docker stack services STACKNAME --quiet" to get all services.
  3. For each service, use "docker service ps -q SERVICEID" to get the task IDs.
  4. For each task, use "docker inspect --format '{{.NodeID}} {{.Status.ContainerStatus.ContainerID}}' TASKID" to get the full container ID.
  5. Correlate the container ID with the container you are checking. Either waive the test, or check the restart policy for the service via "docker service inspect SERVICEID --format '{{.Spec.TaskTemplate.RestartPolicy}}'"
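
Chained together, the steps might look like this (a sketch, not from the thread; it assumes it runs on a swarm manager node):

# Walk stacks -> services -> tasks, printing each service's restart
# policy next to the node and container ID of every task.
for stack in $(docker stack ls --format '{{.Name}}'); do
  for svc in $(docker stack services "$stack" --quiet); do
    echo "service $svc: $(docker service inspect "$svc" --format '{{.Spec.TaskTemplate.RestartPolicy}}')"
    for task in $(docker service ps -q "$svc"); do
      docker inspect --format 'task {{.ID}} node={{.NodeID}} container={{.Status.ContainerStatus.ContainerID}}' "$task"
    done
  done
done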

I suspect this may even let you check all containers in the swarm - even on other hosts - but I'm not 100% sure on that.

konstruktoid commented 5 years ago

Long time no see @shawngmc, time for an interesting twist. Examples are based on the redis compose file above.

docker service inspect $(docker service ls -q) --format '{{.Spec.TaskTemplate.RestartPolicy}}'

will return {on-failure 5s 0xc0002e93c0 2m0s} and that's great.
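The 0xc0002e93c0 in that output is a Go pointer (MaxAttempts is a pointer field in the underlying struct). Asking the template for JSON instead should give readable values; a hedged example, with the output shape assumed:

docker service inspect $(docker service ls -q) --format '{{json .Spec.TaskTemplate.RestartPolicy}}'
# expected shape: {"Condition":"on-failure","Delay":5000000000,"MaxAttempts":3,"Window":120000000000}
# (Delay and Window are in nanoseconds)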

Same thing for each task:

for r in $(docker service ls -q); do docker inspect --format '{{.ID}} {{.Spec.RestartPolicy}}' $(docker service ps -q "$r"); done
nr7xltm2v0orqfcewha99uqo9 {on-failure 5s 0xc0005d1380 2m0s}
ki7vm7cg1qy1y5zgw475p6y14 {on-failure 5s 0xc0003b2158 2m0s}
q9gxe7gvicz4pnef61lmdcte1 {on-failure 5s 0xc0005d1998 2m0s}
jnnjman1czt7gcq08buogijsy {on-failure 5s 0xc0005d1e48 2m0s}

The issue is that the containers no longer show up with the docker ps command.

$ docker ps -qa | wc -l
0
$ docker service ls -q | wc -l
1

They do, however, show up if you use a basic docker-compose.yml:

version: '3'
services:
  redis:
    image: "konstruktoid/nginx"
    container_name: compose_test
    restart: on-failure:5

$ docker inspect --format '{{ .Name }} {{ .HostConfig.RestartPolicy.Name }}:{{ .HostConfig.RestartPolicy.MaximumRetryCount }}' $(docker ps -q)
/compose_test on-failure:5

docker-bench-security output:

[WARN] 5.10  - Ensure that the memory usage for containers is limited
[WARN]      * Container running without memory restrictions: compose_test
[WARN] 5.11  - Ensure CPU priority is set appropriately on the container
[WARN]      * Container running without CPU restrictions: compose_test
[WARN] 5.12  - Ensure that the container's root filesystem is mounted as read only
[WARN]      * Container running with root FS mounted R/W: compose_test
[PASS] 5.13  - Ensure that incoming container traffic is bound to a specific host interface
[PASS] 5.14  - Ensure that the 'on-failure' container restart policy is set to '5'
[PASS] 5.15  - Ensure the host's process namespace is not shared
[PASS] 5.16  - Ensure the host's IPC namespace is not shared
[PASS] 5.17  - Ensure that host devices are not directly exposed to containers
shawngmc commented 5 years ago

I suspect this is a side effect of swarm orchestration and how it's deployed. Are you testing this on a single node docker swarm?

On a single-node swarm, I brought up the offending file with docker-compose up. The deploy key is ignored: even though the engine is in a single-node swarm, the service was deployed directly to this node, and it shows up in docker ps.

root@maersk:~/swarm_tests/deploykey# docker ps | grep redis
950418c5f3f3        redis:alpine        "docker-entrypoint.s…"   54 seconds ago      Up 36 seconds       6379/tcp            deploykey_redis_1

If I use docker stack deploy on a single node swarm, the container still shows up, and the deploy key is obeyed:

root@maersk:~/swarm_tests/deploykey# docker ps | grep redis
dc86eff6dbd8        redis:alpine        "docker-entrypoint.s…"   7 seconds ago       Up 6 seconds        6379/tcp            up_redis.1.xqmdzddrdcmnojcob47ke0hoh

However, if this is not a single-node docker swarm, the situation is different. If the swarm has multiple nodes, swarm will publish the service's ports on every node but will schedule it as the deploy key recommends. The default is one copy of the container with no node preference or replication. If the container goes down, or the node hosting it goes down, it'll be started on another node if possible.
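To see where the tasks actually landed, docker service ps takes a --format option with .Name, .Node, and .CurrentState placeholders (a hedged example against the up_redis service above; output shape assumed):

docker service ps up_redis --format '{{.Name}} {{.Node}} {{.CurrentState}}'
# e.g. up_redis.1 maersk Running 2 minutes ago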

konstruktoid commented 5 years ago

Yeah, but another issue arises when you have to use docker service: none of the containers are shown, so basically every test and container function has to be rewritten.

shawngmc commented 5 years ago

Agreed, that's a bit annoying. I think there are a couple ways to tackle this.

1) Focus on making sure that the tests work for all containers on the current docker host, whether it's a normal host, a single-node swarm, or a multi-node swarm, then simply document that it needs to be run on every host in the swarm. You could even provide a docker compose file to help with that via the global deploy mode:

version: "3.7"
services:
  scanner:
    image: dockersamples/examplevotingapp_worker
    deploy:
      mode: global
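
Deployed with docker stack deploy, the global mode schedules one scanner task on every node (the stack name here is just an example):

docker stack deploy -c docker-compose.yml benchscan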

2) Focus on revamping container discovery in docker-bench-security.sh, then farm out per-container tasks on other nodes via a container deployed with env vars to control the behavior.

I personally like option 1, because a multi-node cluster owner should be running docker bench on every node anyway.

konstruktoid commented 5 years ago

I guess we have to check if there's a service running and then flag the various tests, just like we do in the swarm section, but then we might miss local containers.
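
A minimal sketch of that branch (not from the repo; docker info exposes the swarm state via .Swarm.LocalNodeState):

# Branch on the swarm state, but keep enumerating plain containers
# either way so local ones aren't missed.
if [ "$(docker info --format '{{ .Swarm.LocalNodeState }}')" = "active" ]; then
  services=$(docker service ls -q)   # swarm-managed workloads (manager nodes only)
fi
containers=$(docker ps -q)           # plain containers on this host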