docker / docker-bench-security

The Docker Bench for Security is a script that checks for dozens of common best-practices around deploying Docker containers in production.
Apache License 2.0
9.12k stars 1.02k forks source link

[false positive] MaximumRetryCount is not set to 5: #521

Closed andreagalle closed 1 year ago

andreagalle commented 1 year ago

I keep getting the false positive below (in section 5.14 of you report) running the v1.5.0 docker-bench-security.sh script on my env

  [WARN] 5.14 - Ensure that the 'on-failure' container restart policy is set to '5' (Automated)
  [WARN]       * MaximumRetryCount is not set to 5: my-stack_working-test.1.hn66bykaqaxp2r0jkxjkb4lyb

even if the MaximumRetryCount is properly configured (or better. the "MaxAttempts" one)

  [root@172 test]# docker service inspect my-stack_working-test | grep -in "MaxAttempts" -B 5 -A 5 --color
  28-                },
  29-                "Resources": {},
  30-                "RestartPolicy": {
  31-                    "Condition": "on-failure",
  32-                    "Delay": 5000000000,
  33:                    "MaxAttempts": 5
  34-                },
  35-                "Placement": {},
  36-                "Networks": [
  37-                    {
  38-                        "Target": "hwiiab56tunn7gtaiur4tmngk",
  [root@172 test]# docker service inspect my-stack_crashing-test | grep -in "MaxAttempts" -B 5 -A 5 --color
  28-                },
  29-                "Resources": {},
  30-                "RestartPolicy": {
  31-                    "Condition": "on-failure",
  32-                    "Delay": 5000000000,
  33:                    "MaxAttempts": 5
  34-                },
  35-                "Placement": {},
  36-                "Networks": [
  37-                    {
  38-                        "Target": "hwiiab56tunn7gtaiur4tmngk",

and working as expected:

  [root@172 test]# docker service ps my-stack_working-test --no-trunc
  ID                          NAME                      IMAGE               NODE            DESIRED STATE   CURRENT STATE           ERROR     PORTS
  hn66bykaqaxp2r0jkxjkb4lyb   my-stack_working-test.1   work-image:latest   ***.**.***.**   Running         Running 9 minutes ago
  [root@172 test]# docker service ps my-stack_crashing-test --no-trunc
  ID                          NAME                           IMAGE                NODE            DESIRED STATE   CURRENT STATE          ERROR                       PORTS
  xjq2q8czw441u5kfhhwg0eybn   my-stack_crashing-test.1       crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"
  2fq1qcratulyzw64jgxqeauz9    \_ my-stack_crashing-test.1   crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"
  74av33ziti9rz58i5ojwbvqr2    \_ my-stack_crashing-test.1   crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"
  rbzswrlkkepm4g7rmb2hjcliv    \_ my-stack_crashing-test.1   crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"
  x28qnn1t4yi72dcyana2zh6ty    \_ my-stack_crashing-test.1   crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"
  a454ffhssmb2hobamq5bdt33e    \_ my-stack_crashing-test.1   crash-image:latest   ***.**.***.**   Shutdown        Failed 9 minutes ago   "task: non-zero exit (1)"

I made couple tests on the following Docker images (below everything to reproduce this issue):

  1. first service (working one) is built from the Dockerfile below:

    FROM alpine:latest
    
    CMD sh -c "ping localhost"
  2. second service (crashing one) is built from the Dockerfile below:

    FROM alpine:latest
    
    CMD sh -c "exit 1"

The two services are deployed, from the following stack-file.yml

  version: "3.9"

  services:
    working-test:
      image: work-image:latest
      deploy:
        restart_policy:
          max_attempts: 5
          condition: on-failure
    crashing-test:
      image: crash-image:latest
      deploy:
        restart_policy:
          max_attempts: 5
          condition: on-failure

to docker swarm, with this command:

  docker stack deploy --compose-file stack-file.yml my-stack

What's wrong with the above configuration, or rather with the docker-bench-security.sh utility, when it checks for the MaximumRetryCount that seems to me properly configured and working as expected? Maybe this is just an issue of misalignment between these two properties?

I am using Docker version 23.0.1:

  [root@172 docker-bench-security-1.5.0]# docker version
  Client: Docker Engine - Community
   Version:           23.0.1
   API version:       1.42
   Go version:        go1.19.5
   Git commit:        a5ee5b1
   Built:             Thu Feb  9 19:51:00 2023
   OS/Arch:           linux/amd64
   Context:           default

  Server: Docker Engine - Community
   Engine:
    Version:          23.0.1
    API version:      1.42 (minimum version 1.12)
    Go version:       go1.19.5
    Git commit:       bc3805a
    Built:            Thu Feb  9 19:48:42 2023
    OS/Arch:          linux/amd64
    Experimental:     false
   containerd:
    Version:          1.6.18
    GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
   runc:
    Version:          1.1.4
    GitCommit:        v1.1.4-0-g5fd4c4d
   docker-init:
    Version:          0.19.0
    GitCommit:        de40ad0
konstruktoid commented 1 year ago

Hi @andreagalle the test is probably just missing the correct value, since services, stacks and swarm hasn't really been inclued.

will have a look

andreagalle commented 1 year ago

I can see from your sources, the check is made on the containers, where in fact the MaximumRetryCount property is 0 in my case too.

  [root@172 test]# docker inspect --format MaximumRetryCount='{{ .HostConfig.RestartPolicy.MaximumRetryCount }}' 27a2ef894e2b
  MaximumRetryCount=0
  [root@172 test]# docker inspect --format MaximumRetryCount='{{ .HostConfig.RestartPolicy.MaximumRetryCount }}' 49d09a6bd0fd
  MaximumRetryCount=0

Is it more correct to set it for containers too, rather than for services (as I am doing right now) or better for you to check both the container and the corresponding service property, replacing the above lines with the following one?

cpolicy=$(docker inspect --format MaximumRetryCount='{{ .HostConfig.RestartPolicy.MaximumRetryCount }}' "$c")
spolicy=$(docker inspect --format MaxAttempts='{{ .Spec.TaskTemplate.RestartPolicy.MaxAttempts }}' "$s")

if [ "$cpolicy" != "MaximumRetryCount=5" ] || [ "$spolicy" != "MaxAttempts"=5" ]; then

where $s is the list of services as well as $c is the one for containers.

konstruktoid commented 1 year ago

yeah, the last example is what i was going for

want to write a PR?

konstruktoid commented 1 year ago

but note that you need to exclude any service containers from $c as well

for id in $(docker inspect $(docker ps -qa) --format '{{ .Id }}'); do if [ "$id" = "$(docker inspect $(docker service ps -q issue521_working-test) --format '{{ .Status.ContainerStatus.ContainerID }}')" ]; then echo "the container is actually a service"; fi ; done

andreagalle commented 1 year ago

yeah, the last example is what i was going for

want to write a PR?

Yes, I will!

Chatting with chatGPT it seems there's no workaround to set the MaximumRetryCount at container level from the stack file.

"The MaximumRetryCount property is a property of the RestartPolicy within the HostConfig of a container. You can set this property when starting a container using the --restart flag with the docker run command. For example, to set the MaximumRetryCount to 5 for a container, you would use the command docker run --restart=on-failure:5 [image_name].

However, when using Docker Compose and swarm mode, you cannot directly set the MaximumRetryCount property for individual containers. Instead, you can set the max_attempts property of the restart_policy within the deploy key in your stack-file.yml, which will apply to all containers created for that service .

In Docker Compose and swarm mode, the restart_policy is specified at the service level within the deploy key in the stack-file.yml. This means that the restart policy applies to all containers created for that service. The max_attempts property of the restart_policy specifies the maximum number of times to restart a container before giving up .

The MaximumRetryCount property of the RestartPolicy within the HostConfig of a container is not directly settable in Docker Compose and swarm mode because it is not part of the service-level configuration. Instead, you can use the max_attempts property of the restart_policy to achieve similar functionality."

All this sounds correct to me. Didn't find any docs talking about how to setup this container property from the stack file.

andreagalle commented 1 year ago

@konstruktoid I made it! #522

but note that you need to exclude any service containers from $c as well

for id in $(docker inspect $(docker ps -qa) --format '{{ .Id }}'); do if [ "$id" = "$(docker inspect $(docker service ps -q issue521_working-test) --format '{{ .Status.ContainerStatus.ContainerID }}')" ]; then echo "the container is actually a service"; fi ; done

Thanks for the clarification on checking whether the container was part of a service or not. I followed another path. However, I think it is worth mentioning that (there are a couple lines of comment there) the container Name is not checked against the service Name only, but the task_id is compared between the two, providing for better consistency. I mean:

# a container name could arbitrary include a service one: it belongs to a service (created by Docker 
# as part of the service), if the container task ID matches one of the task IDs of the service.

Maybe the warn message, could then be more precise:

warn "      * MaximumRetryCount is not set to 5: $c"

telling whether it is the container or the service (or both!) not having the property properly set, but then I though it might be a little out of scope. I was wondering if it would be correct to provide such an information on the 5_container_runtime.sh module.