If the host system is configured to use the systemd DNS stub resolver, then an ECS task using the awsvpc network mode will fail to mount an EFS volume.
When the systemd DNS stub resolver is enabled, the resolver configuration file (/etc/resolv.conf) is a symlink to /run/systemd/resolve/stub-resolv.conf and specifies 127.0.0.53 as the name server.
Note: On Amazon Linux 2022, upgrading the systemd-resolved package will enable the DNS stub resolver even if it was previously disabled.
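To check whether a host is in this state, the name servers can be read from the resolver configuration. The following is a minimal sketch, not part of any ECS or EFS tooling; the function names `nameservers` and `uses_stub_resolver` are made up for illustration.

```python
STUB_ADDRESS = "127.0.0.53"  # address of the systemd stub listener described above


def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses from all `nameserver` lines, skipping comment lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # resolv.conf comments start with '#' or ';'
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_stub_resolver(resolv_conf_text: str) -> bool:
    """True if the stub listener address is among the configured name servers."""
    return STUB_ADDRESS in nameservers(resolv_conf_text)
```

On an affected host, passing the contents of /etc/resolv.conf to `uses_stub_resolver` should return `True`.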
ECS containers using the awsvpc network mode are isolated from the host by a network namespace and are therefore not able to use 127.0.0.53 as a name server. Docker detects this condition and configures the containers to use the VPC name server configured in /run/systemd/resolve/resolv.conf.
$ journalctl -u docker
Sep 29 06:54:47 ip-10-2-128-71.us-west-2.compute.internal dockerd[3402328]: time="2022-09-29T06:54:47.583869871Z" level=info msg="detected 127.0.0.53 nameserver, assuming systemd-resolved, so using resolv.conf: /run/systemd/resolve/resolv.conf"
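The decision behind that log line can be sketched roughly as follows. This is an illustration of the behaviour the message describes, not Docker's actual code. The function name `choose_resolv_conf` is hypothetical, and the exact trigger condition (127.0.0.53 being the only name server) is an assumption.

```python
DEFAULT_PATH = "/etc/resolv.conf"
ALTERNATE_PATH = "/run/systemd/resolve/resolv.conf"  # path named in the dockerd log line


def choose_resolv_conf(host_nameservers: list[str]) -> str:
    """Pick the resolv.conf whose name servers are handed to containers.

    Assumption: the fallback applies only when the stub address is the
    sole name server configured on the host.
    """
    if host_nameservers == ["127.0.0.53"]:
        return ALTERNATE_PATH
    return DEFAULT_PATH
```

The point of the sketch is that this substitution happens inside Docker, so only processes whose resolver configuration Docker sets up benefit from it.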
When the amazon-ecs-volume-plugin service, through the mount.efs script, attempts to mount an EFS volume for an ECS container using the awsvpc network mode, it uses nsenter to invoke the stunnel and mount.nfs4 commands in the same network namespace as the container.
$ cat /var/log/amazon/efs/mount.log
2022-09-29 07:00:54 UTC - INFO - version=1.33.2 options={'rw': None, 'tls': None, 'netns': '/proc/3404229/ns/net'}
2022-09-29 07:00:54 UTC - INFO - binding 20240
2022-09-29 07:00:54 UTC - WARNING - stunnel does not support "b'libwrap'"
2022-09-29 07:00:54 UTC - INFO - Starting TLS tunnel: "nsenter --net=/proc/3404229/ns/net /usr/bin/stunnel /var/run/efs/stunnel-config.fs-[efs_id].var.lib.ecs.volumes.ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601.20240"
2022-09-29 07:00:54 UTC - INFO - Started TLS tunnel, pid: 3404373
2022-09-29 07:00:54 UTC - INFO - Executing: "nsenter --net=/proc/3404229/ns/net /sbin/mount.nfs4 127.0.0.1:/ /var/lib/ecs/volumes/ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601 -o rw,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,port=20240" with 15 sec time limit.
2022-09-29 07:01:09 UTC - ERROR - Mounting fs-[efs_id].efs.us-west-2.amazonaws.com to /var/lib/ecs/volumes/ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601 failed due to timeout after 15 sec, mount attempt 1/3, wait 0 sec before next attempt.
However, because the amazon-ecs-volume-plugin service and its children run outside Docker, they are not subject to the same workaround and will attempt to use 127.0.0.53 as a name server. Ultimately, stunnel fails to resolve the EFS endpoint because it is trying to reach the DNS stub resolver from inside the container's network namespace, where 127.0.0.53 cannot be used as a name server.
$ journalctl -u amazon-ecs-volume-plugin
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: 2022/09/29 07:00:54 Entering go-plugins-helpers getPath
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: 2022/09/29 07:00:54 Entering go-plugins-helpers createPath
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: level=info time=2022-09-29T07:00:54Z msg="Creating new volume ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601"
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: level=info time=2022-09-29T07:00:54Z msg="Creating mount target for new volume ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601"
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: level=info time=2022-09-29T07:00:54Z msg="Validating create options for volume ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601"
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal amazon-ecs-volume-plugin[3393399]: level=info time=2022-09-29T07:00:54Z msg="Mounting volume ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601 of type efs at path /var/lib/ecs/volumes/ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601"
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: stunnel 5.58 on x86_64-koji-linux-gnu platform
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: Compiled with OpenSSL 3.0.0 7 sep 2021
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: Running with OpenSSL 3.0.3 3 May 2022
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,OCSP,PSK,SNI
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: Reading configuration from file /run/efs/stunnel-config.fs-[efs_id].var.lib.ecs.volumes.ecs-[service_name]-12-[volume_name]-e49ee4deabc3f5939601.20240
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: UTF-8 byte order mark not detected
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: FIPS mode disabled
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[ui]: Configuration successful
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[0]: Service [efs] accepted connection from 127.0.0.1:40200
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG3[0]: Error resolving "fs-[efs_id].efs.us-west-2.amazonaws.com": Neither nodename nor servname known (EAI_NONAME)
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG3[0]: No remote host resolved
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[0]: Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
[..]
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[166]: Service [efs] accepted connection from 127.0.0.1:41742
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG3[166]: Error resolving "fs-[efs_id].efs.us-west-2.amazonaws.com": Neither nodename nor servname known (EAI_NONAME)
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG3[166]: No remote host resolved
Sep 29 07:00:54 ip-10-2-128-71.us-west-2.compute.internal stunnel[3404373]: LOG5[166]: Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
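One way to reproduce the failed lookup by hand is to run a resolver query in the task's network namespace, wrapped in nsenter the same way the mount.log entries show for stunnel and mount.nfs4. The sketch below only builds and prints such a command; it does not run it. The helper name `in_netns` and the choice of `getent ahosts` as the lookup tool are assumptions for illustration, and the namespace path and endpoint placeholder are copied from the logs above.

```python
import shlex


def in_netns(netns_path: str, argv: list[str]) -> list[str]:
    """Prefix argv so that it runs inside the given network namespace via nsenter."""
    return ["nsenter", f"--net={netns_path}", *argv]


# Hypothetical diagnostic: resolve the EFS endpoint from inside the task's namespace.
# Only the network namespace is switched, so the lookup still reads the host's
# /etc/resolv.conf and therefore still targets 127.0.0.53.
cmd = in_netns(
    "/proc/3404229/ns/net",
    ["getent", "ahosts", "fs-[efs_id].efs.us-west-2.amazonaws.com"],
)
print(shlex.join(cmd))
```

If the analysis above is right, the printed command, run as root on an affected host with a real file system ID and a live task PID, would be expected to fail to resolve the name, while the same `getent` call without the nsenter prefix succeeds.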