docker pull issue no such host

[x] This is a bug report
[ ] This is a feature request
[x] I searched existing issues before opening this one

Expected behavior

5.0.25_61: Pulling from rrg
Digest: sha256:50bbce4af6749e9a976f0533c3b50a0badb54855b73d8a3743473f1487fd223e
Status: Downloaded newer image for XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/rrg:5.0.25_61

Actual behavior

docker-compose up -d rrg-node-1
Creating rrg-node-1

ERROR: for rrg-node-1 Cannot create container for service rrg-node-1: Error response from daemon: Get https://XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/v2/: dial tcp: lookup XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com on 10.5.0.2:53: no such host

Steps to reproduce the behavior

docker pull XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/rrg:5.0.25_61

Output of docker version:

(Docker version 18.03.1-ce, build 3dfb8343b139d6342acfd9975d7f1068b5b1c3d3)

Output of docker info:

([ec2-user@ip-10-5-3-45 ~]$ docker info
Containers: 37
 Running: 36
 Paused: 0
 Stopped: 1
Images: 60
Server Version: swarm/1.2.5
Role: replica
Primary: 10.5.4.172:3375
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 12

Plugins:
 Volume:
 Network:
 Log:
Swarm:
 NodeID:
 Is Manager: false
 Node Address:
Kernel Version: 4.14.51-60.38.amzn1.x86_64
Operating System: linux
Architecture: amd64
CPUs: 22
Total Memory: 80.85GiB
Name: mgr1
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
Experimental: false
Live Restore Enabled: false

WARNING: No kernel memory limit support)

Additional environment details (AWS, VirtualBox, physical, etc.)

Hi all,

Has anyone else faced this issue with docker pull? We recently upgraded Docker to 18.03.1-ce and have been seeing the issue since then. We are not sure whether it is related to Docker, so we are checking whether anyone else is seeing the same thing.

We did some troubleshooting with tcpdump. The DNS queries being made were under the EC2 limit of 1024 packets per second per network interface. We also tried working around the issue by raising the retry and timeout values in /etc/resolv.conf, but that did not help.
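For anyone comparing settings, here is a minimal sketch of how the relevant resolv.conf fields could be read back to confirm what the host is actually using. The helper name `parse_resolv_conf` and the sample file contents are hypothetical: the nameserver is the one from the error message above, the search domain is the suffix seen in our capture, and the `timeout`/`attempts` values are placeholders, not the values we actually set.

```python
def parse_resolv_conf(text):
    """Collect nameservers, search domains, and options from resolv.conf text."""
    conf = {"nameservers": [], "search": [], "options": {}}
    for raw in text.splitlines():
        # Drop comments (introduced by '#' or ';') and surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            conf["nameservers"].append(values[0])
        elif keyword in ("search", "domain"):
            conf["search"] = values  # the last search/domain line wins
        elif keyword == "options":
            for opt in values:
                name, _, value = opt.partition(":")
                # Options such as "rotate" carry no value; store them as True.
                conf["options"][name] = int(value) if value.isdigit() else True
    return conf


# Hypothetical contents; the real file on the affected host may differ.
SAMPLE = """\
search ec2.internal
nameserver 10.5.0.2
options timeout:2 attempts:5
"""

conf = parse_resolv_conf(SAMPLE)
print(conf)
```

On a real host the text would come from reading /etc/resolv.conf. Note that containers can get their own copy of this file, so the file inside the container running the daemon or Swarm manager may be the one that matters.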

We went through the packet capture line by line and found that some responses were negative. In Wireshark, the display filter 'udp.stream eq 12' shows one of these negative answers, in which the resolver replies "No such name". All the requests that get a negative response use the following name:

354XXXXX.dkr.ecr.us-east-1.amazonaws.com.ec2.internal

Does anyone know why ec2.internal is being appended to the name being resolved? Running dig against this name fails. So it appears that a wrong name is being sent to the DNS server, which returns a negative answer that surfaces as 'no such host'.
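One possible explanation, offered as an assumption rather than a diagnosis: stub resolvers that follow resolv.conf(5) append the `search` domains to a name as a fallback. The sketch below is a simplified model of that ordering. The function name `candidate_names` is hypothetical, and the model ignores details such as name-length limits and which error codes continue the search.

```python
def candidate_names(name, search, ndots=1):
    """Return query names in the order a resolv.conf-style stub resolver tries them."""
    if name.endswith("."):
        return [name]  # a trailing dot marks the name as fully qualified
    with_search = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then fall back to the search list.
        return [name] + with_search
    # Too few dots: try the search list first, then the name as-is.
    return with_search + [name]


# The search domain is taken from the suffix seen in the capture.
for candidate in candidate_names("XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com", ["ec2.internal"]):
    print(candidate)
```

If this model applies here, the ECR name has more dots than the default `ndots` of 1, so the `.ec2.internal` query would only be sent after the query for the plain name had already failed. The "No such name" answers would then be a symptom of an earlier failed or timed-out query, not the root cause.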

We are seeing this issue intermittently. Please help.
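Because the failure is intermittent, a small probe that resolves the registry name repeatedly may help others check whether they hit the same thing. This is a rough sketch: `probe` and its parameters are hypothetical names, and it goes through the host's system resolver via `socket.getaddrinfo`, which may not behave the same way as the resolver inside the Docker daemon.

```python
import socket
import time


def probe(name, attempts=20, delay=0.5, resolve=socket.getaddrinfo):
    """Resolve `name` repeatedly and return (attempt index, error text) for each failure."""
    failures = []
    for i in range(attempts):
        try:
            resolve(name, 443)  # port 443 because the registry is reached over HTTPS
        except socket.gaierror as exc:
            failures.append((i, str(exc)))
        time.sleep(delay)
    return failures
```

Calling `probe("XXXXXXXX.dkr.ecr.us-east-1.amazonaws.com")` with the real registry host name returns an empty list when every lookup succeeds. The `resolve` parameter exists only so a different lookup function can be swapped in.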
