moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

docker containers experience packet loss or slow network (EAI_AGAIN in application) #12603

Open mlehner616 opened 9 years ago

mlehner616 commented 9 years ago

I've had an issue similar to #12364 and/or #11407, where my production Node.js apps inside containers simply stopped responding. The circumstances, however, seemed different enough to warrant a separate issue.

Once a container stops responding, it will not respond again until the Docker daemon is restarted. Not all containers stop responding at the same time. I'm led to conclude this is a Docker issue rather than a Node one, because several different services running on the same server all experience apparent heavy packet loss for hours before the error below occurs, which eventually seems to crash Node in turn. Restarting the Docker daemon cleared up both the error and the packet loss. The exception caught in the app is interesting: it is a seldom-seen EAI_AGAIN error (a supposedly temporary DNS failure, though not temporary in this case), which led me to believe it could be related to #12364 and/or #11407.

Errors I am seeing in the Node app:

2015-04-21T08:26:21.415Z - info: --> method:[GET] url:[/status] status:[200] time:[1ms]
events.js:85
      throw er; // Unhandled 'error' event
            ^
Error: getaddrinfo EAI_AGAIN
    at Object.exports._errnoException (util.js:742:11)
    at errnoException (dns.js:46:15)
    at Object.onlookup [as oncomplete] (dns.js:91:26)

Details

[root@server1 root]# docker --version
Docker version 1.5.0, build a8a31ef/1.5.0

[root@server1 root]# uname -a
Linux server1 3.14.20-20.44.amzn1.x86_64 #1 SMP Mon Oct 6 22:52:46 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Thank you

mavenugo commented 9 years ago

@mlehner616 Just commented: https://github.com/docker/docker/issues/11407#issuecomment-95044564. Can you please check whether it applies to your case?

mlehner616 commented 9 years ago

@mavenugo Thanks for the suggestion. I checked and it looks like the Amazon Linux AMI has LRO disabled by default so it doesn't look like this would apply in this case.

[root@server1 root]# ethtool -k eth0
Features for eth0:
...
large-receive-offload: off [fixed]
...
SaschaMoellering commented 9 years ago

It seems we ran into a similar issue. We have a high-traffic, event-driven application (millions of requests/day) using Vert.x, and the application suddenly stops working. LRO looks fine for us as well:

ethtool -k eth0 | grep large-receive-offload:
large-receive-offload: off

docker --version
Docker version 1.5.0, build a8a31ef

uname -a
Linux 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt7-1~bpo70+1 (2015-04-07) x86_64 GNU/Linux
SaschaMoellering commented 9 years ago

Any updates on this issue?

brudrafon commented 9 years ago

Looks related to this comment https://github.com/docker/docker/issues/11407#issuecomment-97746220

mlehner616 commented 9 years ago

@brudrafon Thanks for the info.

The symptoms here are subtly different from the comment you linked. That comment describes the host losing connectivity. In this case, it is the Docker containers that lose connectivity, not the host.

For example, I have a single host running 5 containers: container A on port 3000, container B on port 3001, and so on.

The symptom is that a random container suddenly becomes unable to make outgoing calls in a timely fashion. The container receives requests but cannot respond because it is unable to reach dependent services (DNS, DB, etc.), DNS being the most obvious example. Any number of containers, from none to all, may fail at the same time. While this is happening, connectivity to and from the host itself remains unaffected. Restarting the Docker daemon works around the issue temporarily but only delays the inevitable.

I will give the information from that comment a try, but even if it helps, it is still just a workaround. Docker really needs to either fix this in the project or champion the upstream fix in the bridge driver.

mlehner616 commented 9 years ago

More info: docker 1.6.0

It's not just DNS, as some other issues suggested: the problem is also apparent when curling an IP address directly. Info below.

This is successful on the host which has one container:

[root@ip-1-2-3-4 ~]# curl 54.243.77.158
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.4.6 (Ubuntu)</center>
</body>
</html>

This is run inside the container running on the same host:

[ root@16f7c3e2daf8:/data ]$ curl 54.243.77.158
curl: (7) Failed to connect to 54.243.77.158 port 80: Connection timed out
mlehner616 commented 9 years ago

If I restart the docker daemon it will work, for a while.

[root@ip-1-2-3-4 ~]# service docker restart

cpuguy83 commented 9 years ago

@mlehner616 Is this still an issue on 1.7.1 (or master)?

cpuguy83 commented 9 years ago

Also ping @mavenugo @mrjana

dverbeek84 commented 9 years ago

I have the same problem, both on CentOS 6.6 and 7 with Docker versions 1.5 and 1.6.2. These servers are running on VMware.

nielskrijger commented 9 years ago

Experiencing the same problem using boot2docker on Mac OS. I'm hammering a small batch script in Node.js 12.7 that connects to a MongoDB 3 server; it runs fine for a couple of minutes until the inevitable:

Unhandled rejection Error: getaddrinfo EAI_AGAIN
    at Object.exports._errnoException (util.js:746:11)
    at errnoException (dns.js:49:15)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:94:26)

When this happens no connections come through anymore and I have to reboot boot2docker to get it working again.

@cpuguy83 this is using docker 1.7.1

OS: Alpine Linux 3.2.2

dverbeek84 commented 9 years ago

I have followed #11407 but it didn't help.

thaJeztah commented 8 years ago

@mlehner616 are you still having this issue on the current release (1.9.1 / 1.10.0)?

mlehner616 commented 8 years ago

@thaJeztah I am still experiencing this issue with 1.9.1. Don't have my hands on 1.10.0 yet.

The issue seems to surface in the iptables-based bridging. Something happens where some containers simply stop sending valid outgoing traffic; tcpdump didn't offer conclusive diagnostics. One workaround I found was to run iptables-save, restart iptables (flush), then iptables-restore. Networking to the container resumes without restarting the container instance.
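For anyone wanting to try the same thing, here is a rough sketch of that sequence (exact commands vary by distro, and flushing live firewall rules carries obvious risk):

iptables-save > /tmp/iptables.backup       # snapshot the current rules, including Docker's
service iptables restart                   # or: iptables -F   (flush the rules)
iptables-restore < /tmp/iptables.backup    # restore the snapshot; container networking resumes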

I have not tried the dummy MAC address workaround, so I can't confirm whether it works. This is difficult to troubleshoot because as soon as we poke and prod, the services revive. So far, restarting the Docker daemon and the iptables flush/restore above are what have brought it back to life.

I think the common thread is that Docker reconfigures iptables in each case. If it's not a Docker issue directly (iptables), I believe this is at least a Docker networking design issue and should be looked into.

thaJeztah commented 8 years ago

@mlehner616 thanks for the info; wondering if it's a docker issue then?

kshcherban commented 8 years ago

I have faced a similar issue with apt-get's "Temporary failure resolving" error, which is caused by an EAI_AGAIN getaddrinfo() error. I have a build system (OBS) that spawns LXC containers and installs packages inside them using apt-get (I know that's a bad idea and the builds are not reproducible). From time to time apt-get exits with this error and the package build fails. It seems this name resolution issue is caused by some kernel heisenbug.

nunorafaelrocha commented 8 years ago

I'm having an issue on Docker for Mac version 1.11.0-beta6, and I don't know whether it is related to this one.

ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
thaJeztah commented 8 years ago

@nunorafaelrocha no, that's probably not related, that's a docker-compose error, but if you keep seeing that, possibly report it on the Docker for Mac beta forum

nuzz commented 8 years ago

@nunorafaelrocha I'm seeing this as well, very similar setup (mac beta 1.11.0-beta7) mongo:3.0.7

thaJeztah commented 8 years ago

@nuzz can you report it on the Docker Mac "beta" forum as well? (Or see if there's a similar topic already)

angeloluciani commented 8 years ago

Thank you guys for sharing. I had the same problem, and it was solved using service docker restart.

karelstriegel commented 8 years ago

@angeloluciani that is more a workaround than a solution...

randunel commented 8 years ago

I created a test application which resolves asdqq.100.resolve.me down to asdqq.1.resolve.me. I will restrict my logging to the beginning of the output (the first 10 resolves); there is no point in showing all the errors.

During the tests, all the other containers on the same docker host were behaving normally. Only a limited number of containers are affected at one time.
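For context, a minimal sketch of that kind of test application (hypothetical code, not the exact app used here) could look like this:

// resolve-test.js -- resolve asdqq.100.resolve.me down to asdqq.1.resolve.me and log failures
var dns = require('dns');

function resolveNext(i) {
  if (i < 1) return;
  var hostname = 'asdqq.' + i + '.resolve.me';
  dns.lookup(hostname, function (err, address) {   // dns.lookup goes through getaddrinfo
    if (err) console.error(hostname, err);         // EAI_AGAIN surfaces here
    else console.log(hostname, address);
    resolveNext(i - 1);
  });
}

resolveNext(100);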

Here is a tcpdump output from inside the container when EAI_AGAIN happens:

 $ tcpdump -i eth0 -n -l -vvvvv | grep asdqq

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0xce36!] 26742+ A? asdqq.99.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 26742 q: A? asdqq.99.resolve.me. 1/0/0 asdqq.99.resolve.me. [9m59s] A 72.52.4.119 (53)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0xffff!] 7085+ AAAA? asdqq.99.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 7085 q: AAAA? asdqq.99.resolve.me. 0/1/0 ns: asdqq.99.resolve.me. [29m59s] SOA ns1.sedoparking.com. hostmaster.sedo.de. 2015071001 86400 10800 604800 86400 (110)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0x3ff6!] 63926+ A? asdqq.96.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 63926 q: A? asdqq.96.resolve.me. 1/0/0 asdqq.96.resolve.me. [9m59s] A 72.52.4.119 (53)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0xdbd4!] 24280+ A? asdqq.95.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 24280 q: A? asdqq.95.resolve.me. 1/0/0 asdqq.95.resolve.me. [9m59s] A 72.52.4.119 (53)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0x0dd4!] 4313+ AAAA? asdqq.96.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 4313 q: AAAA? asdqq.96.resolve.me. 0/1/0 ns: asdqq.96.resolve.me. [29m59s] SOA ns1.sedoparking.com. hostmaster.sedo.de. 2015071001 86400 10800 604800 86400 (110)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0xe7d8!] 21972+ A? asdqq.92.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 21972 q: A? asdqq.92.resolve.me. 1/0/0 asdqq.92.resolve.me. [9m59s] A 72.52.4.119 (53)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0x8510!] 39580+ AAAA? asdqq.95.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 39580 q: AAAA? asdqq.95.resolve.me. 0/1/0 ns: asdqq.95.resolve.me. [29m59s] SOA ns1.sedoparking.com. hostmaster.sedo.de. 2015071001 86400 10800 604800 86400 (110)
    172.18.0.4.48992 > 8.8.8.8.53: [bad udp cksum 0xbc64 -> 0xc4f6!] 31158+ A? asdqq.91.resolve.me. (37)
    8.8.8.8.53 > 172.18.0.4.48992: [udp sum ok] 31158 q: A? asdqq.91.resolve.me. 1/0/0 asdqq.91.resolve.me. [9m59s] A 72.52.4.119 (53)

The following errors were caught inside the container during testing:

asdqq.98.resolve.me { Error: getaddrinfo EAI_AGAIN
    at Object.exports._errnoException (util.js:1007:11)
    at errnoException (dns.js:33:15)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:79:26)
  code: 'EAI_AGAIN',
  errno: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'asdqq.98.resolve.me' }
asdqq.97.resolve.me { Error: getaddrinfo EAI_AGAIN
    at Object.exports._errnoException (util.js:1007:11)
    at errnoException (dns.js:33:15)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:79:26)
  code: 'EAI_AGAIN',
  errno: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'asdqq.97.resolve.me' }
asdqq.100.resolve.me { Error: getaddrinfo EAI_AGAIN
    at Object.exports._errnoException (util.js:1007:11)
    at errnoException (dns.js:33:15)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:79:26)
  code: 'EAI_AGAIN',
  errno: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'asdqq.100.resolve.me' }

100, 97 and 98 threw EAI_AGAIN, but they weren't in tcpdump's output. I waited 10 seconds after my test ran; they never showed up.

It would appear that the failed DNS requests never left the container, or left it in a malformed manner (not caught by grep asdqq).

iptables-save and iptables-restore do not fix the issue. A Docker restart does.

Here is some relevant Docker host info (note that the bridge slave MAC address is identical to the container link MAC address):

root@docker:~# docker --version
Docker version 1.11.2, build b9f10c9
root@docker:~# docker info
Containers: 7
 Running: 7
 Paused: 0
 Stopped: 0
Images: 45
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 51
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-85-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 490 MiB
Name: REDACTED
ID: SEJG:AFW6:OHR3:P6WS:AWTA:PUQX:2TH7:YM7L:3RHJ:RXT5:AL4D:TSES
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=digitalocean
root@docker:~# brctl show br-680021bded96
bridge name bridge id       STP enabled interfaces
br-680021bded96     8000.02427ba31e6a   no      veth0d3850d
                            veth2da3b73
                            veth86d2db4
                            vetha55f0cb
                            vethc5716ba
                            vethd17f0c1
                            vethefe566e
root@docker:~# brctl showstp br-680021bded96
br-680021bded96
 bridge id      8000.02427ba31e6a
 designated root    8000.02427ba31e6a
 root port         0            path cost          0
 max age          20.00         bridge max age        20.00
 hello time        2.00         bridge hello time      2.00
 forward delay        15.00         bridge forward delay      15.00
 ageing time         300.00
 hello timer           1.82         tcn timer          0.00
 topology change timer     0.00         gc timer          94.56
 flags          

veth0d3850d (1)
 port id        8001            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8001            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

veth2da3b73 (4)
 port id        8004            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8004            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

veth86d2db4 (6)
 port id        8006            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8006            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

vetha55f0cb (5)
 port id        8005            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8005            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

vethc5716ba (3)
 port id        8003            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8003            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

vethd17f0c1 (2)
 port id        8002            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8002            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

vethefe566e (7)
 port id        8007            state            forwarding
 designated root    8000.02427ba31e6a   path cost          2
 designated bridge  8000.02427ba31e6a   message age timer      0.00
 designated port    8007            forward delay timer    0.00
 designated cost       0            hold timer         0.82
 flags          

root@docker:~# brctl showmacs br-680021bded96
port no mac addr        is local?   ageing timer
  1 02:42:ac:12:00:02   no         0.01
  3 02:42:ac:12:00:04   no         0.00
  7 02:42:ac:12:00:08   no         0.00
  7 1a:d8:ad:bd:bc:d4   yes        0.00
  3 1e:6c:d2:44:7e:99   yes        0.00
  2 42:dc:88:e5:fe:f4   yes        0.00
  1 7a:18:fa:69:55:4f   yes        0.00
  5 86:ed:12:0a:e8:fd   yes        0.00
  4 c2:33:ca:e3:00:d4   yes        0.00
  6 d6:4e:91:d7:e8:a1   yes        0.00

And mac address of container's eth0:

/src # ip link ls
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
306: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:ac:12:00:04 brd ff:ff:ff:ff:ff:ff

Since this issue is tagged status/more-info-needed, what extra info do you need? I have a relatively clean sandbox playground which reproduces this, but all of our docker hosts are affected.

justincormack commented 8 years ago

@randunel thanks, I will remove that tag, your data seems helpful.

randunel commented 8 years ago

I have something else to add, as a separate message, because I think it's pretty relevant:

During EAI_AGAIN, all dig and drill requests work fine, instantly. The problem appears to be exclusive to getaddrinfo: no packets are sent when it is called.

If someone else wants to confirm, here's a simple way to test in bash:

for letter in a b c d e f g h i j k l m n o p q r s t u v w x y z; do drill asdqq.7$letter.resolve.me; done;

Run these in parallel to getaddrinfo, and you will see that most getaddrinfo calls fail, while none of the drill requests fail.
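For anyone who wants to reproduce the comparison, one way to run the two paths side by side in bash (using getent as a stand-in for the application, since it also goes through getaddrinfo) is:

for i in $(seq 1 100); do getent hosts asdqq.$i.resolve.me; done &    # getaddrinfo path, in the background
for letter in a b c d e f g h i j; do drill asdqq.7$letter.resolve.me; done
wait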


Later addition (edit):

I've eliminated Node.js; this issue is reproducible with getent hosts <address>.

During node.js's EAI_AGAIN period, run getent in bash and it just hangs. You may test it yourselves:

for i in $(seq 1 100); do getent hosts asdqq.$i.resolve.me; done


drill commands work when specifying @8.8.8.8 as server, but they fail with the default nameserver 127.0.0.11 and nothing is caught by tcpdump.

/src # drill asdqq.google.co.uk @127.0.0.11
Error: error sending query: Could not send or receive, because of network error
/src # drill asdqq.google.co.uk @8.8.8.8
;; ->>HEADER<<- opcode: QUERY, rcode: NXDOMAIN, id: 32851
[...]

strace when failing (compare with successful)

/src # strace getent hosts asdqq.100.resolve.me
execve("/usr/bin/getent", ["getent", "hosts", "asdqq.100.resolve.me"], [/* 9 vars */]) = 0
arch_prctl(ARCH_SET_FS, 0x7fa57a7dfc28) = 0
set_tid_address(0x7fa57a7dfc60)         = 1451
mprotect(0x7fa57a7dc000, 4096, PROT_READ) = 0
mprotect(0x7fa57a9e3000, 4096, PROT_READ) = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
readv(3, [{"", 0}, {"127.0.0.1\tlocalhost\n::1\tlocalhos"..., 1024}], 2) = 174
readv(3, [{"", 0}, {"127.0.0.1\tlocalhost\n::1\tlocalhos"..., 1024}], 2) = 0
close(3)                                = 0
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
readv(3, [{"", 0}, {"nameserver 127.0.0.11\noptions nd"..., 248}], 2) = 38
readv(3, [{"", 0}, {"nameserver 127.0.0.11\noptions nd"..., 248}], 2) = 0
close(3)                                = 0
clock_gettime(CLOCK_REALTIME, {1466082393, 277229293}) = 0
socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
clock_gettime(CLOCK_REALTIME, {1466082393, 278545491}) = 0
sendto(3, "?s\1\0\0\1\0\0\0\0\0\0\5asdqq\003100\7resolve\2m"..., 38, MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.11")}, 16) = 38
poll([{fd=3, events=POLLIN}], 1, 2500)  = 0 (Timeout)
clock_gettime(CLOCK_REALTIME, {1466082395, 783577998}) = 0
sendto(3, "?s\1\0\0\1\0\0\0\0\0\0\5asdqq\003100\7resolve\2m"..., 38, MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.11")}, 16) = 38
poll([{fd=3, events=POLLIN}], 1, 2500)  = 0 (Timeout)
clock_gettime(CLOCK_REALTIME, {1466082398, 288665114}) = 0
close(3)                                = 0
exit_group(2)                           = ?
+++ exited with 2 +++

You can see sendto with the appropriate fd and address, but nothing showed up in tcpdump with | grep asdqq.


I did a kernel trace (ftrace) of getent. After every getent retry of sys_enter_sendto, the kernel wakes up docker, which starts with sys_enter_recvmsg. Ultimately, docker does sys_enter_epoll_wait, yields to the scheduler and init does some journaling.

The ftrace shows that the file descriptor that getent sends data to is open in docker, and docker receives some buffer every time getent retries. The ftrace also shows that docker never sends any data back.

I have 2 separate traces showing what I've described above, each almost 100 MB in size. Since the ftrace contains all processes and their actions, I won't be sharing the trace data publicly, but I can make it available on request.


strace of docker when running getent inside its container:

root@docker:~# strace -p 4911 -f 2>&1 | grep asdqq -B 10 -A 10
[pid  4916] select(0, NULL, NULL, NULL, {0, 20}) = 0 (Timeout)
[pid  4916] clock_gettime(CLOCK_MONOTONIC, {116128, 619709506}) = 0
[pid  4916] futex(0x21e9760, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
[pid  5933] <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT, {u32=142472192, u64=140209349850112}}}, 128, -1) = 1
[pid  5933] clock_gettime(CLOCK_MONOTONIC, {116128, 897768863}) = 0
[pid  5933] futex(0x21e9760, FUTEX_WAKE, 1) = 1
[pid  5933] read(14, "\27\3\3\0\31\0\0\0\0\0\0\2[lT=\201:\3750\321\5>\232&&6\360\206\272", 1024) = 30
[pid  5933] write(15, "\r", 1)          = 1
[pid  5933] read(14, 0xc8208a9800, 1024) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5933] epoll_wait(5, {{EPOLLIN|EPOLLOUT, {u32=403556904, u64=140209610934824}}}, 128, 0) = 1
[pid  5933] recvmsg(44, {msg_name(16)={sa_family=AF_INET, sin_port=htons(50458), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)=[{"v\27\1\0\0\1\0\0\0\0\0\0\5asdqq\003100\7resolve\2m"..., 512}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, 0) = 38
[pid  5933] clock_gettime(CLOCK_REALTIME, {1466098812, 907737107}) = 0
[pid  5933] clock_gettime(CLOCK_MONOTONIC, {116128, 900787608}) = 0
[pid  5933] clock_gettime(CLOCK_REALTIME, {1466098812, 907880168}) = 0
[pid  5933] clock_gettime(CLOCK_MONOTONIC, {116128, 900926333}) = 0
[pid  5933] recvmsg(44, 0xc8214a5a38, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5933] clock_gettime(CLOCK_REALTIME, {1466098812, 908195315}) = 0
[pid  5933] clock_gettime(CLOCK_MONOTONIC, {116128, 901245483}) = 0
[pid  5933] clock_gettime(CLOCK_REALTIME, {1466098812, 908337251}) = 0
[pid  5933] clock_gettime(CLOCK_MONOTONIC, {116128, 901407503}) = 0
[pid  5933] clock_gettime(CLOCK_REALTIME, {1466098812, 908511192}) = 0

and another part of the same strace, later on

[pid  4916] select(0, NULL, NULL, NULL, {0, 20}) = 0 (Timeout)
[pid  4916] clock_gettime(CLOCK_MONOTONIC, {116252, 649657782}) = 0
[pid  4916] futex(0x21e9760, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
[pid  5756] <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT, {u32=142472192, u64=140209349850112}}}, 128, -1) = 1
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 834795131}) = 0
[pid  5756] futex(0x21e9760, FUTEX_WAKE, 1) = 1
[pid  5756] read(14, "\27\3\3\0\31\0\0\0\0\0\0\2n@\306\321\304\246R\321\f%R\214<\332w\272\365B", 1024) = 30
[pid  5756] write(15, "\r", 1)          = 1
[pid  5756] read(14, 0xc8208a9800, 1024) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5756] epoll_wait(5, {{EPOLLIN|EPOLLOUT, {u32=403556904, u64=140209610934824}}}, 128, 0) = 1
[pid  5756] recvmsg(44, {msg_name(16)={sa_family=AF_INET, sin_port=htons(43159), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)=[{"y\232\1\0\0\1\0\0\0\0\0\0\5asdqq\003100\7resolve\2m"..., 512}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, 0) = 38
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 844718936}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 837767716}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 844859224}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 837905091}) = 0
[pid  5756] recvmsg(44, 0xc8214a5a38, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845188131}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 838260552}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845355591}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 838402984}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845504534}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845601952}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 838657414}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845749048}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 838791434}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 845887711}) = 0
[pid  5756] epoll_wait(5, {}, 128, 0)   = 0
[pid  5756] epoll_wait(5,  <unfinished ...>
[pid  5729] <... read resumed> "\r\n", 32768) = 2
[pid  5729] write(14, "\27\3\3\0\32\0\0\0\0\0\0\10\204\241\330\264_\254\357\0*z\325\376\312/\303Q\330\nr", 31) = 31
[pid  5729] read(17,  <unfinished ...>
[pid  4916] <... futex resumed> )       = 0
[pid  4916] clock_gettime(CLOCK_MONOTONIC, {116252, 839731980}) = 0
[pid  4916] clock_gettime(CLOCK_REALTIME, {1466098936, 846850404}) = 0
[pid  4916] select(0, NULL, NULL, NULL, {0, 20}) = 0 (Timeout)
[pid  4916] clock_gettime(CLOCK_MONOTONIC, {116252, 840098375}) = 0
[pid  4916] clock_gettime(CLOCK_REALTIME, {1466098936, 847191159}) = 0
[pid  4916] futex(0xc8213cb310, FUTEX_WAKE, 1 <unfinished ...>
[pid 13039] <... futex resumed> )       = 0
[pid 13039] futex(0xc8213cb310, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid  4916] <... futex resumed> )       = 1
[pid  4916] select(0, NULL, NULL, NULL, {0, 20}) = 0 (Timeout)
[pid  4916] clock_gettime(CLOCK_MONOTONIC, {116252, 840628206}) = 0
[pid  4916] futex(0x21e9760, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
[pid  5756] <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT, {u32=403556712, u64=140209610934632}}}, 128, -1) = 1
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 960717109}) = 0
[pid  5756] futex(0x21e9760, FUTEX_WAKE, 1) = 1
[pid  5756] recvmsg(40, {msg_name(16)={sa_family=AF_INET, sin_port=htons(57909), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)=[{"-#\1\0\0\1\0\0\0\0\0\0\rproxy_default\0\0\1\0\1", 512}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, 0) = 31
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 968223500}) = 0
[pid  5756] clock_gettime(CLOCK_MONOTONIC, {116252, 961348046}) = 0
[pid  5756] clock_gettime(CLOCK_REALTIME, {1466098936, 968515093}) = 0
randunel commented 8 years ago

As you can see (in my edited messages above), I can do both user-space and kernel-space traces in a relatively controlled environment. I've done traces on both Alpine Linux 3.4 and Ubuntu 16.04; the behaviour is the same.

I can do more tracing and debugging if you need me to. Right now I think I've reached the limit of my knowledge, so I'll need instructions or assistance for anything more complicated than ftrace/strace/tcpdump.

randunel commented 8 years ago

Docker's libnetwork has a hard limit of 100 concurrent DNS requests.

It would appear that this is the cause of EAI_AGAIN; the Docker logs even show More than 100 concurrent queries from [...].

While this limit is extremely low and non-configurable, network congestion also affects how quickly that limit is reached.

It looks like we're hitting the 100-concurrent-DNS-requests limit as a consequence of a slow network (100+ DNS requests taking more than 1 second to complete), and that is the cause of the errors we're seeing. I don't yet know the cause of the network slowness; probably just above-average usage and #7857.

Related issue: #22185
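If you want to check whether a host is hitting this limit, the daemon logs the message quoted above; on a systemd host something like the following should surface it (the unit name may differ per distro):

journalctl -u docker.service | grep "More than 100 concurrent queries"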

sanimej commented 8 years ago

@randunel The limit of 100 concurrent queries was added in the 1.11 release. In your test it's possible you are hitting it when DNS resolutions are slow. But the first reporters of this EAI_AGAIN issue were using pre-1.11 releases, so I am not sure the rate limiting of concurrent queries is the main root cause here.

blacktop commented 8 years ago

I have a networking noob question. Can I capture traffic inside docker as fast as outside of docker?

docker run --net=host tcpdump

Should I expect dropped packets, and if so, is there anything I can do to avoid them? I have an application that needs to capture as fast as possible.

Thank you.

thaJeztah commented 8 years ago

@blacktop please do not ask unrelated questions on the issue tracker; for questions/discussions about using docker, try forums.docker.com, or the #docker IRC channel. The GitHub issue tracker is for tracking bugs.

blacktop commented 8 years ago

@thaJeztah I apologize. 😞

Serfenia commented 8 years ago

What's the current status of this issue, and are there any prospects of fixing it?

From my own experience, I started seeing these errors once I had more than 10 Docker containers up and running for a while. At the moment I am seeing these messages daily, and the bigger my microservice network grows, the more errors of this kind I see. (Docker 1.10.3, build 20f81dd)

randunel commented 8 years ago

We have upgraded to 1.12.0 and are still experiencing these issues.

As a workaround, we have routed all of our external connections through proxies outside the swarm, connecting by IP address.

We haven't been able to use anything that requires DNS resolution since we switched our infrastructure to Docker.

ozlatkov commented 7 years ago

Great troubleshooting, @randunel. Thanks a lot; it helped me figure out our situation, as we're experiencing exactly the same thing, with the very same strace/tcpdump captures shown above.

Has anyone heard of plans to address this? The strange thing is that there have been no updates since August 2016, considering that DNS resolution is so unreliable it cannot be trusted at all.

sanimej commented 7 years ago

@randunel On the latest Docker master I am not able to recreate this issue. From an Ubuntu container on a bridge network, for i in $(seq 1 100); do getent hosts asdqq.$i.resolve.me; done works just fine. I tried a few different public DNS servers.

Are you seeing the problem only for these specific names, or does any external query return the same error?

pleerock commented 7 years ago

Suddenly I had the same problem with Node and Redis; restarting Docker itself helped me (OS: CentOS).

ParimiDev commented 7 years ago

I had the same problem with Docker 1.11.2, and the only workaround was to reboot the node; restarting the Docker service did not make any difference.

The upgrade to 1.12.4 reduced the frequency, but it still happens often enough that we cannot use it in real production. With the upgrade the workaround is also inconsistent: sometimes a Docker restart helps, sometimes it doesn't.

This issue is not seen in net=host mode, though.
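For anyone trying to narrow this down, a rough side-by-side check of the two modes (assuming an Ubuntu image, where getent is available, as used elsewhere in this thread) looks like this:

docker run --rm --net=host ubuntu getent hosts example.com   # host networking: no bridge, no NAT
docker run --rm ubuntu getent hosts example.com              # default bridge networking

If only the second command hangs or fails, the problem sits in the bridge/NAT path rather than in the host's own resolver.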

HudsonAkridge commented 7 years ago

We're also experiencing a very similar issue on the bridge network. Several large requests go through quickly, then the next one or two hang for a long while trying to establish a connection to the other Docker container; my suspicion is that something in Docker is then restarted or refreshed, because 5 or 6 more large requests get through quickly before another hang, until the same restart point is reached again.

Very cumbersome to deal with, and restarting the Docker daemon is not an acceptable solution in our situation.

westsouthnight commented 7 years ago

+1 @HudsonAkridge, we have the same network issues when sending/receiving a lot of packets. We always hit it in unexpected places when running the tests.

Dear @aboch and @thaJeztah, please join the conversation if you can. It is a very bad bug, and we feel it every day.

All hosts have the latest stable Docker version, and all have the same issues.

These hosts are standalone sandbox servers used for testing, without swarm or other connections.

We run the tests from a Windows host, where Docker is installed, against the application inside Docker.

docker info

Containers: 12
 Running: 0
 Paused: 0
 Stopped: 12
Images: 1803
Server Version: 17.03.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 1408
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.12-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 11.2 GiB
Name: moby
ID: 7RMP:T34G:KAEO:WHV5:3XDD:352G:63VB:S3OS:RZUU:U7US:ZI7S:ZJUS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

docker version


Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 10:40:59 2017
 OS/Arch:      windows/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64
 Experimental: false
rbranson commented 7 years ago

FWIW, we were able to eliminate the DNS issues we had on larger machines that performed a high rate of DNS resolution by running a local cache (dnsmasq). It's been a few months, but I believe this was related to some kind of port exhaustion: the lookups would stack up in the conntrack table, and once it hit roughly 65K entries, things would go south.
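For anyone wanting to try the same mitigation, a rough sketch (the bridge address 172.17.0.1 and the file paths are assumptions that vary per setup) is to run dnsmasq on the host, listening on the docker0 address, and hand it to containers via the daemon's dns setting:

# /etc/dnsmasq.conf (sketch): listen on the docker0 address and cache lookups
listen-address=172.17.0.1
bind-interfaces
cache-size=10000

# /etc/docker/daemon.json (sketch): make containers resolve via the host-side cache
{ "dns": ["172.17.0.1"] }

# restart dnsmasq and then the docker daemon so both pick up the change

Cache hits never leave the host, which should also mean far fewer conntrack entries for outbound UDP/53 traffic.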

ut0mt8 commented 7 years ago

Wow, good to see this thread. I have a Kubernetes cluster which has the same problem from time to time...

ghost commented 7 years ago

I have the same problem on Docker version 17.06.0-ce, build 02c1d87.

jcollum-cambia commented 7 years ago

A coworker is seeing this problem, same version

$ docker --version
Docker version 17.06.0-ce, build 02c1d87

However I can build the image fine on my machine. We are both on Win 10, running Docker for Windows, same version.

cristhianbicca commented 7 years ago

I also have the same problem on my servers.

Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:23:31 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:19:04 2017
 OS/Arch:      linux/amd64
 Experimental: false

vania-pooh commented 7 years ago

Not sure if this is a good idea, guys, but I just noticed that setting the docker0 MAC address to the eth0 value, as described here, dramatically improves networking. E.g. with eth0 Link encap:Ethernet HWaddr 00:15:b2:a9:8f:6e I did:

# ip link set docker0 address 00:15:b2:a9:8f:6e

My environment:

root@selenium-cloud-m-6:~# docker version
Client:
 Version:      17.05.0-ce
 API version:  1.29
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:06:06 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.05.0-ce
 API version:  1.29 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   89658be
 Built:        Thu May  4 22:06:06 2017
 OS/Arch:      linux/amd64
 Experimental: false

In my case LRO was turned off. Hope it helps someone.

eduardobcastro commented 4 years ago

I'm experiencing about 30% retransmission between containers (I'm running 20 containers on the same network). Docker version 19.03.13, build 4484c46d9d.
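One way to quantify loss like this between two containers is an iperf3 run on a user-defined network (a sketch; networkstatic/iperf3 is just an example image, any image with iperf3 installed will do):

docker network create perf-test
docker run -d --name iperf-server --network perf-test networkstatic/iperf3 -s
docker run --rm --network perf-test networkstatic/iperf3 -c iperf-server
# the client summary includes a "Retr" column with the TCP retransmission count
docker rm -f iperf-server && docker network rm perf-test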

abiralneupane2 commented 1 year ago

Docker version 20.10.22 (Windows). I am trying to use Docker as an alternative to VMware, and I am experiencing a similar issue:

 $ docker run -i -t -p 8082:80 ubuntu /bin/bash
 $ apt update

apt update takes too long, sitting at [Waiting for headers]. Also, when I try to ping archive.ubuntu.com or any other domain, I get either 100% packet loss or 0%. I tried two fixes with no improvement:

  1. Set the MTU to 1400 for the container's network (a sketch of how to do this follows below).
  2. Disable the ICS tool in Windows services.
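For reference, a rough sketch of the first fix (the 1400 value mirrors the one above; the per-network option is the standard bridge-driver MTU setting, and the daemon-wide key applies to the default bridge):

# per-network: create a bridge network with a lower MTU and run the container on it
docker network create --opt com.docker.network.driver.mtu=1400 lowmtu
docker run -i -t --network lowmtu -p 8082:80 ubuntu /bin/bash

# or daemon-wide for the default bridge, via daemon.json:
{ "mtu": 1400 }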