amyangfei / dynamicmq-go

dynamic attributed-based publish/subscribe messaging system
3 stars 1 forks source link

read latency when network congestion #7

Open amyangfei opened 9 years ago

amyangfei commented 9 years ago

Datanode Send-Q has accumulated a lot of data but dispatcher is receiving data slowly.

dispatcher log:

2015-05-20 23:40:19.629 [INFO] 54f99 [processPushMsg] send 81 direct msg to connector conn0101 latency: 697548.524 
2015-05-20 23:40:19.629 [DEBU] 54f9a [handleTCPConn] before get new bufio reader 
2015-05-20 23:40:19.630 [DEBU] 54f9b [handleTCPConn] before read 
2015-05-20 23:40:19.834 [DEBU] 54f9c [handleTCPConn] end read rlen: 4068 
2015-05-20 23:40:19.834 [INFO] 54f9d [processPushMsg] send 39 direct msg to connector conn0101 latency: 697753.906 

dispatcher strace log (sudo strace -f -v -x -tt -T -p 92324 -o strace_recv.log):

94951 23:40:19.629828 write(3, "2015-05-20 23:40:19.629 [INFO] 5"..., 115) = 115 <0.000038> 
94951 23:40:19.629958 clock_gettime(CLOCK_REALTIME, {1432136419, 629994695}) = 0 <0.000031> 
94951 23:40:19.630103 clock_gettime(CLOCK_REALTIME, {1432136419, 630143536}) = 0 <0.000032> 
94951 23:40:19.630219 write(3, "2015-05-20 23:40:19.629 [DEBU] 5"..., 81) = 81 <0.000036> 
94951 23:40:19.630337 clock_gettime(CLOCK_REALTIME, {1432136419, 630373340}) = 0 <0.000030> 
94951 23:40:19.630479 clock_gettime(CLOCK_REALTIME, {1432136419, 630518521}) = 0 <0.000031> 
94951 23:40:19.630600 write(3, "2015-05-20 23:40:19.630 [DEBU] 5"..., 65) = 65 <0.000036> 
94951 23:40:19.630717 read(9, 0xc20805f01c, 4068) = -1 EAGAIN (Resource temporarily unavailable) <0.000033> 
94951 23:40:19.630833 write(8, "\x01\x03\xf1\x00\x01\x00\x0c\x55\x5c\xa8\x4d\x62\xbb\x97\x68\xde\x01\xd1\x12\x02\x00\x10\x31\x34\x33\x32\x31\x33\x35\x37\x32\x32"..., 1013) = 1013 <0.000103> 
94951 23:40:19.631027 write(8, "\x01\x01\xf9\x00\x03\x01\xd4\x55\x5c\x9c\x2a\x62\xbb\x97\x68\x8c\x00\x02\xa9\x55\x5c\x9c\x2a\x62\xbb\x97\x68\x8c\x00\x02\xfc\x55"..., 509) = 509 <0.000067> 
94951 23:40:19.631187 write(8, "\x01\x03\xf1\x00\x03\x03\xcc\x55\x5c\x9c\x29\x62\xbb\x97\x68\x8c\x00\x01\x3d\x55\x5c\x9c\x2a\x62\xbb\x97\x68\x8c\x00\x02\x93\x55"..., 1013) = 1013 <0.000058> 
94951 23:40:19.631332 epoll_wait(6, {}, 128, 0) = 0 <0.000035> 
94951 23:40:19.631440 epoll_wait(6, <unfinished ...> 
92326 23:40:19.631496 <... futex resumed> ) = 0 <0.205872> 
92326 23:40:19.631554 clock_gettime(CLOCK_MONOTONIC, {1298109, 942247030}) = 0 <0.000033> 
92326 23:40:19.631664 clock_gettime(CLOCK_REALTIME, {1432136419, 631698624}) = 0 <0.000031> 
92326 23:40:19.631767 select(0, NULL, NULL, NULL, {0, 20} <unfinished ...> 
92324 23:40:19.631845 <... futex resumed> ) = 0 <0.206158> 
92324 23:40:19.631911 epoll_wait(6, {}, 128, 0) = 0 <0.000064> 
92326 23:40:19.632048 <... select resumed> ) = 0 (Timeout) <0.000248> 
92326 23:40:19.632112 clock_gettime(CLOCK_MONOTONIC, {1298109, 942807377}) = 0 <0.000031> 
92326 23:40:19.632215 futex(0x8f4de0, FUTEX_WAIT, 0, {60, 0} <unfinished ...> 
92324 23:40:19.632279 futex(0x8f5838, FUTEX_WAIT, 0, NULL <unfinished ...> 
94951 23:40:19.833623 <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT, {u32=1626358040, u64=140674690203928}}}, 128, -1) = 1 <0.202147>
94951 23:40:19.833780 clock_gettime(CLOCK_MONOTONIC, {1298110, 144480859}) = 0 <0.000032> 
94951 23:40:19.833902 futex(0x8f5838, FUTEX_WAKE, 1) = 1 <0.000044> 
94951 23:40:19.834024 futex(0x8f4de0, FUTEX_WAKE, 1) = 1 <0.000042> 
94951 23:40:19.834131 read(9, "\x5c\x9c\x29\x62\xbb\x97\x68\x8c\x00\x01\x56\x63\x6f\x6e\x6e\x30\x31\x30\x31\x55\x5c\x9c\x2a\x62\xbb\x97\x68\x8c\x00\x03\x24\x63"..., 4068) = 4068 <0.000039> 
94951 23:40:19.834268 clock_gettime(CLOCK_REALTIME, {1432136419, 834305215}) = 0 <0.000030> 
94951 23:40:19.834476 clock_gettime(CLOCK_REALTIME, {1432136419, 834519537}) = 0 <0.000033> 
94951 23:40:19.834601 write(3, "2015-05-20 23:40:19.834 [DEBU] 5"..., 73) = 73 <0.000042> 
94951 23:40:19.834778 clock_gettime(CLOCK_REALTIME, {1432136419, 834819514}) = 0 <0.000032> 
94951 23:40:19.834896 clock_gettime(CLOCK_REALTIME, {1432136419, 834931088}) = 0 <0.000030> 
94951 23:40:19.835061 clock_gettime(CLOCK_REALTIME, {1432136419, 835102950}) = 0 <0.000033> 
94951 23:40:19.835180 write(3, "2015-05-20 23:40:19.834 [INFO] 5"..., 115) = 115 <0.000037>

refer to: https://www.v2ex.com/t/192510

amyangfei commented 9 years ago

tcpdump (sudo tcpdump -i lo -ntttt 'tcp port 6001' -s 0 -S)

2015-05-21 11:20:21.125435 IP 10.0.10.101.14100 > 10.0.10.101.6001: Flags [P.], seq 2265563806:2265567902, ack 2977095642, win 4, options [nop,nop,TS val 334952860 ecr 334952809], length 4096
2015-05-21 11:20:21.128627 IP 10.0.10.101.6001 > 10.0.10.101.14100: Flags [.], ack 2265567902, win 8, options [nop,nop,TS val 334952860 ecr 334952860], length 0
2015-05-21 11:20:21.329465 IP 10.0.10.101.14100 > 10.0.10.101.6001: Flags [P.], seq 2265567902:2265571998, ack 2977095642, win 4, options [nop,nop,TS val 334952911 ecr 334952860], length 4096
2015-05-21 11:20:21.330027 IP 10.0.10.101.6001 > 10.0.10.101.14100: Flags [.], ack 2265571998, win 8, options [nop,nop,TS val 334952911 ecr 334952911], length 0
2015-05-21 11:20:21.533450 IP 10.0.10.101.14100 > 10.0.10.101.6001: Flags [P.], seq 2265571998:2265576094, ack 2977095642, win 4, options [nop,nop,TS val 334952962 ecr 334952911], length 4096
2015-05-21 11:20:21.534057 IP 10.0.10.101.6001 > 10.0.10.101.14100: Flags [.], ack 2265576094, win 8, options [nop,nop,TS val 334952962 ecr 334952962], length 0

dispatcher.log

2015-05-21 11:20:20.922 [DEBU] ca8b [handleTCPConn] before read
2015-05-21 11:20:21.125 [DEBU] ca8c [handleTCPConn] end read rlen: 3515
2015-05-21 11:20:21.125 [INFO] ca8d [processPushMsg] send 39 direct msg to connector conn0101 latency: 225197.502
2015-05-21 11:20:21.127 [INFO] ca8e [processPushMsg] send 81 direct msg to connector conn0101 latency: 225199.137
2015-05-21 11:20:21.128 [INFO] ca8f [processPushMsg] send 39 direct msg to connector conn0101 latency: 225199.998
2015-05-21 11:20:21.128 [DEBU] ca90 [handleTCPConn] before get new bufio reader
2015-05-21 11:20:21.128 [DEBU] ca91 [handleTCPConn] before read
2015-05-21 11:20:21.128 [DEBU] ca92 [handleTCPConn] end read rlen: 581
2015-05-21 11:20:21.128 [DEBU] ca93 [handleTCPConn] before get new bufio reader
2015-05-21 11:20:21.128 [DEBU] ca94 [handleTCPConn] before read
2015-05-21 11:20:21.329 [DEBU] ca95 [handleTCPConn] end read rlen: 2722
2015-05-21 11:20:21.329 [INFO] ca96 [processPushMsg] send 81 direct msg to connector conn0101 latency: 225401.536
2015-05-21 11:20:21.329 [INFO] ca97 [processPushMsg] send 39 direct msg to connector conn0101 latency: 225401.630
2015-05-21 11:20:21.329 [DEBU] ca98 [handleTCPConn] before get new bufio reader
2015-05-21 11:20:21.329 [DEBU] ca99 [handleTCPConn] before read
2015-05-21 11:20:21.330 [DEBU] ca9a [handleTCPConn] end read rlen: 1374
2015-05-21 11:20:21.330 [INFO] ca9b [processPushMsg] send 81 direct msg to connector conn0101 latency: 225401.934
2015-05-21 11:20:21.330 [INFO] ca9c [processPushMsg] send 39 direct msg to connector conn0101 latency: 225402.015
2015-05-21 11:20:21.330 [DEBU] ca9d [handleTCPConn] before get new bufio reader
2015-05-21 11:20:21.330 [DEBU] ca9e [handleTCPConn] before read
2015-05-21 11:20:21.533 [DEBU] ca9f [handleTCPConn] end read rlen: 3590