Open jakubgs opened 6 years ago
The strace output doesn't look strange:
sendto(9, "\363\223\263\202\322M\336\361\263\27\255\304\355L\365\257\34N\236\263\316\355\334\200\322\24B\331\"\240\245\352"..., 400, 0, NULL, 0) = 400
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=434335}) = 56 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=434317})
recvfrom(9, "_a\r\21\22\206H*8\256\315\26\177\2\31NAV7H(}>\10\277v\257P\276\367\353`"..., 2169, 0, NULL, NULL) = 2169
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=428349}) = 57 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [9 10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=428328})
sendto(9, "@\23\267\3077\204\264\t\351;\300\305\3533\"\346\02110\357\254}\355\322\36\\\10\2346\3649p"..., 488, 0, NULL, 0) = 488
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=427833}) = 56 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=427823})
recvfrom(9, "a\321\264\271 \344\366\234v\204\5\335<\316\213\204\5:\"qR\352\34\r\6\357O\257w[\35V"..., 2111, 0, NULL, NULL) = 2111
mremap(0x7f4df1ae9000, 12382208, 12386304, MREMAP_MAYMOVE) = 0x7f4df1ae9000
mremap(0x7f4dfafe7000, 12591104, 12595200, MREMAP_MAYMOVE) = 0x7f4dfafe7000
mremap(0x7f4dfa23c000, 12566528, 12570624, MREMAP_MAYMOVE) = 0x7f4dfa23c000
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=419109}) = 57 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [9 10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=419081})
sendto(9, "\240\312\251*\343\34r\210\247\251\200J\21G\n\223\207b\317\211}\22\313[\265.\210\v\226\2\3333"..., 265, 0, NULL, 0) = 265
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=418443}) = 56 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=418430})
recvfrom(9, "\252\301X \356\36d\264{\0325\230B\233H\2\255X\215\34.\r\316F\211\262\3\247\26^[\r"..., 2118, 0, NULL, NULL) = 2118
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=411550}) = 57 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [9 10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=411527})
sendto(9, "\245\366\0300\350\313\32p\227\211\235\246\211$\214\244\242T\366ds\\\357)\2003\270b\17\352\232K"..., 133, 0, NULL, 0) = 133
select(69, [4 5 6 7 8 9 10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59 68], [10 11 12 13 14 15 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], NULL, {tv_sec=0, tv_usec=411013}) = 56 (in [9 10 13 14 15 19 24 25 27 32 36 37 38 39 40 41 42 53 54 55 56 57 58 59], out [10 12 13 14 15 17 19 20 21 23 24 25 26 27 30 32 34 35 36 37 38 39 40 41 42 53 54 55 56 57 58 59], left {tv_sec=0, tv_usec=411003})
recvfrom(9, "\374C\372\246\371\370}q'\10\256\23\305\364\26\311\301\352s\254\242\32\260\245\212&\346)\245x\364\271"..., 2157, 0, NULL, NULL) = 2157
mremap(0x7f4df1ae9000, 12386304, 12390400, MREMAP_MAYMOVE) = 0x7f4df1ae9000
But I can't tell.
With --debug=5 I'm getting:
tincd[7131]: Got ADD_EDGE from host1 (1.2.3.4 port 54734): 12 6d433d1f host6 host2 2.3.4.5 655 f 268
tincd[7131]: Got ADD_EDGE from host1 (1.2.3.4 port 54734) which does not match existing entry
tincd[7131]: Forwarding ADD_EDGE from host1 (1.2.3.4 port 54734): 12 6d433d1f host6 host2 2.3.4.5 655 f 268
tincd[7131]: Sending 99 bytes of metadata to host5 (35.226.21.19 port 39934)
tincd[7131]: Got ADD_EDGE from host1 (1.2.3.4 port 54734): 12 6526e2de host4 host3 3.4.5.6 655 f 181
tincd[7131]: Got ADD_EDGE from host1 (1.2.3.4 port 54734) which does not match existing entry
Even though the hosts are clearly in the configuration.
Okay, I think I've identified something that might be causing this. I can see a ridiculous number of the following message:
Sending 105 bytes of metadata to
They appear all over the place after the 100% CPU usage starts, and by a ridiculous amount I mean 10-30 thousand per second.
~$ grep '14:00:01' /var/log/syslog | grep 'bytes of metadata' | wc -l
2462
~$ grep '14:00:02' /var/log/syslog | grep 'bytes of metadata' | wc -l
14856
~$ grep '14:00:03' /var/log/syslog | grep 'bytes of metadata' | wc -l
14001
~$ grep '14:00:04' /var/log/syslog | grep 'bytes of metadata' | wc -l
17245
~$ grep '14:00:06' /var/log/syslog | grep 'bytes of metadata' | wc -l
22863
~$ grep '14:00:10' /var/log/syslog | grep 'bytes of metadata' | wc -l
30810
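Rather than grepping second by second, the whole log can be tallied in one pass. A minimal sketch, assuming the standard syslog timestamp sits in field 3; it is demonstrated here on inline sample lines rather than the real /var/log/syslog:

```shell
# Count "bytes of metadata" messages per second; field 3 of a standard
# syslog line ("Jun  1 14:00:02 host tincd[...]: ...") is the timestamp.
# Against the real log, replace the here-document with /var/log/syslog.
awk '/bytes of metadata/ { count[$3]++ }
     END { for (t in count) print t, count[t] }' <<'EOF' | sort
Jun  1 14:00:01 host tincd[7131]: Sending 105 bytes of metadata to host5
Jun  1 14:00:02 host tincd[7131]: Sending 99 bytes of metadata to host3
Jun  1 14:00:02 host tincd[7131]: Sending 105 bytes of metadata to host5
Jun  1 14:00:02 host tincd[7131]: Got KEY_CHANGED from host4
EOF
# prints:
# 14:00:01 1
# 14:00:02 2
```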
And this happened RIGHT after I saw:
Got KEY_CHANGED from
Twice for every single host in the network. All at the same time.
I can see there is a KeyExpire option in the config that defaults to 3600 seconds, so keys expire every hour.
The documentation doesn't say anything about this, but if I set it to 0, would key expiry be disabled?
Instead I'll set it to something very high and see if that staves off the symptoms. If it does, that would indicate key expiry is at fault here.
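For reference, that would just be one line in tinc.conf; a sketch (NETNAME and the exact value are whatever fits your setup):

```
# /etc/tinc/NETNAME/tinc.conf (fragment)
KeyExpire = 604800   # regenerate session keys weekly instead of the default 3600s (hourly)
```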
I'm looking at the code and I don't get why they would all synchronize like that when sending the new keys:
https://github.com/gsliepen/tinc/blob/master/src/net.c#L507
Although to be fair, I was testing the 1.1pre15 version on many hosts; maybe that's implemented differently.
Nah, I went back to 1.0.33 (Ubuntu 18.04) on all hosts and it still happens.
Seems like the KEY_CHANGED thing was a red herring. I've tried using:
KeyExpire = 604800
on all hosts, but the CPU issue still happened after a while.
This is how it looks to me: it seems to be doing an awful lot of writing:
1603047.059 ( 0.078 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) ...
1603047.137 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee340, count: 86 ) = 86
LOST 1 events!
1603047.153 ( 0.116 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) ...
1603047.269 ( 0.009 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603047.297 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603047.357 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee370, count: 115 ) = 115
LOST 4 events!
1603047.376 ( 0.062 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603047.567 ( 0.008 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603047.591 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603047.650 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee370, count: 115 ) = 115
1603047.672 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
LOST 4 events!
1603047.857 ( 0.008 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603047.881 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.012 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee370, count: 115 ) = 115
1603048.034 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.090 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee340, count: 86 ) = 86
LOST 2 events!
1603048.251 ( 0.010 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603048.279 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.341 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee370, count: 115 ) = 115
LOST 4 events!
1603048.419 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.555 ( 0.007 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603048.579 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.650 ( 0.006 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1df0450, count: 79 ) = 79
1603048.670 ( 0.005 ms): write(fd: 2<socket:[960674]>, buf: 0x7f89a661e703, count: 1 ) = 1
1603048.726 ( 0.007 ms): write(fd: 2<socket:[960674]>, buf: 0x7ffdc1dee370, count: 115 ) = 115
I think I've found the secret sauce. I'm currently using the following options:
Forwarding = off
DirectOnly = yes
TunnelServer = yes
And I haven't seen the CPU spike so far.
What is your theory of why those options help?
It's dark magic and I have no idea.
My vague hypothesis is that the CPU issues coincided with a lot of "Metadata socket read error for" messages, a metric shitload of "Sending X bytes of metadata", and a sprinkling of "Forwarding ADD_EDGE from", which suggests to me that this issue is related to internal (meta) Tinc traffic, rather than actual traffic generated by the hosts.
Based on that I decided to use those 3 options, because it seems to me like they would reduce meta traffic, especially across multiple nodes forwarding it (I was suspecting a loop):
Forwarding = off - Incoming packets that are not meant for the local node, but which should be forwarded to another node, are dropped.
DirectOnly = yes - When this option is enabled, packets that cannot be sent directly to the destination node, but which would have to be forwarded by an intermediate node, are dropped instead. When combined with the IndirectData option, packets for nodes with which we do not have a meta connection are also dropped.
TunnelServer = yes - When this option is enabled tinc will no longer forward information between other tinc daemons, and will only allow connections with nodes for which host config files are present in the local /etc/tinc/NETNAME/hosts/ directory. Setting this option also implicitly sets StrictSubnets.
I think one, some, or all of those options have stopped some kind of metadata loop storm.
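For anyone who wants to try the same thing, this is roughly the tinc.conf fragment involved (Name and ConnectTo are placeholders for your own setup):

```
# /etc/tinc/NETNAME/tinc.conf (fragment; Name/ConnectTo are placeholders)
Name = myhost
ConnectTo = somehub

Forwarding = off     # drop packets that would be forwarded for other nodes
DirectOnly = yes     # drop packets that cannot be sent directly to the destination
TunnelServer = yes   # don't forward info between other tinc daemons; implies StrictSubnets
```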
Were you already using TunnelServer = yes on one or more of the nodes when this issue first occurred?
I'm pretty sure I was not, but I can check my git history later to verify.
@jakubgs can you check the repository? @gsliepen, what is your theory on the issue? I am planning a similarly large deployment and would like to know how likely it is to occur, and what it would take to remedy this. Were you able to reproduce the bug?
Were you already using TunnelServer = yes on one or more of the nodes when this issue first occurred?
No, adding it is what fixed it.
This issue needs more investigation, although from https://github.com/gsliepen/tinc/issues/202#issuecomment-867195496, jakubgs no longer uses tinc.
I am also observing this issue in my network: sometimes there is very high physical traffic without much virtual traffic, high CPU load from tinc, and very high latency.
I tried to set TunnelServer = yes, but now hosts that cannot set up a direct connection cannot communicate with each other, even though all involved nodes now have all host information available in /etc/tinc/NETNAME/hosts. Only nodes of which one has a ConnectTo statement pointing to the other can communicate over tinc.
Is this an expected effect of TunnelServer = yes? If I understand the manpage correctly, this option only disables the forwarding of metadata but still allows forwarding of packets not meant for the local host. Is a ConnectTo statement required to communicate with a host if TunnelServer = yes?
Is there any way I can assist in identifying the root issue of this?
Right now with ~500 nodes I have to restart the entire network roughly once a week (usually around 4 am), and I really do fear that things could get worse if I add more nodes in the future.
Do you have high CPU/MEM usage or just "Connection reset by peer" ?
Do you have high CPU/MEM usage or just "Connection reset by peer" ?
I also have a very high CPU usage and exploding ping times between hosts (normally, servers have a ping of 3 ms or less; during such events this goes up to many seconds). I didn't monitor the memory usage though.
We found a solution for "Connection reset by peer"... I don't think it's the best solution, but it works:
tcpdump -n -i eth0 'tcp port 655 and tcp[tcpflags] == tcp-syn'
That command will print every new connection to port 655. If there are too many connections from one host, we just block its IP address for 30 minutes with iptables: iptables -A INPUT -s ..... -p tcp --dport 655 -j REJECT
Maybe that can help you ?
That command will print every new connection to port 655. If there are too many connections from one host, we just block its IP address for 30 minutes with iptables: iptables -A INPUT -s ..... -p tcp --dport 655 -j REJECT
Oh, sounds like this is worth a try! May I ask how you automated this? With a custom python script?
We're using Node.js, but you can do it in Python. You need to unbuffer tcpdump's output, so the real command is:
stdbuf -i0 -o0 -e0 tcpdump -n -i eth0 'tcp port 655 and tcp[tcpflags] == tcp-syn'
Then from each line of the output (split on \n) you can extract the remote address. A line will look like:
17:58:44.476135 IP [REMOTE ADDRESS] > [YOUR PUBLIC ADDRESS].655: Flags [S], seq 858164035, win 29200, options [mss 1386,sackOK,TS val 1168206 ecr 0,nop,wscale 7], length 0
Then you create the iptables rule:
iptables -A INPUT -i eth0 -s [REMOTE ADDRESS] -p tcp --dport 655 -j REJECT
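In case it helps, here is a minimal Python sketch of the approach described above. The regex, the helper names, and leaving the rate threshold to the caller are all my own choices, not anything from the original script; the iptables arguments mirror the rule above:

```python
import re
import subprocess

# Matches the source address in `tcpdump -n` lines such as:
#   17:58:44.476135 IP 203.0.113.7.49152 > 198.51.100.1.655: Flags [S], ...
# (in tcpdump's -n output the remote port is appended to the address with a dot)
SYN_LINE = re.compile(r'^\S+ IP (\d{1,3}(?:\.\d{1,3}){3})\.\d+ >')

def remote_address(line):
    """Return the remote IPv4 address from one tcpdump line, or None."""
    match = SYN_LINE.match(line)
    return match.group(1) if match else None

def block(ip):
    """Reject further TCP connections to the tinc port from this address."""
    subprocess.run(['iptables', '-A', 'INPUT', '-i', 'eth0', '-s', ip,
                    '-p', 'tcp', '--dport', '655', '-j', 'REJECT'], check=True)

# Usage sketch: pipe the unbuffered tcpdump command into this script, count
# SYNs per address (e.g. with collections.Counter), and call block() once a
# source exceeds your chosen threshold (then remove the rule again later).
```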
Hope it can help
Awesome, thank you very much! I'll definitely try this.
Don't forget to remove the rule after some time ^^
Don't forget to remove the rule after some time ^^
Haha, true. This would make some ppl angry otherwise.
Just for reference, I am now using an iptables rule like that:
iptables -A INPUT -p tcp -m tcp --dport tinc -m hashlimit --hashlimit-name tinc_tcp --hashlimit-mode srcip --hashlimit-upto 5/min --hashlimit-burst 2 -j ACCEPT
This will only accept 5 requests per minute per source IP, with a burst of 2. So far, it seems to work.
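One caveat, as far as I understand the hashlimit match: the ACCEPT rule only limits anything if packets above the limit are subsequently rejected by a later rule (or by the chain's policy); otherwise they fall through and a default-ACCEPT policy lets them in anyway. You likely also want to restrict the limit to NEW connections so established tinc sessions aren't throttled. A sketch of the full pair (assuming a separate rule already accepts ESTABLISHED,RELATED traffic):

```
# Rate-limit only NEW connections so established tinc sessions aren't throttled...
iptables -A INPUT -p tcp --dport 655 -m conntrack --ctstate NEW \
  -m hashlimit --hashlimit-name tinc_tcp --hashlimit-mode srcip \
  --hashlimit-upto 5/min --hashlimit-burst 2 -j ACCEPT
# ...and reject NEW connections that exceed the limit.
iptables -A INPUT -p tcp --dport 655 -m conntrack --ctstate NEW -j REJECT
```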
Hey,
I've been trying to figure out an issue with Tinc running on my servers that I discovered yesterday. I have 63 hosts inter-connected via Tinc, and I'm getting 100% CPU usage and growing memory usage from Tinc after I start it on most of the hosts:
And when I stop most of the Tinc services on the servers, the CPU load does not go down; I have to kill the processes myself. But when I start them back up again, I get insanely high CPU usage again. I have no idea what is causing this, and it makes Tinc essentially unusable.