freelan-developers / freelan

The main freelan repository.
http://www.freelan.org

Ignoring PRESENTATION every minute for long time #230

Open HenryNe opened 4 years ago

HenryNe commented 4 years ago

After running for a long time (30 days) and a short network problem, the connection between 2 of the 8 nodes is not established. These messages appear every minute on both nodes:

Accepting PRESENTATION from 4.3.2.7:12000 ..
Session established with 4.3.2.7:12000.
Cipher suite: ecdhe_rsa_aes256_gcm_sha384
Elliptic curve: sect571k1
Added system route: eth0 - 4.3.2.7/32 => 188.138.112.1 - metric 0
Ignoring PRESENTATION from 4.3.2.2:12000 as an active session currently exists with this host.
Error deciphering data message from 4.3.2.7:12000: error:00000000:lib(0):func(0):reason(0)
Error deciphering data message from 4.3.2.7:12000: error:00000000:lib(0):func(0):reason(0)
Session with 4.3.2.7:12000 lost (timeout).

All 6 other connections (of 7 in total for 8 nodes) worked perfectly the whole time. After ~2 hours the connection was stable again.

The same issue occurred some days earlier, and I fixed it by restarting freelan on one host.

I suspect both nodes start the connection at the same time. Then both see the other connection and both terminate the session. After exactly 1 minute they start the same thing again.

Is it possible to add a random delay before they try to reconnect? Can I set a unique delay for every host, so that they do not try to connect in the same time interval (1 minute)?
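To illustrate the random-delay idea, here is a minimal Python simulation. It is not freelan code and does not model the FSCP protocol; `retry_times`, `count_collisions`, the 60-second period and the 10-second jitter are all assumptions chosen for illustration. It only shows that two nodes retrying on an identical fixed period stay in lockstep, while a random extra delay lets them drift apart.

```python
import random


def retry_times(period, jitter, attempts, rng):
    """Return the times (in seconds) of successive reconnection attempts.

    Each attempt happens `period` seconds after the previous one, plus a
    random extra delay drawn uniformly from [0, jitter].
    """
    t = 0.0
    times = []
    for _ in range(attempts):
        t += period + rng.uniform(0.0, jitter)
        times.append(t)
    return times


def count_collisions(a, b, window=1.0):
    """Count attempts in `a` that fall within `window` seconds of any attempt in `b`."""
    return sum(1 for x in a if any(abs(x - y) <= window for y in b))


if __name__ == "__main__":
    for jitter in (0.0, 10.0):
        # Hypothetical nodes, each with its own random source.
        node_a = retry_times(60.0, jitter, 20, random.Random(1))
        node_b = retry_times(60.0, jitter, 20, random.Random(2))
        hits = count_collisions(node_a, node_b)
        print(f"jitter={jitter}s: {hits} of 20 attempts collide")
```

With no jitter every attempt of one node coincides with an attempt of the other. Whether freelan exposes a setting that would allow this kind of per-host delay is not something this sketch answers.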

richman1000000 commented 4 years ago

I think this is an issue with your internet connection. I had a similar issue. These two messages, "Ignoring PRESENTATION from 4.3.2.2:12000" and "Error deciphering data message from 4.3.2.7:12000: error:00000000:lib(0):func(0):reason(0)", are not related.

HenryNe commented 4 years ago

I should add that I use UDP and one of the hosts is behind a NAT. Could the error be a side effect of concurrent PRESENTATION messages and the UDP port over NAT?

Both hosts have good connections to the 6 other nodes at this time.

If I stop freelan for 5 or more seconds on one of the hosts and then start it again, everything works. But if I use "restart", the connection does not come back.

HenryNe commented 4 years ago

Here are the last lines of the syslogs, from the 2 minutes before the connection was established. The timestamps on the hosts are in sync.

15:58:14 host_6: Accepting PRESENTATION from host 1
15:58:14 host_1: Ignoring PRESENTATION from host 6
15:58:24 host_1: Session lost (timeout)

15:58:46 host_1: Accepting PRESENTATION from host 6
15:58:46 host_6: Ignoring PRESENTATION from host 1
15:58:47 host_6: Session lost (timeout)

15:59:14 host_6: Accepting PRESENTATION from host 1
15:59:14 host_1: Ignoring PRESENTATION from host 6
15:59:24 host_1: Session lost (timeout)

15:59:46 host_1: Accepting PRESENTATION from host 6
15:59:46 host_6: Ignoring PRESENTATION from host 1
15:59:47 host_6: Session lost (timeout)
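The timing pattern in this excerpt can be checked with a small Python sketch. The `LOG` text is the excerpt above written one entry per line; `parse` and `accept_intervals` are hypothetical helper names, and the format assumption (`HH:MM:SS host: message`) applies only to this condensed excerpt, not to the raw syslog files.

```python
from datetime import datetime

LOG = """\
15:58:14 host_6: Accepting PRESENTATION from host 1
15:58:14 host_1: Ignoring PRESENTATION from host 6
15:58:24 host_1: Session lost (timeout)
15:58:46 host_1: Accepting PRESENTATION from host 6
15:58:46 host_6: Ignoring PRESENTATION from host 1
15:58:47 host_6: Session lost (timeout)
15:59:14 host_6: Accepting PRESENTATION from host 1
15:59:14 host_1: Ignoring PRESENTATION from host 6
15:59:24 host_1: Session lost (timeout)
15:59:46 host_1: Accepting PRESENTATION from host 6
15:59:46 host_6: Ignoring PRESENTATION from host 1
15:59:47 host_6: Session lost (timeout)
"""


def parse(log):
    """Split each line into (timestamp, host, message)."""
    events = []
    for line in log.splitlines():
        if not line.strip():
            continue
        stamp, host, message = line.split(" ", 2)
        when = datetime.strptime(stamp, "%H:%M:%S")
        events.append((when, host.rstrip(":"), message))
    return events


def accept_intervals(events):
    """Seconds between consecutive 'Accepting PRESENTATION' events, per host."""
    last_seen = {}
    intervals = {}
    for when, host, message in events:
        if message.startswith("Accepting PRESENTATION"):
            if host in last_seen:
                gap = int((when - last_seen[host]).total_seconds())
                intervals.setdefault(host, []).append(gap)
            last_seen[host] = when
    return intervals


if __name__ == "__main__":
    print(accept_intervals(parse(LOG)))
```

For this excerpt, each host accepts a PRESENTATION exactly 60 seconds after its previous one, with the two hosts offset by 32 seconds, which matches the fixed one-minute cycle described earlier in the thread.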

host_1.txt host_6.txt

richman1000000 commented 4 years ago

This error is definitely a network issue: "Error deciphering data message from 4.3.2.7:12000: error:00000000:lib(0):func(0):reason(0)"

Did you set up port forwarding on your NAT routers, or are you using dynamic contacts?

HenryNe commented 4 years ago

Only one host is behind a NAT with a static port forward. All 8 hosts have 7 entries in "contact=", and all have a static IP address. See config for node1: freelan.conf.txt

HenryNe commented 4 years ago

I don't believe it is a network problem, because the 6 other connections between the other 7 hosts do not have this issue. They typically reconnect 1 minute after a problem.

richman1000000 commented 4 years ago

OK. Try checking the PMTU between both nodes, or set a fixed MTU in the freelan config.

HenryNe commented 3 years ago

I have checked the MTU between all nodes. It is 1500 everywhere. I used a command like this:

# ping -c 1 -M do -s 1472 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data.
76 bytes from 8.8.8.8: icmp_seq=1 ttl=119 (truncated)

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 22.291/22.291/22.291/0.000 ms

A negative check:

# ping -c 1 -M do -s 1473 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1473(1501) bytes of data.
ping: local error: Message too long, mtu=1500

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
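The payload sizes used in these two commands follow from the IPv4 and ICMP header sizes, as ping itself reports with `1472(1500)` and `1473(1501)`. A short Python sketch of that arithmetic, assuming IPv4 without options; the function names are made up for illustration:

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def ping_payload_for_mtu(mtu):
    """Largest `ping -s` value that still fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def packet_size(payload):
    """Total IPv4 packet size produced by `ping -s payload`."""
    return payload + ICMP_HEADER + IPV4_HEADER


if __name__ == "__main__":
    print(ping_payload_for_mtu(1500))
    print(packet_size(1473))
```

So `-s 1472` produces a 1500-byte packet that fits the MTU, while `-s 1473` produces 1501 bytes and is rejected because `-M do` forbids fragmentation.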