adrienverge / openfortivpn

Client for PPP+TLS VPN tunnel services
GNU General Public License v3.0

Rsync Fails over openfortivpn and ssh #184

Closed robmukai closed 7 years ago

robmukai commented 7 years ago

This may be similar to #154, so if it is, please close it. I am running Ubuntu 16.04.3 with the latest version of openfortivpn compiled from source. I run a backup over SSH through openfortivpn to a box behind a FortiGate. I am in a pretty remote location and my internet is over a microwave connection, although the signal is usually pretty good. I can also run this backup on a Windows 10 machine using the FortiClient from Fortinet. It does disconnect on occasion, but more randomly.

What happens is, I can connect to the Fortigate through the openfortivpn just fine. I can also start the Rsync process just fine. However at the same point in the backup for each directory, it seems to "hang", and the openfortivpn closes. I back up a few different directories, and all the directories will "hang" this way. This happens using different source hard drives, and different destination hard drives.

Here is the end of the session: ` DEBUG: pppd ---> gateway (201 bytes) pppd: 00 21 45 00 00 c7 7e 51 40 00 01 11 ff d9 0a 00 01 01 ef ff ff fa cf ee 07 6c 00 b3 68 54 4d 2d 53 45 41 52 43 48 20 2a 20 48 54 54 50 2f 31 2e 31 0d 0a 48 4f 53 54 3a 20 32 33 39 2e 32 35 35 2e 32 35 35 2e 32 35 30 3a 31 39 30 30 0d 0a 4d 41 4e 3a 20 22 73 73 64 70 3a 64 69 73 63 6f 76 65 72 22 0d 0a 4d 58 3a 20 31 0d 0a 53 54 3a 20 75 72 6e 3a 64 69 61 6c 2d 6d 75 6c 74 69 73 63 72 65 65 6e 2d 6f 72 67 3a 73 65 72 76 69 63 65 3a 64 69 61 6c 3a 31 0d 0a 55 53 45 52 2d 41 47 45 4e 54 3a 20 47 6f 6f 67 6c 65 20 43 68 72 6f 6d 65 2f 36 31 2e 30 2e 33 31 36 33 2e 39 31 20 4c 69 6e 75 78 0d 0a 0d 0a

DEBUG: pppd ---> gateway (201 bytes) pppd: 00 21 45 00 00 c7 7e b7 40 00 01 11 ff 73 0a 00 01 01 ef ff ff fa cf ee 07 6c 00 b3 68 54 4d 2d 53 45 41 52 43 48 20 2a 20 48 54 54 50 2f 31 2e 31 0d 0a 48 4f 53 54 3a 20 32 33 39 2e 32 35 35 2e 32 35 35 2e 32 35 30 3a 31 39 30 30 0d 0a 4d 41 4e 3a 20 22 73 73 64 70 3a 64 69 73 63 6f 76 65 72 22 0d 0a 4d 58 3a 20 31 0d 0a 53 54 3a 20 75 72 6e 3a 64 69 61 6c 2d 6d 75 6c 74 69 73 63 72 65 65 6e 2d 6f 72 67 3a 73 65 72 76 69 63 65 3a 64 69 61 6c 3a 31 0d 0a 55 53 45 52 2d 41 47 45 4e 54 3a 20 47 6f 6f 67 6c 65 20 43 68 72 6f 6d 65 2f 36 31 2e 30 2e 33 31 36 33 2e 39 31 20 4c 69 6e 75 78 0d 0a 0d 0a

DEBUG: pppd ---> gateway (25 bytes) pppd: c0 21 05 02 00 17 50 65 65 72 20 6e 6f 74 20 72 65 73 70 6f 6e 64 69 6e 67

DEBUG: pppd ---> gateway (25 bytes) pppd: c0 21 05 03 00 17 50 65 65 72 20 6e 6f 74 20 72 65 73 70 6f 6e 64 69 6e 67

ERROR: read: Input/output error
INFO: Cancelling threads...
INFO: Setting ppp interface down.
INFO: Restoring routes...
DEBUG: ip route del to XX.XXX.XXX.XXX/255.255.255.255 via XXX.XXX.X.X dev wlp2s0
INFO: Removing VPN nameservers...
DEBUG: Waiting for pppd to exit...
DEBUG: waitpid: pppd exit status code 16
INFO: Terminated pppd.
INFO: Closed connection to gateway.
DEBUG: Gateway certificate validation failed.
DEBUG: Gateway certificate digest found in white list.
INFO: Logged out. `

The last pppd message is: À!Peer not responding
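The last two 25-byte frames in the dump can be decoded by hand. The following is a minimal sketch, not part of openfortivpn: it assumes the dump starts with the two-byte PPP protocol field followed by a standard LCP header (code, identifier, length) as laid out in RFC 1661. The function name `decode_lcp` and the constant names are made up for illustration.

```python
import struct

# PPP protocol number for LCP and a few LCP codes, per RFC 1661.
PPP_LCP = 0xC021
LCP_CODES = {
    1: "Configure-Request",
    2: "Configure-Ack",
    5: "Terminate-Request",
    6: "Terminate-Ack",
    9: "Echo-Request",
    10: "Echo-Reply",
}

# Hex bytes copied from the last "pppd ---> gateway (25 bytes)" line of the log.
LAST_FRAME = (
    "c0 21 05 03 00 17 50 65 65 72 20 6e 6f 74 20 "
    "72 65 73 70 6f 6e 64 69 6e 67"
)


def decode_lcp(hex_dump):
    """Decode a PPP frame carrying an LCP packet from a spaced hex dump."""
    raw = bytes.fromhex(hex_dump)
    # 2-byte protocol, 1-byte code, 1-byte identifier, 2-byte length.
    protocol, code, identifier, length = struct.unpack("!HBBH", raw[:6])
    if protocol != PPP_LCP:
        raise ValueError("not an LCP frame: protocol 0x%04x" % protocol)
    # The length field counts the 4-byte LCP header plus the data,
    # but not the 2-byte protocol field in front of it.
    data = raw[6:2 + length]
    return {
        "code": code,
        "code_name": LCP_CODES.get(code, "unknown"),
        "identifier": identifier,
        "length": length,
        "data": data.decode("ascii", errors="replace"),
    }


frame = decode_lcp(LAST_FRAME)
print(frame["code_name"], "-", frame["data"])
```

Read this way, the leading `c0 21` bytes are the LCP protocol number rather than text, which is why they show up as stray characters in front of "Peer not responding" when the frame is printed as a string.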

The ERROR: read: Input/output error is the same as #154 , but the cause is different. Any and all help is appreciated. I am willing to do any testing that may be required.

DimitriPapadopoulos commented 7 years ago

This message is printed by code recently added to openfortivpn (74dc069): DEBUG: waitpid: pppd exit status code 16

According to the pppd documentation it means:

The link was terminated because the peer is not responding to echo requests.

For some reason pppd is not able to reach its peer - the gateway. Could be a pppd error in the worst case, or problems with the VPN tunnel itself.

I really don't know how a microwave connection works. Is this a little bit like Wi-Fi, where the connection could be reset, resulting for example in a new DHCP lease? If so, could you check the logs of the microwave connection and find whether something happened when pppd failed?

robmukai commented 7 years ago

@DimitriPapadopoulos The microwave connection is pretty transparent. The antenna connects directly into my wifi router. There really isn't anything on my end to look at. Looking at the router logs, there isn't anything that jumps out.

If I do restart the openfortivpn, the rsync continues until it hits another random spot, then the vpn dies again with the same message. So it appears that the openfortivpn is somehow losing a connection maybe? Or maybe it is timing out too quickly?

DimitriPapadopoulos commented 7 years ago

Among possible causes:

  1. a timeout somewhere (but not in openfortivpn, possibly pppd),
  2. a pppd bug, but then it's a widely used piece of software so I doubt it,
  3. an openfortivpn bug, where openfortivpn fails unbeknownst to pppd,
  4. need to set some network parameters of the MTU and MRU kind, possibly related to the "exotic" microwave router.

I can't help much. Unless some other maintainer can help, I can only suggest:

  1. Instead of comparing openfortivpn/Ubuntu (VPN SSL only) with FortiClient/Windows (IPSec by default), you could compare openfortivpn/Ubuntu with FortiClient/Ubuntu, or alternatively with FortiClient/Windows in SSL mode - not the IPSec default.
  2. You could also compare rsync with and without VPN (use a different destination server if you have to).

robmukai commented 7 years ago

@DimitriPapadopoulos Thanks for following up.

On your two suggestions

  1. openfortivpn/Ubuntu and FortiClient/Ubuntu show the same behavior. FortiClient/Windows is set to SSL-VPN and seems to work.
  2. I'll have to see if I can find a machine to rsync to. I'll let you know what I come up with.

Thanks for your help!

DimitriPapadopoulos commented 7 years ago

Thank you for trying these suggestions.

  1. Since openfortivpn/Ubuntu and FortiClient/Ubuntu share the same behavior, this is probably not an openfortivpn bug - at worst this is a "feature" shared by both clients! More seriously, my gut feeling is that this is related to networking parameters (such as MRU and MTU) - cause 4 in my list of possible causes above. Since FortiClient/Windows in VPN-SSL mode does not share the same behavior and works properly, it could be that these networking parameters are properly set on Windows. It could be interesting to investigate the network settings on both systems - not sure how to collect these settings off the top of my head though.

  2. If a direct rsync doesn't work, then that's definitely a network issue you need to debug without VPN. But I believe it will work. By the way, it would have been better to rsync to the same server with/without VPN - but that's probably not possible since you need a VPN in the first place! If I understand correctly, the problem is that each additional encapsulation adds its own overhead to every packet, which means you may need lower initial MTU/MRU values to leave room for that overhead. Unfortunately all this happens in network layer 2 (data link), which I'm even less familiar with than layer 3....

DimitriPapadopoulos commented 7 years ago

You could perhaps try this recipe, where you increase the size of packets sent by ping until it stops working: Troubleshooting MTU size over IPSEC VPN
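The "increase the packet size until it stops working" recipe can be automated with a binary search. The sketch below is an illustration under stated assumptions, not something taken from the linked article: `largest_payload` and `ping_fits` are hypothetical names, `ping_fits` assumes the Linux iputils `ping` (with its `-M do`, `-c`, `-W` and `-s` options), and the search assumes that if a given payload size gets through, every smaller size does too.

```python
import subprocess


def ping_fits(host, payload):
    """Return True if one unfragmented ping with this payload size succeeds.

    Assumes Linux iputils ping: -M do forbids fragmentation, -c 1 sends a
    single probe, -W 2 waits at most two seconds for the reply.
    """
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_payload(fits, low=0, high=1472):
    """Binary-search the largest payload in [low, high] accepted by fits().

    fits is any callable taking a payload size and returning True or False.
    Returns None if even the smallest payload fails. The default upper bound
    of 1472 is 1500 minus 28 bytes of IPv4 and ICMP headers.
    """
    if not fits(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example use against a real host (replace the placeholder address):
#   largest_payload(lambda size: ping_fits("XXX.XXX.XXX.XXX", size))
```

On a lossy link a single lost probe would be mistaken for "too large", so in practice each size probably deserves a few retries before it is counted as a failure.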

robmukai commented 7 years ago

I'll try playing around with that to see if it makes a difference. Thanks for the ideas.

robmukai commented 7 years ago

Ok, so I'm not sure what I am looking at, but this gives a good ping:

    ping -M do -s 1326 XXX.XXX.XXX.XXX
    PING XXX.XXX.XXX.XXX (XXX.XXX.XXX.XXX) 1326(1354) bytes of data.
    1334 bytes from XXX.XXX.XXX.XXX: icmp_seq=1 ttl=63 time=274 ms

while one byte more fails:

    ping -M do -s 1327 XXX.XXX.XXX.XXX
    ping: local error: Message too long, mtu=1354

So what would you suggest I set the MTU on the Wifi Connection at?
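For readers puzzled by the three different numbers in that output (1326, 1334 and 1354), here is a small sketch of the arithmetic. It assumes plain IPv4 without options (20-byte header) and the 8-byte ICMP echo header; the function names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request/reply header


def icmp_size(payload):
    """Size of the ICMP message, as shown in the 'N bytes from' reply line."""
    return payload + ICMP_HEADER


def packet_size(payload):
    """Size of the whole IP packet, the figure ping prints in parentheses."""
    return payload + ICMP_HEADER + IPV4_HEADER


def max_payload(mtu):
    """Largest 'ping -s' value that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(icmp_size(1326), packet_size(1326), max_payload(1354))
```

So a payload of 1326 bytes is exactly what fits in the reported MTU of 1354, and 1327 is one byte too many.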

DimitriPapadopoulos commented 7 years ago

I think the MTU is set on the inner encapsulated layer (here that would be pppd?) but again I'm not a specialist. For pppd the MTU can be set in the relevant pppd options file (probably somewhere in /etc/ppp) or passed as a parameter to pppd (for that, the openfortivpn code would need to be modified to pass the proper options to pppd). So I'd try modifying the options through a pppd options file first. Unfortunately I don't have time to help much more right now, and I don't know exactly how to set options in pppd.

DimitriPapadopoulos commented 7 years ago

I'd try setting the MTU on the Wi-Fi only if rsync and/or ping fail also without VPN.

robmukai commented 7 years ago

@DimitriPapadopoulos I think we can close this. After testing for a day, changing the MTU on the WIFI connection to 1326 makes it work as well as the windows version does on my connection. Which is to say, it still closes, but only randomly and after it has run for a long time. Thanks for your help in thinking this through!

DimitriPapadopoulos commented 7 years ago

@robmukai Thank you for coming back to us. This will hopefully help other users of the software.

This does look like an issue with your network setup after all, however I'm not 100% certain there's nothing we could do to help within openfortivpn - such as adding an option to set MTU for pppd or at least writing a paragraph about MTU in the documentation.

Also, how long is a long time in your case? Please note that there's a default timeout on the FortiGate server - set by default at 8 hours if I recall correctly.

robmukai commented 7 years ago

@DimitriPapadopoulos I'm wondering about that. I'd be surprised if Windows 10 handles fragmented packets better than Ubuntu. If that is not the case, is there something in the way openfortivpn handles fragmented packets that causes the shutdown? I don't know the answer to that, but the workaround seems to be working well.

Not sure what a "Long Time" is. I usually run it over night, and it is down when I get to it in the morning. However, large files (as in GB sized files) have been transferred. It does occasionally drop in less than 8 hours as well, but that could be due to instability on the Microwave connection. Is there a way to log uptime on the connection? I'll see what the timeout is set for on the FortiGate. Also, is there a reason for the default timeout on the Fortigate?

DimitriPapadopoulos commented 7 years ago

@robmukai I doubt that Ubuntu handles fragmented packets any worse than Windows. Some of the web pages I've read over the last few days say that firewalls may drop fragmented packets because they are a security issue (DoS) - in this case the FortiGate could be dropping the fragmented packets.

It could just be that the MTU is set correctly on Windows but not Ubuntu. Perhaps because there's some sort of driver for the microwave link on the Windows machine - which could perhaps properly set the MTU at 1326.

About the timeout, it's best to have a look at the logs FortiGate-side and check whether it shows a reason for the connection closing. Perhaps Forti support can help. Also ask them the rationale behind the FortiGate-side timeout.

robmukai commented 7 years ago

@DimitriPapadopoulos So I ran a quick check on the Windows 10 box and get this:

netsh interface ipv4 show subinterfaces

    MTU         MediaSenseState  Bytes In  Bytes Out  Interface
    1354        1                3372      35244      fortissl
    4294967295  1                1304      46329      Loopback Pseudo-Interface 1
    1500        5                0         0          Ethernet 4
    1500        1                13897964  1967937    Wi-Fi 3
    1500        5                0         0          Ethernet 5
    1500        5                0         0          Local Area Connection* 18

So the MTU of the FortiClient connection is 1354 (less 28 is 1326). So somewhere, the MTU is being set correctly on Windows. Not sure if it is the FortiClient or Windows itself doing it. You'll notice that the Wi-Fi connection is at 1500.

The microwave connection is completely transparent to the machines downstream. I actually run a small Inn and my guests don't have to do anything special to connect, and they bring all manner of devices from windows, apple, android, etc. I don't have any drivers or anything installed for it.

I'll keep an eye on the connection. If I can find the time it drops out, I can have my buddy, who owns the FortiGate, check the logs to see what the FortiGate sees at the time of disconnect.

Thanks so much for your help!

DimitriPapadopoulos commented 7 years ago

@robmukai Great to have all this information, it will help debug future issues.

For future reference, let's recap what we know so far:

What I don't know is whether MTU should always be set to a value lower than 1500, or only sometimes depending on MTU values along the path.

Also should MTU be set to a constant value, and if so which one, or variable values depending on MTU values along the path? In the latter case, how to discover MTU values along the path?

Other sources refer to setting MSS, not MTU.

On Linux there are tools to discover MTU values along the path. See for example tracepath. I have also read "MTU woes in IPsec tunnels and how you can fix it" and "Path MTU discovery in practice", and although I don't have time to really understand them, setting the MTU does not seem to be that robust a technique after all...

Some links:

robmukai commented 7 years ago

@DimitriPapadopoulos I agree with all 5 of your bullet points. Unfortunately, your questions go beyond my abilities. I guess my thought would be to see if we can figure out how FortiClient/Windows figures out the MTU and how it lowers it on the tunnel.

DimitriPapadopoulos commented 7 years ago

@robmukai For what it's worth, I've just looked up MTU values of the different interfaces on Ubuntu 16.04 LTS :

The MTU of the PPP connection is set to 1354 automatically. I haven't had to force or specify anything here. Isn't that the case when you run openfortivpn?

robmukai commented 7 years ago

@DimitriPapadopoulos Ok, this is really weird. I reset the MTU on my wifi connection back to "auto" and now the ppp0 connection is showing MTU:1354 as well. Funny thing, it now works without changing the MTU. Not sure what would have caused it to not work before?

DimitriPapadopoulos commented 7 years ago

Strange indeed. As far as I can see, openfortivpn does not set the MTU. It has to be handled by pppd.

From the PPPD(8) man page:

mtu n Set the MTU [Maximum Transmit Unit] value to n. Unless the peer requests a smaller value via MRU negotiation, pppd will request that the kernel networking code send data packets of no more than n bytes through the PPP network interface. Note that for the IPv6 protocol, the MTU must be at least 1280.

My guess is that this was a problem with path MTU discovery over pppd. Sometimes MTU discovery doesn't work because of poorly configured “security” appliances. Perhaps something changed along the path?

DimitriPapadopoulos commented 7 years ago

If MTU discovery does not work as expected, users should probably work around the issue in the software responsible for MTU discovery, namely pppd as far as I can tell.

So one answer might be that this is not an openfortivpn issue.

On the other hand openfortivpn could have an option to force MTU, that would simply be passed to pppd as option mtu.

DimitriPapadopoulos commented 7 years ago

For the record the MRU is actually set by openfortivpn to 1354:

        char *args[] = {
            "/usr/sbin/pppd", "38400", "noipdefault", "noaccomp",
            "noauth", "default-asyncmap", "nopcomp", "receive-all",
            "nodefaultroute", ":1.1.1.1", "nodetach",
            "lcp-max-configure", "40", "mru", "1354",
            NULL, NULL, NULL, NULL,
            NULL, NULL, NULL, NULL,
            NULL
        };

This is a mystery! I'll close the issue for now, but do not hesitate to come back to us if needed.
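To make the "option to force MTU" idea from earlier in the thread concrete, here is a rough sketch of how such an option could be appended to the argument list shown above. It is written in Python purely for illustration; openfortivpn itself is C, and `build_pppd_args` and its `mtu` parameter are hypothetical, not an existing openfortivpn feature. The only pppd options used are the ones already quoted in this thread (`mru`, and `mtu` from the man page excerpt).

```python
# Argument list copied from the openfortivpn snippet above.
BASE_PPPD_ARGS = [
    "/usr/sbin/pppd", "38400", "noipdefault", "noaccomp",
    "noauth", "default-asyncmap", "nopcomp", "receive-all",
    "nodefaultroute", ":1.1.1.1", "nodetach",
    "lcp-max-configure", "40", "mru", "1354",
]


def build_pppd_args(mtu=None):
    """Return the pppd argument list, optionally forcing the MTU.

    mtu is a hypothetical user-supplied setting. When it is None the list
    is unchanged and pppd keeps working out the MTU on its own.
    """
    args = list(BASE_PPPD_ARGS)  # copy, so the base list is never modified
    if mtu is not None:
        if mtu < 1:
            raise ValueError("mtu must be a positive number of bytes")
        args += ["mtu", str(mtu)]
    return args


print(build_pppd_args(mtu=1354)[-4:])
```

Keeping the default at "no mtu option" would leave behavior unchanged for users whose setup already works, which seems to be the common case according to this thread.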

CristianCardosoA commented 7 years ago

I'm using Fedora 26. I installed version 1.5.0

I'm getting this issue: pppd: The link was terminated because the peer is not responding to echo requests.

What can I do ?

DimitriPapadopoulos commented 7 years ago

Two things you can do :-)

  1. Nothing if it works for you - I'm aware of this error message and I will find a workaround.
  2. Open a new issue if it doesn't work for you.