Something adds a ppp0 route to the gateway making the connection fail

pedrocr commented 4 years ago

I configured a VPN using Ubuntu 19.10 and network-manager-l2tp 1.2.10. After struggling with enabling the correct phase1 and phase2 settings I now get a connection. That connection gets torn down after less than a minute.

Looking at the routing table I see that something adds a ppp0 route to the IP of the gateway I am connecting to. That breaks communication with the gateway, because the ppp0 route takes precedence over the route through the wireless interface that the connection needs in order to keep working. Any ideas of what could be happening?

pedrocr commented 4 years ago

This seems like the same issue as #22 but my routing config is much simpler. It's just a simple wireless connection. Deleting that route makes the VPN not fail.
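For anyone wanting to check whether they are hitting the same thing, here is a minimal sketch that filters `ip route list` output for host routes to the VPN gateway that point at a ppp interface. The function name `find_gateway_ppp_routes` is made up for illustration, and the gateway address in the usage comment is only an example.

```bash
# Print every route (read from stdin, in `ip route list` format) whose
# destination is the VPN gateway itself and whose device is a ppp interface.
# $1: VPN gateway IP address
find_gateway_ppp_routes() {
    local gateway="$1"
    awk -v gw="$gateway" '
        ($1 == gw || $1 == (gw "/32")) {
            # Look for "dev pppN" anywhere after the destination field.
            for (i = 2; i < NF; i++) {
                if ($i == "dev" && $(i + 1) ~ /^ppp[0-9]+$/) { print; break }
            }
        }
    '
}

# Usage on a live system (example address, not run here):
#   ip route list | find_gateway_ppp_routes 87.117.247.187
```

If this prints a line while the VPN is up, the gateway is being routed through the tunnel itself, which matches the symptom described above.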

dkosovic commented 4 years ago

Using uk.freel2tpvpn.com (i.e. 87.117.247.187) from https://www.freel2tpvpn.com/ , I have the following routes:

Before VPN connection:

$ ip route list
default via 192.168.0.1 dev eno1 proto dhcp metric 100 
192.168.0.0/24 dev eno1 proto kernel scope link src 192.168.0.74 metric 100

After VPN connection:

$ ip route list
default dev ppp0 proto static scope link metric 50 
default via 192.168.0.1 dev eno1 proto dhcp metric 100 
1.0.0.1 dev ppp0 proto kernel scope link src 10.20.0.10 metric 50 
87.117.247.187 via 192.168.0.1 dev eno1 proto static metric 100 
169.254.0.0/16 dev ppp0 scope link metric 1000 
192.168.0.0/24 dev eno1 proto kernel scope link src 192.168.0.74 metric 100 
192.168.0.1 dev eno1 proto static scope link metric 100

dkosovic commented 4 years ago

Similarly with other L2TP/IPsec servers I connect to including my workplace, there is no route that is equivalent to the following:

route add <IP Address of Gateway> ppp0

puleglot commented 4 years ago

I have the same issue. It looks like this happens when the VPN gateway address and the internal point-to-point address are the same:

pppd[1834278]: local  IP address YY.YY.YY.YY
pppd[1834278]: remote IP address XX.XX.XX.XX
...
Data: VPN Gateway: XX.XX.XX.XX
Data: Tunnel Device: "ppp0"
Data: IPv4 configuration:
Data:   Internal Address: YY.YY.YY.YY
Data:   Internal Prefix: 32
Data:   Internal Point-to-Point Address: XX.XX.XX.XX
Data:   Static Route: XX.XX.XX.XX/32   Next Hop: 0.0.0.0
Data:   Internal DNS: ..................
Data:   Internal DNS: ..................
Data:   DNS Domain: '(none)'
Data: No IPv6 configuration
VPN plugin: state changed: started (4)

pedrocr commented 4 years ago

That's very interesting. Seems like a weird VPN config if "Internal Point-to-Point Address" is the gateway address but maybe not invalid.

So is it pppd doing the wrong thing then? Or are those lines from something else?

puleglot commented 4 years ago

> So is it pppd doing the wrong thing then?

I think pppd accepts this address from the remote peer via IPCP and then NetworkManager adds a route to it. This is a VPN on some Cisco hardware btw, and Windows clients connect without problems.

puleglot commented 4 years ago

OK, this is what happens on my system after connecting:

$ ip r l
...
195.XX.XX.XX dev ppp0 proto kernel scope link src 10.3.3.33 metric 50 
195.XX.XX.XX via 192.168.1.1 dev p9p1 proto static metric 100 
$ ip a l
...
12: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1400 qdisc fq_codel state UNKNOWN group default qlen 3
    link/ppp 
    inet 10.3.3.33 peer 195.XX.XX.XX/32 scope global ppp0
       valid_lft forever preferred_lft forever

where 195.XX.XX.XX is the same address everywhere and it is also a VPN gateway address (and LNS address in terms of l2tp).
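The condition described here (the ppp0 peer address being the same as the VPN gateway) can be checked from `ip addr` output. This is only a sketch: `ppp_peer_address` and `peer_matches_gateway` are hypothetical helper names, and the addresses in the usage comment are placeholders.

```bash
# Read `ip addr show dev ppp0` output on stdin and print the peer address
# of the first inet line that has one (without the prefix length).
ppp_peer_address() {
    awk '
        $1 == "inet" {
            for (i = 2; i < NF; i++) {
                if ($i == "peer") {
                    addr = $(i + 1)
                    sub(/\/.*$/, "", addr)   # drop the /32 prefix length
                    print addr
                    exit
                }
            }
        }
    '
}

# Succeed if the peer address found on stdin equals the gateway in $1.
peer_matches_gateway() {
    [ "$(ppp_peer_address)" = "$1" ]
}

# Usage on a live system (placeholder gateway address, not run here):
#   ip addr show dev ppp0 | peer_matches_gateway 195.0.2.10 && echo "peer is the gateway"
```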

pedrocr commented 4 years ago

Looks exactly like what happens with this VPN as well. Doesn't the VPN fail in less than a minute from this? At least here this makes the gateway not accessible and the link tears down after only a little bit.

puleglot commented 4 years ago

Yes, I need to remove that bad route, otherwise the VPN dies.

pedrocr commented 4 years ago

Seems like exactly the same issue then. See the /etc/network/if-up.d/ workaround to do it automatically if you haven't already.

So is this an l2tp issue or a general network-manager issue then?

puleglot commented 4 years ago

The following patch works for me. Maybe this can be made optional via a checkbox in GUI, something like "Ignore remote peer address"?

diff --git a/src/nm-l2tp-pppd-plugin.c b/src/nm-l2tp-pppd-plugin.c
index b7912e9..ab79772 100644
--- a/src/nm-l2tp-pppd-plugin.c
+++ b/src/nm-l2tp-pppd-plugin.c
@@ -173,11 +173,7 @@ nm_ip_up (void *data, int arg)
     * address up, at which point prefer the local options remote address,
     * and if that's not right, use the made-up address as a last resort.
     */
-   if (peer_opts.hisaddr && (peer_opts.hisaddr != pppd_made_up_address)) {
-       g_variant_builder_add (&builder, "{sv}",
-                              NM_VPN_PLUGIN_IP4_CONFIG_PTP,
-                              g_variant_new_uint32 (peer_opts.hisaddr));
-   } else if (opts.hisaddr) {
+   if (opts.hisaddr) {
        g_variant_builder_add (&builder, "{sv}",
                               NM_VPN_PLUGIN_IP4_CONFIG_PTP,
                               g_variant_new_uint32 (opts.hisaddr));

This is what the connection looks like after the patch:

18: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1400 qdisc fq_codel state UNKNOWN group default qlen 3
    link/ppp 
    inet 10.3.3.175/32 brd 10.3.3.175 scope global noprefixroute ppp0
       valid_lft forever preferred_lft forever

and it works fine.

pedrocr commented 4 years ago

is there any situation where this shouldn't be done? Seems like having an option for "bork my connection if the gateway asks".

puleglot commented 4 years ago

> is there any situation where this shouldn't be done?

I think there are configurations where remote peer address acts as a gateway. However in my case I can just add routes like ip r add XX.XX.XX.XX dev ppp0 and they will work.

pedrocr commented 4 years ago

If the remote peer address that is supposed to act as the gateway is the same one that the VPN connects to in the first place that will never work.

dkosovic commented 4 years ago

I'm not sure about adding a GUI option. Ideally it would sit with the rest of the routing options in the IPv4 settings, but this VPN plug-in has no way of modifying those GUI widgets. The next likely place would be the PPP Options dialog box. The next question is how to get the GUI setting to the nm-l2tp-pppd-plugin via the existing D-Bus interface.

Any support for GUI changes would have to be backwards compatible with other GUI frontends (e.g. KDE plasma-nm, deepin, etc).

Annoyingly, all the other PPP NetworkManager VPN implementations have practically the same code for the nm_ip_up() function which adds the route, even NetworkManager itself, e.g.:

https://github.com/NetworkManager/NetworkManager/blob/master/src/ppp/nm-pppd-plugin.c#L197

pedrocr commented 4 years ago

I don't think a GUI option is needed. Just not adding the route if the IPs are the same should always be the correct behavior. Unless I'm missing something.

dkosovic commented 4 years ago

The problem is that the nm-l2tp-pppd-plugin doesn't know what the gateway address is (as far as I can tell), which makes it difficult to compare it to the server-side PTP address. So the question is how to get the gateway address to the nm-l2tp-pppd-plugin via the existing D-Bus interface or some other means.

I'm still thinking about it, but will be happy for any patch or GitHub pull request.

dkosovic commented 4 years ago

Commit https://github.com/nm-l2tp/NetworkManager-l2tp/commit/95fdaa6dc8348ba7f63bcf7aa2ccc95b762c491d should hopefully fix this issue.

Extract from the pppd man page :

<local_IP_address>:<remote_IP_address> Set the local and/or remote interface IP addresses. Either one may be omitted. ... The remote address will be obtained from the peer if not specified in any option. Thus, in simple cases, this option is not required. If a local and/or remote IP address is specified with this option, pppd will not accept a different value from the peer in the IPCP negotiation, unless the ipcp-accept-local and/or ipcp-accept-remote options are given, respectively.

ipcp-accept-remote With this option, pppd will accept the peer's idea of its (remote) IP address, even if the remote IP address was specified in an option.

I used :<gateway_IP_address> and ipcp-accept-remote in the generated ppp options file so that <gateway_IP_address> could be passed to nm-l2tp-pppd-plugin, where I could make sure the broken route to the <gateway_IP_address> is no longer added.

I deleted some comments in this issue as they weren't adding anything and to make it easier to read the actual issue.

This fix will be in the new NetworkManager-l2tp 1.8.4 which I hope to release within the next couple of weeks.

tukusejssirs commented 2 years ago

@dkosovic, I am experiencing this issue too (additional route). I use Arch Linux, however, networkmanager-l2tp is out of date (1.20.0-3), therefore I wanted to test it out using the PKGBUILD below.

PKGBUILD

```bash
pkgname=networkmanager-l2tp-1.20.4
pkgver=1.20.4
_pppver=2.4.9
pkgrel=2
pkgdesc='L2TP support for NetworkManager'
arch=(x86_64)
url='https://github.com/nm-l2tp/NetworkManager-l2tp'
license=(GPL2)
depends=(libnma libsecret openssl "ppp=$_pppver" xl2tpd) # aur
makedepends=(intltool python git)
optdepends=('strongswan: IPSec support')
provides=("networkmanager-l2tp=$pkgver")
conflicts=(networkmanager-l2tp networkmanager-l2tp-git)
source=("git+$url#tag=1.20.4")
sha256sums=('SKIP')

prepare() {
  ln -sf NetworkManager-l2tp $pkgname
  cd $pkgname
  NOCONFIGURE=1 ./autogen.sh
}

build() {
  cd $pkgname
  ./configure \
    --libexecdir=/usr/lib/NetworkManager \
    --localstatedir=/var \
    --prefix=/usr \
    --sysconfdir=/etc \
    --with-pppd-plugin-dir=/usr/lib/pppd/$_pppver
  sed -i -e 's/ -shared / -Wl,-O1,--as-needed\0/g' libtool
  make
}

package() {
  make -C $pkgname DESTDIR="$pkgdir" install
  install -Dm644 $pkgname/NEWS "$pkgdir/usr/share/doc/$pkgname/NEWS"
}
```

The additional route ($vpn_ip dev ppp0 proto kernel scope link src $remote_ip) is still created.

Is this issue fixed? Is there anything else I am missing?

Thanks! :pray:

Update: How to use :<gateway_IP_address> within nmcli? Is there a way to configure this for NetworkManager connections?

dkosovic commented 2 years ago

For the Arch Linux PKGBUILD (and unrelated to the routing issue) I would recommend the following changes :

- Add the --with-gtk4 configure switch; otherwise connections cannot be edited with the GTK4-based gnome-control-center.
- Add the --enable-libreswan-dh2 configure switch, as the AUR libreswan package is built with DH2 (modp1024) support.
- Remove intltool from the make depends, as it hasn't been required since commit https://github.com/nm-l2tp/NetworkManager-l2tp/commit/219123de661dc16eb4f79b7ca7476fdc63a839d3

NetworkManager 1.36 had a major overhaul of the way it handles IP configurations internally, including routing.

The issue you are experiencing is not the same, but similar: the original was for a ppp0 metric 50 route, while the current one concerns a ppp0 metric 0 (or no metric) route, as described in the following NetworkManager >= 1.36 issue:

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/961

I originally closed this issue with commit https://github.com/nm-l2tp/NetworkManager-l2tp/commit/95fdaa6dc8348ba7f63bcf7aa2ccc95b762c491d (and there was a code clean-up commit https://github.com/nm-l2tp/NetworkManager-l2tp/commit/906c0a4923515f967ff2555d815bb62c8745d6e0), but now think this was the completely wrong thing to do as all it did was prevent telling NetworkManager what the peer-to-peer host was (i.e. NM_VPN_PLUGIN_IP4_CONFIG_PTP). The routing fix should have been in NetworkManager code where the route shouldn't have been added if NM_VPN_PLUGIN_IP4_CONFIG_PTP and NM_VPN_PLUGIN_CONFIG_EXT_GATEWAY were the same.

Unfortunately if upstream NetworkManager incorporates a fix to not add the route if NM_VPN_PLUGIN_IP4_CONFIG_PTP and NM_VPN_PLUGIN_CONFIG_EXT_GATEWAY are the same, I will need to release a new NetworkManager-l2tp that undoes the commit that originally fixed this issue.

As a workaround, you can delete the spurious ppp0 metric 0 route in /etc/ppp/ip-up.d/01-routes.sh using the ip route del command, much like the ip route add is doing in the following example : https://wiki.archlinux.org/title/PPTP_Client#Split_Tunneling

Sorry I'm not sure I understand your :<gateway_IP_address> within nmcli question.

dkosovic commented 2 years ago

Actually, I suspect you won't be able to use an ip route del line in /etc/ppp/ip-up.d/01-routes.sh to delete the problematic route, as it might be too early to do so at that point. Sorry, I don't have any other suggestions.

tukusejssirs commented 2 years ago

Thanks, @dkosovic, for your help! :pray:

Is --enable-libreswan-dh2 needed when I use strongswan?

> The issue you are experiencing is not the same, but similar

Oops, sorry then for raising this issue here.

My problem lies in the additional route: unless I delete it, I can’t connect to the VPN network, nor to the Internet (or at least the Internet connection is severely limited). If I run ip r del $vpn_ip right after bringing up the VPN connection (nmcli c up vpn), everything works as expected.

# Note: IPs were redacted.
default dev ppp0 proto static scope link metric 50
default via 192.168.1.1 dev wlan0 proto dhcp metric 60
192.168.1.0/24 dev wlan0 proto kernel scope link src 192.168.1.31 metric 60
192.168.1.1 dev wlan0 proto static scope link metric 50
1.2.3.4 dev ppp0 proto kernel scope link src 4.3.2.1  # <<<
1.2.3.4 via 192.168.1.1 dev wlan0 proto static metric 50

I think I should add a comment about my issue on NetworkManager issue tracker.
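The manual steps described above (nmcli c up vpn followed by ip r del $vpn_ip) could be wrapped in a small helper. This is a rough sketch, not a fix: the function name is made up, the interface is assumed to be ppp0 unless given, and it needs to run with enough privileges to change routes.

```bash
# Bring up a NetworkManager VPN connection, then remove the host route to the
# VPN gateway that points at the ppp interface.
# $1: connection name, $2: VPN gateway IP, $3: ppp interface (assumed ppp0)
vpn_up_without_gateway_route() {
    local con="$1" gateway="$2" dev="${3:-ppp0}"

    nmcli connection up "$con" || return 1

    # Only delete the route if it is actually present, so the call is harmless
    # on setups that never get the extra route.
    if ip route show "$gateway" dev "$dev" | grep -q .; then
        ip route del "$gateway" dev "$dev"
    fi
}

# Usage (as root, placeholder values, not run here):
#   vpn_up_without_gateway_route vpn 1.2.3.4
```

The route via the physical interface (wlan0 in the listing above) is left alone, since the delete is restricted to the ppp device.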

dkosovic commented 2 years ago

The --enable-libreswan-dh2 configure switch was just a general comment about issues I saw with the PKGBUILD, it along with the others have zero impact if you are using strongswan.

Does strongswan just work for you with networkmanager >= 1.36? You didn't need to stop loading the broken strongswan plugins as described in the following?

tukusejssirs commented 2 years ago

Well, it didn’t until this week, because this week I found somewhere on the Internet that removing the extra route (the one without explicit metric) allows me to connect to the VPN network and to the Internet.

I created a NM connection like this if it matters:

nmcli c add con-name "$con_name" type vpn vpn-type l2tp vpn.data 'gateway=1.2.3.4, ipsec-enabled=yes, ipsec-psk=p5kp4ss, password-flags=0, user=username' vpn.secrets 'password=us3rp4ss'

I didn’t configure anything related to NM, but I use EndeavourOS (Atlantis; installed in December 2021), so it could set up something. /etc/NetworkManager/NetworkManager.conf is empty and there are no files in /etc/NetworkManager/conf.d/.

Current versions I use:

networkmanager@1.36.4-1;
networkmanager-l2tp@1.20.4 (installed using the custom PKGBUILD I shared with you; however, it worked with networkmanager-l2tp@1.20.0-3, but I wanted to know if v1.20.4 would remove the requirement of the extra route deletion);
strongswan@5.9.5-1.

If you need to know anything else, I’ll provide any information you need. :wink:

dkosovic commented 2 years ago

@tukusejssirs , thanks for all that and appreciate that you added a comment upstream to the NetworkManager routing issue.

I've never used nmcli like that!

I totally forgot that, with the strongswan plugin issue, I had to enable the "Use this connection only for resources on its network" checkbox in the IPv4 settings (which corresponds to the ipv4.never-default setting) to reproduce the issue. So I think that explains why you weren't impacted by it.

I'll leave this issue open in case others are affected by the extra route with NetworkManager >= 1.36.

tukusejssirs commented 2 years ago

@dkosovic, do you think there is a NM config not to create the extra route? Like one of the following:

ipv4.routes:                            --
ipv4.route-metric:                      -1
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --
ipv4.ignore-auto-routes:                no
IP4.ROUTE[1]:                           dst = 0.0.0.0/0, nh = 0.0.0.0, mt = 50

Here is full config with IPs and passwords redacted:

VPN config

```bash
connection.id: vpn_name
connection.uuid: fd71468c-19c6-4276-95bb-e15b055012c3
connection.stable-id: --
connection.type: vpn
connection.interface-name: --
connection.autoconnect: yes
connection.autoconnect-priority: 0
connection.autoconnect-retries: -1 (default)
connection.multi-connect: 0 (default)
connection.auth-retries: -1
connection.timestamp: 1652266868
connection.read-only: no
connection.permissions: --
connection.zone: --
connection.master: --
connection.slave-type: --
connection.autoconnect-slaves: -1 (default)
connection.secondaries: --
connection.gateway-ping-timeout: 0
connection.metered: unknown
connection.lldp: default
connection.mdns: -1 (default)
connection.llmnr: -1 (default)
connection.dns-over-tls: -1 (default)
connection.wait-device-timeout: -1
ipv4.method: auto
ipv4.dns: --
ipv4.dns-search: --
ipv4.dns-options: --
ipv4.dns-priority: 0
ipv4.addresses: --
ipv4.gateway: --
ipv4.routes: --
ipv4.route-metric: -1
ipv4.route-table: 0 (unspec)
ipv4.routing-rules: --
ipv4.ignore-auto-routes: no
ipv4.ignore-auto-dns: no
ipv4.dhcp-client-id: --
ipv4.dhcp-iaid: --
ipv4.dhcp-timeout: 0 (default)
ipv4.dhcp-send-hostname: yes
ipv4.dhcp-hostname: --
ipv4.dhcp-fqdn: --
ipv4.dhcp-hostname-flags: 0x0 (none)
ipv4.never-default: no
ipv4.may-fail: yes
ipv4.required-timeout: -1 (default)
ipv4.dad-timeout: -1 (default)
ipv4.dhcp-vendor-class-identifier: --
ipv4.dhcp-reject-servers: --
ipv6.method: auto
ipv6.dns: --
ipv6.dns-search: --
ipv6.dns-options: --
ipv6.dns-priority: 0
ipv6.addresses: --
ipv6.gateway: --
ipv6.routes: --
ipv6.route-metric: -1
ipv6.route-table: 0 (unspec)
ipv6.routing-rules: --
ipv6.ignore-auto-routes: no
ipv6.ignore-auto-dns: no
ipv6.never-default: no
ipv6.may-fail: yes
ipv6.required-timeout: -1 (default)
ipv6.ip6-privacy: -1 (unknown)
ipv6.addr-gen-mode: stable-privacy
ipv6.ra-timeout: 0 (default)
ipv6.dhcp-duid: --
ipv6.dhcp-iaid: --
ipv6.dhcp-timeout: 0 (default)
ipv6.dhcp-send-hostname: yes
ipv6.dhcp-hostname: --
ipv6.dhcp-hostname-flags: 0x0 (none)
ipv6.token: --
vpn.service-type: org.freedesktop.NetworkManager.l2tp
vpn.user-name: --
vpn.data: gateway = 1.2.3.4, ipsec-enabled = yes, ipsec-psk = psk, password-flags = 0, user = pass
vpn.secrets:
vpn.persistent: no
vpn.timeout: 0
proxy.method: none
proxy.browser-only: no
proxy.pac-url: --
proxy.pac-script: --
GENERAL.NAME: vpn_name
GENERAL.UUID: fd71468c-19c6-4276-95bb-e15b055012c3
GENERAL.DEVICES: wlan0
GENERAL.IP-IFACE: wlan0
GENERAL.STATE: activated
GENERAL.DEFAULT: yes
GENERAL.DEFAULT6: no
GENERAL.SPEC-OBJECT: /org/freedesktop/NetworkManager/ActiveConnection/1
GENERAL.VPN: yes
GENERAL.DBUS-PATH: /org/freedesktop/NetworkManager/ActiveConnection/13
GENERAL.CON-PATH: /org/freedesktop/NetworkManager/Settings/8
GENERAL.ZONE: --
GENERAL.MASTER-PATH: /org/freedesktop/NetworkManager/Devices/3
IP4.ADDRESS[1]: 4.3.2.1/24
IP4.ADDRESS[2]: 4.3.2.1/24
IP4.GATEWAY: 0.0.0.0
IP4.ROUTE[1]: dst = 0.0.0.0/0, nh = 0.0.0.0, mt = 50
IP6.GATEWAY: --
VPN.TYPE: l2tp
VPN.USERNAME: username
VPN.GATEWAY: 1.2.3.4
VPN.BANNER: --
VPN.VPN-STATE: 5 - VPN connected
VPN.CFG[1]: gateway = 1.2.3.4
VPN.CFG[2]: ipsec-enabled = yes
VPN.CFG[3]: ipsec-psk = vpn
VPN.CFG[4]: password-flags = 0
VPN.CFG[5]: user = pass
```

dkosovic commented 2 years ago

Enabling ipv4.ignore-auto-routes should prevent that extra route from being created, but you will probably need to add the missing routes that are normally added automatically.

In the following issue, ipv4.ignore-auto-routes got turned on with NetworkManager 1.36.0 to 1.36.2 when a connection was edited or added and it caused issues for some :

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/946#note_1350911

tukusejssirs commented 2 years ago

I have re-created the NM connection for this VPN a couple of times this week, therefore ipv4.ignore-auto-routes is not turned on for me. :man_shrugging:

As for turning it to on and to add the routes myself: from my point of view, it does not matter if I remove a single route or add another one. For me a solution would not require any changes to the routes after turning on the NM connection. It does not matter if I configure something else (NM, NM-l2tp, …).

What is ipv4.routing-rules for? Could I pre-define the routes there and enable ipv4.ignore-auto-routes? Could that work around this issue?

dkosovic commented 2 years ago

Sorry, I didn't mean /etc/network/if-up.d; I meant adding the routes in /etc/ppp/ip-up.d/01-routes.sh, where you can also add conditional case statements like the following:

#!/bin/bash

# This script is called with the following arguments:
# Arg Name
# $1 Interface name
# $2 The tty
# $3 The link speed
# $4 Local IP address for the interface
# $5 Peer IP address
# $6 Optional 'ipparam' parameter specified to pppd

case "$5" in
    172.16.244.*)
        ip route add 172.16.244.0/24 dev $1
    ;;
esac

tukusejssirs commented 2 years ago

Thanks, @dkosovic, for the clues! :pray:

It was quite easy to work around this issue once I found the docs and tried a few configurations. ipv4.routes accepts a comma-separated list of route definitions (src). For me, it was enough to provide ipv4.routes "$vpn_ip/$cidr $router_ip", as the VPN server has a static IP address and I don’t need any special config. You might want to list all the server IPs if it is a finite list and all of them are static.

Also note that setting ipv4.ignore-auto-routes yes didn’t do anything when I set it along with ipv4.routes (the route specified in ipv4.routes was simply added to the default routes, but with one distinction: instead of my physical ifname like wlan0 or eth0, the route added by ipv4.routes had ppp0, which didn’t work).

So, the following command works for me:

nmcli c add con-name "$con_name" type vpn vpn-type l2tp \
  vpn.data "gateway=$vpn_ip, ipsec-enabled=yes, ipsec-psk=$psk, password-flags=0, user=$user" \
  vpn.secrets "password=$pass" ipv4.routes "$vpn_ip/$cidr $router_ip" ipv4.never-default yes

Update

I rejoiced too soon. I enabled the wrong connection, therefore the command above still produces the default routes and the one with ifname set to ppp0.

$vpn_ip dev ppp0 proto kernel scope link src $remote_ip
$vpn_ip via $router_ip dev wlan0 proto static metric 50  # only this one is needed
$vpn_ip via $router_ip dev ppp0 proto static metric 50  # added by `ipv4.routes`

mergeMarc commented 2 years ago

Following the upstream issue, I was able to create a workaround script that keeps that wrong route from being created (while still using ipv4.ignore-auto-routes no):

I created the file /etc/ppp/ip-up.d/0001routes:

#!/bin/sh

# This script is called with the following arguments:
# Arg Name
# $1 Interface name
# $2 The tty
# $3 The link speed
# $4 Local IP address for the interface
# $5 Peer IP address
# $6 Optional 'ipparam' parameter specified to pppd

logger Removing wrong vpn ip address on $1 for local $4 and peer $5.
ip addr del $4 peer $5 dev $1

In my case it takes about 30 seconds after that script runs until the local IP address on "ppp0" is up and the VPN is working. I'm on Ubuntu 22.

tukusejssirs commented 2 years ago

@dkosovic, I am not sure if I should open a new issue …

Using networkmanager@1.40.0-1, networkmanager-l2tp-git@1.20.4.r1.g6d872e0-1 and strongswan@5.9.8-1 fails to connect to IPSec VPN. nmcli c up vpn outputs Error: Connection activation failed: The VPN service failed to start and journalctl contains the following logs:

Oct 21 12:12:03 hostname NetworkManager[780]: <info>  [1666347123.0495] vpn[0x55570163e980,4ecb50e8-337d-4123-9643-d98ee724343e,"vpn_charvat_bardejov"]: starting l2tp
Oct 21 12:12:13 hostname NetworkManager[780]: <warn>  [1666347133.0616] vpn[0x55570163e980,4ecb50e8-337d-4123-9643-d98ee724343e,"vpn_charvat_bardejov"]: failed to connect: 'Timeout was reached'

I have no idea what’s going on. There was no change on the server-side nor in the config. It works on MS Windows. It used to work (albeit with additional ppp0 route), now it does not work at all.

dkosovic commented 2 years ago

@tukusejssirs, I've just installed Fedora 37 in a VM, and have NetworkManager-1.40.0-1.fc37, strongswan-5.9.8-1.fc37 and used the current NetworkManager-l2tp code in this repository, so similar versions to what you are using, but I'm still able to connect.

I'm not sure what the issue you are having, might need to see more of the log output.

tukusejssirs commented 2 years ago

> @tukusejssirs, I've just installed Fedora 37 in a VM

@dkosovic, I use Arch Linux with Sway if that matters.

> I'm not sure what the issue you are having, might need to see more of the log output.

I can get you the logs, however, could you tell me how to access the logs you’re talking about please?

dkosovic commented 2 years ago

@tukusejssirs might be best to follow up in a new issue as I suspect the issue you have might not be related to this issue

The following will provide useful logs :

sudo journalctl --no-hostname _SYSTEMD_UNIT=NetworkManager.service + SYSLOG_IDENTIFIER=pppd

Arch Linux builds a number of experimental strongswan plugins that can be problematic; try disabling them with:

sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/bypass-lan.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/connmark.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/forecast.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/sha3.conf

You will also need to reboot as kernel modules used by some of the strongswan plugins might also be loaded.
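To double-check afterwards that none of the four plugins is still set to load, something like the following could be used. It is a sketch only: the function name is invented, and the default directory is the one used in the sed commands above.

```bash
# Print the names of the four plugins from this thread whose config file
# still contains "load = yes". Missing config files are silently skipped.
# $1: directory holding the charon plugin configs
enabled_problem_plugins() {
    local dir="${1:-/etc/strongswan.d/charon}" p
    for p in bypass-lan connmark forecast sha3; do
        if grep -Eq '^[[:space:]]*load[[:space:]]*=[[:space:]]*yes' "$dir/$p.conf" 2>/dev/null; then
            echo "$p"
        fi
    done
}

# Usage (not run here):
#   enabled_problem_plugins    # no output means all four are disabled or absent
```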

tukusejssirs commented 1 year ago

I have just noticed that removing a route is not required anymore. I have no idea what fixed the issue though.

Currently, when I connect to VPN, it adds the following routes (ip r):

default dev ppp0 proto static scope link metric 50
1.2.3.4 dev ppp0 proto kernel scope link src 3.4.5.6
192.168.1.1 dev wlan0 proto static scope link metric 50
2.3.4.5 via 192.168.1.1 dev wlan0 proto static metric 50

Arch Linux;
nmcli@1.40.8-2;
networkmanager@1.40.8-2;
networkmanager-l2tp-git@1.20.6.r3.g6483691-1;
networkmanager-openvpn@1.10.2-1;
strongswan@5.9.8-2.

dkosovic commented 1 year ago

> I have just noticed that removing a route is not required anymore. I have no idea what fixed the issue though. ...
>
> Arch Linux;
>
> nmcli@1.40.8-2;
>
> networkmanager@1.40.8-2;
>
> networkmanager-l2tp-git@1.20.6.r3.g6483691-1;

For the benefit of others, networkmanager-l2tp-git@1.20.6.r3.g6483691-1 refers to commit https://github.com/nm-l2tp/NetworkManager-l2tp/commit/6483691437684d66e881ddf7d8d0325a2fb476a9 (which is now included in the recently released NetworkManager-l2tp 1.20.8). That commit reversed the routing workaround based on https://github.com/nm-l2tp/NetworkManager-l2tp/issues/132#issuecomment-606811912 in this issue (but modified slightly to conditionally not provide NetworkManager with the NM_VPN_PLUGIN_IP4_CONFIG_PTP value if it was the same as the External Gateway). That workaround stopped working from NetworkManager 1.36 onwards.

@tukusejssirs are you sure you don't have a modified pppd script lying around which removes the problematic route?

I was thinking I was going to have to modify the NetworkManager source code to not add the kernel generated route to the Ext GW if NM_VPN_PLUGIN_IP4_CONFIG_PTP and NM_VPN_PLUGIN_CONFIG_EXT_GATEWAY were the same value. In particular the following NetworkManager source file :

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/blob/main/src/core/vpn/nm-vpn-connection.c

tukusejssirs commented 1 year ago

> are you sure you don't have a modified pppd script lying around which removes the problematic route?

No, I didn’t create any pppd config/script nor anything related.

In December 2022, I had to remove the extra route after connecting to the VPN and before SSH-ing into the server. Yesterday I tried doing that, but I could not connect to the server via SSH (connecting to the VPN works as expected). Then I tried not removing the extra route, and everything works as expected.

I’ve just tried to reproduce the error on the extra route removal, but after I remove the extra route, I cannot connect to the server via SSH (ssh -vvv user@ip hangs at set_sock_tos: set socket 3 IP_TOS 0x48 and then connection times out). When I keep the extra route, all is working as expected.

dkosovic commented 1 year ago

I've posted to the upstream NetworkManager issue# 946 that this routing issue might be resolved with NetworkManager-l2tp 1.20.8 and/or NetworkManager 1.40.8. Guess we'll get some feedback there if it resolves it for others.

tukusejssirs commented 1 year ago

@dkosovic, one thing I’ve noticed is that I lose my Internet connection (and sometimes the ability to connect to a server reachable via the VPN) while I am connected to the VPN and connected via SSH to a server reachable via the VPN. After I disconnect from the VPN, everything works as expected.

Moreover, it prefers the wlan0 interface, even though I set the ifname of the VPN nmcli connection to a wired network interface enp*. Actually, I have also tried the wired network interface available via a Thunderbolt/USB-C dock. The result: no Internet access whatsoever.

Maybe the VPN config is the culprit. Ideally, I want only the servers that are reachable solely via the VPN to be accessed through it; everything else should not go via the VPN. If that is not possible when using a single network interface (like wlan0 or enp*), it would be awesome to access everything except the servers accessible via the VPN through a different network interface.

When I change ipv4.route-metric to 200 or 80 (was 50), while wlan0 is 60 and enp* is 100, I can connect to the Internet as usual, but I cannot ping the servers available via the VPN nor use ssh to connect to them. I can use ping -I ppp0 $server_ip, however, I can’t make it work with ssh. I tried to use ssh -B ppp0 $server_ip and ssh -b $ip_address_from_ip_route $server_ip, but it does not work.

Any idea how to make it work? Ideally, without providing the interface name all the time.

Thanks for any clues! :pray:

Update

I have read this forum topic/question. I can only connect to the VPN when ipv4.never-default is set to no (this has always been the case). When I set it to yes, I cannot connect to a server behind the VPN via SSH. I always have ipv4.ignore-auto-routes set to no.

Update 2

Just a note: right after I connect to the VPN, I can access the Internet, get a list of open ports via nmap $server_assessible_via_vpn and connect to the server via SSH, however, after some time, I lose access to the Internet, nmap reports no open ports (but the server is reachable) and I cannot connect to the server via SSH when I lose the access to the Internet. I can still ping the server behind the VPN.
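One way to narrow this down might be to compare which interface the kernel picks for the server right after connecting and again once access is lost. The sketch below only parses `ip route get` output; the helper name `route_device` is made up and `$server_ip` is a placeholder.

```bash
# Read `ip route get <address>` output on stdin and print the outgoing
# interface (the word following "dev").
route_device() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on a live system (not run here):
#   ip route get "$server_ip" | route_device   # e.g. ppp0 or wlan0
#   ip route get 1.1.1.1 | route_device        # which interface general traffic takes
```

If the interface reported for the server changes between the working and the broken state, that points at the routing table being modified after the connection is established.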

dkosovic commented 1 year ago

Do you still have the 4 problematic strongswan plugins loading disabled? i.e. :

sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/bypass-lan.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/connmark.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/forecast.conf
sudo sed -i 's/load = yes/load = no/' /etc/strongswan.d/charon/sha3.conf

two of those strongswan plugins definitely affect routing only when doing split-tunnelling VPN with NetworkManager >= 1.36.

bypass-lan has known routing issues with Arch Linux and the wiki recommends not to load it :
- https://wiki.archlinux.org/title/StrongSwan#Troubleshooting
connmark module is intended to be used with xl2tpd (although guess it could be used with other L2TP daemons that support conmark), but Arch Linux doesn't apply the conmark xl2tpd patch, so not sure why the experimental connmark module is loaded by default :
- https://github.com/xelerance/xl2tpd/issues/82
sha3 hasn't been standardized for IKE (phase 1 algorithms) or ESP (phase 2 algorithms). Other Linux distributions don't load it.
- https://wiki.strongswan.org/issues/3421
forecast definitely affects routing.
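To double-check that the sed commands above took effect, something like the following could be used. It is only a sketch: `enabled_plugins` is a hypothetical helper, and it assumes each plugin config keeps its `load = ...` setting on a line of its own, as in the files the sed commands edit.

```bash
#!/usr/bin/env bash
# Print the name of each of the 4 plugins whose config still says "load = yes".
# The directory defaults to the path used by the sed commands above.
enabled_plugins() {
  local conf_dir="${1:-/etc/strongswan.d/charon}" p
  for p in bypass-lan connmark forecast sha3; do
    # -s: stay quiet if a config file does not exist on this system
    if grep -Eqs '^[[:space:]]*load[[:space:]]*=[[:space:]]*yes' "$conf_dir/$p.conf"; then
      echo "$p"
    fi
  done
}

# Usage: prints nothing when all four plugins are disabled
#   enabled_plugins
```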

On Arch Linux, if I didn't disable those strongswan plugins when split tunnelling VPN was used, I would have a stable VPN connection for a number of minutes until DHCP brought down the parent wired network.

I did an Arch Linux bug report last year for strongswan:

https://bugs.archlinux.org/task/74553

In that bug report I recommended the package() section of the strongswan PKGBUILD file contain the following so that the loading of the 4 problematic strongswan plugins is disabled:

# do not load certain plugins by default that are known to have problems
for p in bypass-lan connmark forecast sha3; do
sed -i 's/load = yes/load = no/' "${pkgdir}/etc/strongswan.d/charon/${p}.conf"
done

Guess it was never applied as no Arch Linux users confirmed it fixed their routing problem.

Not specific to Arch Linux: last week I helped someone with L2TP/IPsec issues who eventually got split tunnelling to work by routing the internal subnet through ppp0 and 0.0.0.0/0 through wlo1, see:

https://forums.linuxmint.com/viewtopic.php?f=52&t=388466

He mentions he had to disable IPv6 and change his client router subnet from 192.168.2.0/8 to something else.

tukusejssirs commented 1 year ago

Do you still have the 4 problematic strongswan plugins loading disabled?

Yes, I disabled them some months ago, based on your suggestion, and they have been disabled ever since. I have just double-checked it.

https://forums.linuxmint.com/viewtopic.php?f=52&t=388466

I’ve read all from top to bottom.

My notes:

I have disabled IPv6 by setting ipv6.method to disabled in the nmcli VPN connection, but it didn’t help anything.
- I tried this with ipv4.never-default set to yes, but while I could connect to the VPN, I could not ping anything behind the VPN nor SSH to any server. I had access to the Internet though.
- I want to test rebooting my machine and even disabling IPv6 in the kernel using sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1.
My VPN works as expected on Android (via built-in VPN manager), MS Windows 10+ and macOS (unknown version, current latest or previous) without issues. I haven’t compared the VPN settings of other platforms to those of Linux though.
As I understand it, changing the local router subnet to anything but 192.168.2.0/8 means that the local subnet should be different from the remote subnet (behind the VPN), which in my case is already different. That said, I dislike this requirement; however, I do not have enough knowledge to tell whether it is a hard requirement from a networking/IPsec point of view.
I tested open ports using nmap, which says that the ports are open for 1-1.5 minutes; then the ports are (seemingly) closed, and after that nmap returns some of the actually open ports every few seconds, but the list is not always the same (sometimes more, sometimes fewer ports are reported open; out of 11 ports actually open, 1, 2 or 4 were reported; it is very random; most often all ports are reported closed).
- If I connect to a server behind the VPN before nmap starts reporting nonsense (i.e. in the first few minutes), I can keep that SSH connection up. However, if I want to connect to the server after nmap reports nonsense, I can’t make the connection (I need to restart the VPN connection).
- I think that when nmap starts reporting nonsense, I lose the access to the Internet.
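The nmap observations above could be logged with timestamps to pin down when the results start to degrade. A minimal sketch, assuming nmap's greppable output (`-oG -`), in which open ports appear as `port/open/...`; `count_open_ports` is a hypothetical helper and `$server_ip` is a placeholder for a server behind the VPN:

```bash
#!/usr/bin/env bash
# Count "/open/" port entries in nmap greppable output read from stdin.
count_open_ports() {
  awk '{ n += gsub(/\/open\//, "&") } END { print n + 0 }'
}

# Usage: one line per scan, e.g. "12:34:56 11"
#   while true; do
#     printf '%s %s\n' "$(date +%T)" "$(nmap -oG - "$server_ip" | count_open_ports)"
#     sleep 10
#   done
```

Comparing the timestamps with the journal output for the same period might show whether the drop coincides with a route change or an IPsec rekey.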

dkosovic commented 1 year ago

Sorry I don't have any real suggestions.

I notice there are multiple L2TP kernel patches for the current kernel to fix race conditions:

https://lore.kernel.org/netdev/989d0a69c3f6c571e6bfc234a744d0183c4a269a.camel@redhat.com/T/

Maybe try a different kernel.

tukusejssirs commented 1 year ago

Thanks, @dkosovic, for trying to help me! :pray:

Okay, I tried to boot Fedora Workstation 37 Live from USB (not in a VM, but directly on my laptop) and test whether the VPN connection is better on Fedora than on Arch Linux. Note: I am okay with modifying the default Arch Linux kernel using DKMS and kernel parameters/flags, but I don’t really want to use a different kernel. Thus if it works on Fedora, I might consider switching back to Fedora.

However, it fails to connect. No connection is created, which is a worse situation than on Arch Linux. Note that I haven’t installed Fedora on the drive; I just tested it in live mode. I might install it somewhere and test it that way, but will it change anything? :thinking:

# Install dependencies
sudo dnf -y install xl2tpd strongswan NetworkManager NetworkManager-l2tp NetworkManager-l2tp-gnome

# Restart Network Manager service
sudo systemctl restart --now NetworkManager

# I check if `strongswan` service is disabled (it was)
systemctl status strongswan

# Add NM connection
nmcli c add con-name "$con_name" type vpn vpn-type l2tp \
  vpn.data "gateway=$vpn_ip, ipsec-enabled=yes, ipsec-psk=$psk, password-flags=0, user=$user" vpn.secrets "password=$pass"

# Connect to the VPN, which fails to connect (see `journalctl.log` at the end of this comment)
nmcli c up "$con_name"

# Disable IPv6 in the NM connection
nmcli c mod "$con_name" ipv6.method disabled

# Connect to the VPN, which fails to connect (same error as before disabling IPv6)
nmcli c up "$con_name"

journalctl.log

tukusejssirs commented 1 year ago

As I need to make this work ASAP, I have worked around the bug using the VPN Hotspot app (found via this website, which has another, non-root solution). The app requires a rooted Android phone (which I have) and is also available on F-Droid.

I simply needed to:

connect to the VPN on the phone;
create a hotspot (via Android settings or via VPN Hotspot);
enable Wi-Fi hotspot (in VPN Hotspot app);
also enable swlan0 in VPN Hotspot:
- don’t forget to do this, otherwise I could not ping/nmap/connect to servers behind the VPN;
- also, it might take a while (a few seconds) to be able to connect to the remote servers after swlan0 is enabled.

Basically, I use my phone as a proxy. This could be done differently, but we all have a phone at hand, so it is quite a practical solution from my point of view.

Nevertheless, I am still looking for a solution to make VPN connection fully working and stable on Arch Linux.

dkosovic commented 1 year ago

On Fedora, unlike other Linux distros, libreswan and strongswan can be installed at the same time. If both are installed, NetworkManager-l2tp defaults to libreswan. Your logs indicate you are using libreswan (i.e., the pluto daemon instead of strongswan's charon daemon).

libreswan no longer supports the really weak DH2 modp1024 algorithm (unless rebuilt with the DH2 switch) for the main mode (phase 1) and quick mode (phase 2) proposals. I suspect your VPN server is using modp1024 and it isn't proposing any stronger algorithms, so you get the no proposal chosen error.
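A rough way to check both points from the logs is sketched below. The helper names are hypothetical, and the patterns are assumptions about how the daemons label their messages (pluto for libreswan, charon for strongswan, and some spelling of "no proposal chosen"), so treat the result as a hint only.

```bash
#!/usr/bin/env bash
# Guess the IPsec daemon from log text on stdin:
# "pluto" suggests libreswan, "charon" suggests strongswan.
ipsec_daemon() {
  awk '/pluto/  { print "libreswan";  found = 1; exit }
       /charon/ { print "strongswan"; found = 1; exit }
       END { if (!found) print "unknown" }'
}

# Succeed if the log text on stdin mentions a "no proposal chosen" error
# in any of the common spellings (case-insensitive, space or underscore).
has_no_proposal() {
  grep -Eiq 'no[ _]proposal[ _]chosen'
}

# Usage against the journal of the current boot:
#   journalctl -b --no-pager | ipsec_daemon
#   journalctl -b --no-pager | has_no_proposal && echo "proposal mismatch"
```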

As you have strongswan already installed, you can switch to strongswan by removing libreswan with:

sudo rpm -e libreswan

On Fedora I also recommend unblacklisting the L2TP kernel modules, see:

https://github.com/nm-l2tp/NetworkManager-l2tp#issue-with-blacklisting-of-l2tp-kernel-modules

which says to do :

sudo sed -e '/blacklist l2tp_netlink/s/^b/#b/g' -i /etc/modprobe.d/l2tp_netlink-blacklist.conf
sudo sed -e '/blacklist l2tp_ppp/s/^b/#b/g' -i /etc/modprobe.d/l2tp_ppp-blacklist.conf
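To confirm the result of the two sed commands, the remaining active blacklist entries can be listed. A minimal sketch; `active_l2tp_blacklists` is a hypothetical helper and the default directory is the one used by the commands above:

```bash
#!/usr/bin/env bash
# Print every uncommented "blacklist l2tp_*" line found in *.conf files of
# a modprobe.d directory; prints nothing if no such line is left.
active_l2tp_blacklists() {
  local dir="${1:-/etc/modprobe.d}"
  # -h: no file names, -s: ignore unreadable or missing files
  grep -Ehs '^[[:space:]]*blacklist[[:space:]]+l2tp_' "$dir"/*.conf || true
}

# Usage:
#   active_l2tp_blacklists
```

After the sed commands, l2tp_netlink and l2tp_ppp should no longer appear in the output; entries for other L2TP modules may still be listed.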

tukusejssirs commented 1 year ago

Thanks, @dkosovic! Some notes on my new test on live Fedora 37:

I failed to remove libreswan (I tried to do so with dnf and rpm): it is not installed. It is good to know that pluto is used by libreswan and charon by strongswan.
It seems like all I had to do was unblacklist l2tp_netlink and l2tp_ppp, and now it seems to work. They were not blacklisted on Arch Linux (actually, nothing is blacklisted there, while on Fedora there are also l2tp_ip-blacklist.conf and l2tp_eth-blacklist.conf for l2tp_ip and l2tp_eth respectively; are these needed? what is their purpose?).
I didn’t have to disable IPv6, although that doesn’t hurt as I use only IPv4 (both locally and remotely).
Internet access seems to work as expected, although the split network (ipv4.never-default set to no) doubles the ping response time for me (on VDSL, I get a ping response of around 30 ms when not connected to the VPN, 50-80 ms on the split network and 150-200 ms on the merged network). Moreover, on the merged network I lose some packets (25-35 %).
I have searched for any config related to L2TP on Fedora and found that the following two config files are missing on Arch Linux. Are they needed? Should I create them?
- /etc/xl2tpd/xl2tpd.conf:
  - On Arch Linux, there is no such file, but there are some example configs for client and server.
    
    xl2tpd.conf
```ini
; This is a minimal sample xl2tpd configuration file for use
; with L2TP over IPsec.
;
; The idea is to provide an L2TP daemon to which remote Windows L2TP/IPsec
; clients connect. In this example, the internal (protected) network
; is 192.168.1.0/24. A special IP range within this network is reserved
; for the remote clients: 192.168.1.128/25
; (i.e. 192.168.1.128 ... 192.168.1.254)
;
; The listen-addr parameter can be used if you want to bind the L2TP daemon
; to a specific IP address instead of to all interfaces. For instance,
; you could bind it to the interface of the internal LAN (e.g. 192.168.1.98
; in the example below). Yet another IP address (local ip, e.g. 192.168.1.99)
; will be used by xl2tpd as its address on pppX interfaces.

[global]
; listen-addr = 192.168.1.98
;
; requires openswan-2.5.18 or higher - Also does not yet work in combination
; with kernel mode l2tp as present in linux 2.6.23+
; ipsec saref = yes
; Use refinfo of 22 if using an SAref kernel patch based on openswan 2.6.35 or
; when using any of the SAref kernel patches for kernels up to 2.6.35.
; saref refinfo = 30
;
; force userspace = yes
;
; debug tunnel = yes

[lns default]
ip range = 192.168.1.128-192.168.1.254
local ip = 192.168.1.99
require chap = yes
refuse pap = yes
require authentication = yes
name = LinuxVPNserver
ppp debug = yes
pppoptfile = /etc/ppp/options.xl2tpd
length bit = yes
```
- /etc/ppp/options.xl2tpd:
  - On Arch Linux, there is no such file.
    
    options.xl2tpd
```bash
ipcp-accept-local
ipcp-accept-remote
ms-dns 8.8.8.8
ms-dns 1.1.1.1
# ms-dns 192.168.1.1
# ms-dns 192.168.1.3
# ms-wins 192.168.1.2
# ms-wins 192.168.1.4
noccp
auth
#obsolete: crtscts
idle 1800
mtu 1410
mru 1410
nodefaultroute
debug
#obsolete: lock
proxyarp
connect-delay 5000
# To allow authentication against a Windows domain EXAMPLE, and require the
# user to be in a group "VPN Users". Requires the samba-winbind package
# require-mschap-v2
# plugin winbind.so
# ntlm_auth-helper '/usr/bin/ntlm_auth --helper-protocol=ntlm-server-1 --require-membership-of="EXAMPLE\\VPN Users"'
# You need to join the domain on the server, for example using samba:
# http://rootmanager.com/ubuntu-ipsec-l2tp-windows-domain-auth/setting-up-openswan-xl2tpd-with-native-windows-clients-lucid.html
```

New installation steps that work on Fedora 37 (without issue) and Arch Linux (with the issue that I cannot nmap the servers behind the VPN after some time, like after four minutes).

Steps that work for me

```bash
# Install dependencies
sudo dnf -y install xl2tpd strongswan NetworkManager NetworkManager-l2tp NetworkManager-l2tp-gnome

# Remove `libreswan`
sudo dnf -y remove libreswan
# sudo rpm -e libreswan

# Unblacklist the L2TP kernel modules
# Permanently
sudo sed -e '/blacklist l2tp_netlink/s/^b/#b/g' -i /etc/modprobe.d/l2tp_netlink-blacklist.conf
sudo sed -e '/blacklist l2tp_ppp/s/^b/#b/g' -i /etc/modprobe.d/l2tp_ppp-blacklist.conf
# Temporarily
sudo modprobe l2tp_ppp

# Restart Network Manager service
sudo systemctl restart --now NetworkManager

# I check if `strongswan` service is disabled (it was)
systemctl status strongswan

# Add NM connection
# 'merged' network
nmcli c add con-name "$con_name" type vpn vpn-type l2tp vpn.data "gateway=$vpn_ip, ipsec-enabled=yes, ipsec-psk=$psk, password-flags=0, user=$user" vpn.secrets "password=$pass" ipv6.method disabled
# 'split' network
nmcli c add con-name "$con_name" type vpn vpn-type l2tp vpn.data "gateway=$vpn_ip, ipsec-enabled=yes, ipsec-psk=$psk, password-flags=0, user=$user" vpn.secrets "password=$pass" ipv6.method disabled ipv4.never-default no ipv4.ignore-auto-routes no

# Connect to the VPN, which fails to connect (see `journalctl.log`)
nmcli c up "$con_name"
```

dkosovic commented 1 year ago

All kernel modules that have been moved to the kernel-modules-extra package (not just L2TP ones) are blacklisted on Fedora, Red Hat Enterprise Linux and derivatives. As far as I'm aware, no other Linux distros do the L2TP Blacklisting.

I only provided unblacklisting L2TP kernel module instructions that were enough for xl2tpd and kl2tpd. Some of the other L2TP kernel modules are for L2TPv3 which xl2tpd and kl2tpd don't implement.

On Fedora, xl2tpd can be started as a systemd service and those files are used with that service. Those xl2tpd files are neither required nor used with NetworkManager-l2tp, which starts its own instance of xl2tpd and points it to its own xl2tpd config files that are generated on the fly.

You might like to reduce the MTU/MRU value in the PPP settings, it might help with the packet loss. The default is 1400, so try 1300, 1200, etc or whatever the VPN server was configured to use if you know. Sorry I've never tried comparing ping times, but suspect MTU/MRU value might affect it.

I'm going to close this issue as someone else has confirmed the routing issue has been fixed :

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/946#note_1725575

@tukusejssirs if you have any other issue, open a new issue. Also, thanks for being the first to point out this routing issue was fixed with the new NetworkManager and NetworkManager-l2tp packages.

tukusejssirs commented 1 year ago

You might like to reduce the MTU/MRU value in the PPP settings, it might help with the packet loss. The default is 1400, so try 1300, 1200, etc or whatever the VPN server was configured to use if you know. Sorry I've never tried comparing ping times, but suspect MTU/MRU value might affect it.

I’ll try this out.

I'm going to close this issue as someone else has confirmed the routing issue has been fixed :

Yeah, I’ve seen it. :wink:

@tukusejssirs if you have any other issue, open a new issue. Also, thanks for being the first to point out this routing issue was fixed with the new NetworkManager and NetworkManager-l2tp packages.

Fair enough. Sorry for bringing up my other issues that are not related to this one; at first I thought they might be related.

Just a quick question: do you think I should open a new issue for losing connection (sort of) on Arch Linux?

Thank you very much for all your help and time! :pray:

dkosovic commented 1 year ago

Just a quick question: do you think I should open a new issue for losing connection (sort of) on Arch Linux?

Please do if you want to. Although I might not have the answer, I’m sure other Arch Linux users will find it useful. If it is an L2TP kernel issue, they tend to hit Arch Linux first, followed by the latest Fedora version. Hopefully it is just an MTU/MRU issue. You might also be able to work out the optimum MTU size by analysing the ping output. Do a Google search and you’ll find lots of examples.
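One common form of the ping-based MTU check looks roughly like this. It is a sketch under stated assumptions: it relies on iputils ping (`-M do` sets don't-fragment), the helper names are hypothetical, the host is a placeholder for a server behind the VPN, and the lower bound of the search range is assumed to get through.

```bash
#!/usr/bin/env bash
# MTU implied by the largest unfragmented ICMP payload over IPv4:
# payload + 20 bytes IPv4 header + 8 bytes ICMP header.
mtu_from_payload() {
  echo $(( $1 + 28 ))
}

# Binary-search the largest payload in [lo, hi] that ping delivers with the
# don't-fragment flag set. Assumes a payload of size lo passes.
probe_payload() {
  local host="$1" lo="${2:-1000}" hi="${3:-1472}" mid
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))
    if ping -M do -c 1 -W 2 -s "$mid" "$host" >/dev/null 2>&1; then
      lo=$mid            # this size fits, try larger
    else
      hi=$(( mid - 1 ))  # too big (or lost), try smaller
    fi
  done
  echo "$lo"
}

# Usage, with the VPN connected:
#   mtu_from_payload "$(probe_payload "$server_ip")"
```

A single lost packet can make the search settle too low, so it is worth repeating the run a few times before changing the MTU/MRU values in the PPP settings.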

nm-l2tp / NetworkManager-l2tp

Something adds a ppp0 route to the gateway making the connection fail #132