OpenVPN / openvpn

OpenVPN is an open source VPN daemon
http://openvpn.net
Other
11.07k stars 3.02k forks source link

OpenVPN signals to systemd that it is ready before the VPN connection is up #617

Open me-and opened 1 month ago

me-and commented 1 month ago

Describe the bug OpenVPN sends the READY=1 sd_notify message to systemd once it has completed initialisation, before the VPN tunnel is established. This means that systemd units that depend on the VPN connection being up by ordering themselves after the OpenVPN unit will fail.

To Reproduce Broad outline:

  1. Create a systemd unit for an OpenVPN connection
  2. Create a systemd unit that can only succeed once the OpenVPN connection is up, and order it after the OpenVPN unit with Requires= and After=
  3. Start the second unit. Note that it fails.

My specific scenario has CIFS mount and automount units; the mount will only succeed if the VPN has connected. The trimmed-for-simplicity unit config looks like this:

/etc/systemd/system/openvpn-pdnet.service:

[Unit]
After=network.target
Description=OpenVPN instance ‘pdnet’

[Service]
ExecStart=openvpn --suppress-timestamps --config <file>
Restart=always
Type=notify

/etc/fstab:

//example.org/Directory /mnt cifs rw,credentials=/etc/creds,x-systemd.automount,x-systemd.mount-timeout=60s,nofail,x-systemd.requires=openvpn-pdnet.service,x-systemd.after=openvpn-pdnet.service 0 0

Expected behavior OpenVPN does not signal to systemd that it is ready until the VPN connection is actually up, so that systemd units configured to only start after the OpenVPN unit can safely rely on the connection being available.

Version information:

Additional context The current behaviour is deliberate, and was introduced by e83a8684f0a0d944e9d53cdad2b543cfd1b6fbae in 2017; before that commit the behaviour I'd like was in place. Given this is deliberate behaviour, I'm opening this report to discuss whether we want a fix; the fix itself is pretty straightforward assuming there's consensus that this is actually a bug.

There are three justifications given in the commit message for that change:

As I say, I definitely don't understand the ramifications of at least two of the problems outlined above, and I could well believe I don't fully appreciate the third either. Nonetheless, being able to order one systemd unit after another is exactly what the READY=1 notification is intended for, and this functionality is broken with the current OpenVPN code. At the very least, I'd like the behaviour to be configurable.

I have a patch more-or-less ready to go for what I consider to be the preferable behaviour, and I'm currently running OpenVPN with that patch in place.

me-and commented 1 month ago

The patch I'm currently using is at https://github.com/me-and/nixcfg/blob/d53a799b29de3a80889db1c13b391fef4d422113/overlays/openvpn/openvpn.diff, although I expect it'll want a small amount of tidying for style.

cron2 commented 1 month ago

What does the patch do on openvpn server instances (where "waiting for an established VPN session" might never happen)?

me-and commented 1 month ago

Ah! Good question; I hadn't realised this code was used for server initialisation as well as client initialisation.

In the server scenario, I'd expect OpenVPN to emit the "ready" signal as soon as it can accept incoming VPN connections, which probably means it'll need to emit the signal in different places depending on whether it's acting as a client or server.

I also now suddenly have a much clearer idea of what the p2p use case is. I imagine that mode is the clearest use case for making this configurable, as I can absolutely see people wanting to rely on both OpenVPN being available for connections and for OpenVPN to be fully connected.

cron2 commented 1 month ago

Introducing a new config option is the traditional way we solve things, and now we have hundreds of options that nobody can remember why they were introduced :-) - so this needs good consideration.

Your use case ("client, outbound connection, must wait") and the server case ("signal systemd as soon as incoming connections can be accepted") could be distinguished by looking at the --client option. But... other users might be happy with OpenVPN just starting and succeeding "eventually", not having something wait for it... would that interfere with anything?

p2p mode comes in two kinds - one is the old static key mode, which basically does not negotiate anything (so, "READY" is reached "when everything is ready"). Then, there's TLS p2p mode, which does negotiate, and has a tls-client and tls-server role... so these could again be differentiated by config.

OTOH, maybe we do want an option... pinging @flichtenheld to get more brains.

flichtenheld commented 1 month ago

Note that is is very helpful to also go to the Trac tickets referenced in the commit (https://community.openvpn.net/openvpn/ticket/827, https://community.openvpn.net/openvpn/ticket/801). They show much more about the motivation behind the patch and @dsommers there acknowledges the problems that are raised in this issue. So as you said, this is definitely expected behavior and not an accidental regression. So just reverting this patch will probably be not the correct solution.

Ultimately I guess there are too many use-cases to satisfy everyone by the default behavior. A --client with stable internet but complex dependencies might want different behavior than a --client on the road with intermittent connections. And a --server might want to have different behavior again.

So far I saw two suggestions to make the behavior more flexible:

  1. As suggested above, add a config option (default might depend on --client/--server/etc.). Relatively straightforward, one more option...
  2. In 827 it was suggested that a possible solution might be an additional systemd service that could monitor openvpn and be used to report an additional status. That would probably need to talk to openvpn on the management interface to get a accurate status. This might be relatively complex.
me-and commented 1 month ago

@flichtenheld Thank you! I hadn't known how/where to find the Trac tickets, but that makes a lot of sense.

Having an additional service that could provide additional state information is the approach taken by several other parts of systemd, for example network.target vs network-online.target, or time-set.target vs time-sync.target. But, as you say, that seems like it'll add a lot of complexity, and would only be more useful than the config option approach if someone needed to use the same configuration but have different operations ordered after both the initial setup being completed and after the VPN was fully established.

My inclination would be config options. Something like sd-notify on-init vs sd-notify on-connect, with on-connect being the default behaviour for clients, and on-init being the current behaviour and the default otherwise.