Open me-and opened 1 month ago
The patch I'm currently using is at https://github.com/me-and/nixcfg/blob/d53a799b29de3a80889db1c13b391fef4d422113/overlays/openvpn/openvpn.diff, although I expect it'll want a small amount of tidying for style.
What does the patch do on openvpn server instances (where "waiting for an established VPN session" might never happen)?
Ah! Good question; I hadn't realised this code was used for server initialisation as well as client initialisation.
In the server scenario, I'd expect OpenVPN to emit the "ready" signal as soon as it can accept incoming VPN connections, which probably means it'll need to emit the signal in different places depending on whether it's acting as a client or server.
I also now suddenly have a much clearer idea of what the p2p use case is. I imagine that mode is the clearest use case for making this configurable, as I can absolutely see people wanting to rely on both OpenVPN being available for connections and for OpenVPN to be fully connected.
Introducing a new config option is the traditional way we solve things, and now we have hundreds of options that nobody can remember why they were introduced :-) - so this needs good consideration.
Your use case ("client, outbound connection, must wait") and the server case ("signal systemd as soon as incoming connections can be accepted") could be distinguished by looking at the --client option. But... other users might be happy with OpenVPN just starting and succeeding "eventually", not having something wait for it... would that interfere with anything?
p2p mode comes in two kinds - one is the old static key mode, which basically does not negotiate anything (so, "READY" is reached "when everything is ready"). Then, there's TLS p2p mode, which does negotiate, and has a tls-client and tls-server role... so these could again be differentiated by config.
OTOH, maybe we do want an option... pinging @flichtenheld to get more brains.
Note that it is very helpful to also go to the Trac tickets referenced in the commit (https://community.openvpn.net/openvpn/ticket/827, https://community.openvpn.net/openvpn/ticket/801). They show much more about the motivation behind the patch, and @dsommers there acknowledges the problems that are raised in this issue. So as you said, this is definitely expected behavior and not an accidental regression, and just reverting this patch will probably not be the correct solution.
Ultimately I guess there are too many use-cases to satisfy everyone by the default behavior. A --client with stable internet but complex dependencies might want different behavior than a --client on the road with intermittent connections. And a --server might want to have different behavior again.
So far I saw two suggestions to make the behavior more flexible:
@flichtenheld Thank you! I hadn't known how/where to find the Trac tickets, but that makes a lot of sense.
Having an additional service that could provide additional state information is the approach taken by several other parts of systemd, for example network.target vs network-online.target, or time-set.target vs time-sync.target. But, as you say, that seems like it'll add a lot of complexity, and would only be more useful than the config option approach if someone needed to use the same configuration but have different operations ordered after both the initial setup being completed and after the VPN was fully established.
My inclination would be config options. Something like sd-notify on-init vs sd-notify on-connect, with on-connect being the default behaviour for clients, and on-init being the current behaviour and the default otherwise.
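To make the proposal above concrete, here is a minimal sketch of how the defaults might resolve. It is written in Python purely for illustration; OpenVPN itself is C, and no sd-notify option exists in OpenVPN today. The names NotifyWhen, resolve_notify_policy and should_notify_ready are hypothetical, and the option values are the ones suggested in this comment.

```python
from enum import Enum
from typing import Optional


class NotifyWhen(Enum):
    """Hypothetical values for the proposed (not existing) sd-notify option."""
    ON_INIT = "on-init"        # signal once initialisation has completed
    ON_CONNECT = "on-connect"  # signal only once the tunnel is established


def resolve_notify_policy(option: Optional[str], is_client: bool) -> NotifyWhen:
    """Pick the policy: an explicit option wins, otherwise default by role."""
    if option is not None:
        # Raises ValueError for anything other than the two proposed values.
        return NotifyWhen(option)
    # Proposed defaults: clients wait for the connection, everything else
    # keeps the current signal-after-initialisation behaviour.
    return NotifyWhen.ON_CONNECT if is_client else NotifyWhen.ON_INIT


def should_notify_ready(policy: NotifyWhen, init_done: bool, tunnel_up: bool) -> bool:
    """Decide whether READY=1 may be sent given the current state."""
    if policy is NotifyWhen.ON_INIT:
        return init_done
    return init_done and tunnel_up
```

With these defaults a client that has initialised but not yet connected would stay in systemd's "activating" state, while a server would report ready as soon as initialisation finishes. A user with an intermittent connection could opt back into the current behaviour by setting on-init explicitly.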
Describe the bug
OpenVPN sends the READY=1 sd_notify message to systemd once it has completed initialisation, before the VPN tunnel is established. This means that systemd units that depend on the VPN connection being up by ordering themselves after the OpenVPN unit will fail.
To Reproduce
Broad outline:
Requires= and After=
My specific scenario has CIFS mount and automount units; the mount will only succeed if the VPN has connected. The trimmed-for-simplicity unit config looks like this:
/etc/systemd/system/openvpn-pdnet.service:
/etc/fstab:
Expected behavior
OpenVPN does not signal to systemd that it is ready until the VPN connection is actually up, so that systemd units configured to only start after the OpenVPN unit can safely rely on the connection being available.
Version information:
Additional context The current behaviour is deliberate, and was introduced by e83a8684f0a0d944e9d53cdad2b543cfd1b6fbae in 2017; before that commit the behaviour I'd like was in place. Given this is deliberate behaviour, I'm opening this report to discuss whether we want a fix; the fix itself is pretty straightforward assuming there's consensus that this is actually a bug.
There are three justifications given in the commit message for that change:
"First, it adds challenges if --chroot is used in the configuration; this is already fixed."
I don't use --chroot so I'm not quite sure what the challenges are, but if they've already been fixed then they presumably don't need this fix as well?
"Secondly, it will cause havoc on static key p2p mode configurations where the log line above ["Initialization Sequence Completed"] will not happen before either sides have completed establishing a connection"
I also don't use these configurations, and I don't have a sense of what havoc this causes. But it seems correct to me that the OpenVPN client would not report itself as ready to systemd before the connection has been established.
"And thirdly, if a client configuration fails to establish a connection within 90 seconds, it will also fail. For the third case this may not be a critical issue itself, as the host just needs to get an Internet access established first - which in some scenarios may take much longer than those 90 seconds systemd grants after the OpenVPN client configuration is started."
I do understand this issue, but I think this is the wrong solution. Or at least it's the wrong solution now; it may well be that systemd didn't have better options when this change was made in 2017. The point of reporting that a unit is ready is that systemd knows the unit can be used and other units that depend on it can run. If it's taking a long time for OpenVPN to fully establish a connection, it's correct that systemd should know and be able to act on that situation.
If this is causing problems, the correct solutions IMVHO are either (a) setting a longer timeout before systemd declares the unit has failed, using TimeoutStartSec=, or (b) where there is some other blocking requirement like Internet access, to order the OpenVPN unit after that access has been established, e.g. by specifying Wants=network-online.target and After=network-online.target.
As I say, I definitely don't understand the ramifications of at least two of the problems outlined above, and I could well believe I don't fully appreciate the third either. Nonetheless, being able to order one systemd unit after another is exactly what the READY=1 notification is intended for, and this functionality is broken with the current OpenVPN code. At the very least, I'd like the behaviour to be configurable.
I have a patch more-or-less ready to go for what I consider to be the preferable behaviour, and I'm currently running OpenVPN with that patch in place.
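For readers unfamiliar with the mechanism being discussed: the READY=1 notification is a datagram sent to the Unix socket named in the NOTIFY_SOCKET environment variable, which systemd sets for services that use the notify protocol. The sketch below shows that protocol in Python for illustration only. It is not the OpenVPN code or the patch mentioned above, and the function name sd_notify here is just a local helper modelled on the C function of the same name.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process is
    not running under a manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    address = path
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket, which
        # is addressed with a leading NUL byte instead.
        address = b"\0" + path[1:].encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), address)
    return True
```

The whole disagreement in this issue is about when a call like sd_notify("READY=1") should happen: right after initialisation, as it does now, or only once the tunnel is up.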