Running OpenVPN 3 Linux inside a container

dsommers commented 2 years ago

This is just a tracking ticket, split out of issue #50, to track the possibility of, and the requirements for, running OpenVPN 3 Linux inside a container.

There are several challenges here, depending on how high we set the bar with regard to isolation and privileged access. Most of the openvpn3-service-* processes run with basically no privileges. The exception is openvpn3-service-netcfg (aka netcfg, short for "net config").

The netcfg service requires privileges to change the network configuration (adding/removing virtual interfaces, configuring IP addresses and routing); this requires CAP_NET_ADMIN. In addition, netcfg needs either file access to read and manipulate /etc/resolv.conf or the ability to interact with systemd-resolved over D-Bus. Manipulating resolv.conf adds CAP_DAC_OVERRIDE. If --redirect-method bind-device is used, CAP_NET_RAW is also required.
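As a rough illustration of the paragraph above, the mapping from netcfg options to capabilities can be written down as a small lookup. This is only a sketch: the function name and its parameters are hypothetical and not part of OpenVPN 3 Linux. Only the three capability names and the conditions that trigger them come from the description above.

```python
def required_capabilities(manages_resolv_conf=False, redirect_method=""):
    """Return the set of capabilities netcfg would need for the given setup.

    Hypothetical helper for illustration only; the rules mirror the
    description above and nothing else.
    """
    # Always needed: create/destroy interfaces, set IP addresses and routes.
    caps = {"CAP_NET_ADMIN"}
    # Needed when netcfg rewrites /etc/resolv.conf directly.
    if manages_resolv_conf:
        caps.add("CAP_DAC_OVERRIDE")
    # Needed when --redirect-method bind-device is in use.
    if redirect_method == "bind-device":
        caps.add("CAP_NET_RAW")
    return caps
```

With the defaults (no resolv.conf handling, no bind-device), the result is CAP_NET_ADMIN alone, which is the minimum a container would have to be granted.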

All of the capabilities above currently require the container to be given more privileges as well. As a first step, it might be acceptable to give fairly broad capabilities and privileges on the system, but ideally these should be restricted as much as possible and stay inside the container only.

RafayAK commented 2 years ago

I have created an openvpn-connector docker container for easy deployment of OpenVPN Cloud Connectors. I could perhaps help in this.

https://github.com/RafayAK/openvpn-cloud-connector

dsommers commented 2 years ago

To be honest, I don't see the point of using containers when they must run with --privileged. That means the whole container has elevated privileges.

You get at least the same level of security running OpenVPN 3 Linux outside the container out of the box, and in particular if you're on a system with SELinux.

A Dockerized solution for OpenVPN only makes sense when the container can run unprivileged and still be able to establish a connection and set up the networking accordingly. Clearly, some parts need to run with CAP_NET_ADMIN privileges - and this is the crux of this challenge. Where and how can the openvpn3-service-netcfg service run with the proper privileges and still be available from inside unprivileged containers?

RafayAK commented 2 years ago

hmm? I think I can run my container unprivileged. Can't really recall why I made it --privileged.

Also, my use-case (deployment at client locations) benefits from a containerized OpenVPN stack, which makes stuff a whole lot simpler.

RafayAK commented 2 years ago

I'll get back to you after I deploy an unprivileged container on a test machine.

dsommers commented 2 years ago

You need --privileged because that is required to create/destroy TUN/ovpn-dco devices, configure IP addresses and routing.

RafayAK commented 2 years ago

Yeah, but I'm doing nothing on the host, I haven't passed --net=host, running ifconfig doesn't show any TUNs. Basically all the requests to the container are forwarded from the container to the host over the docker0 interface.

RafayAK commented 2 years ago

You're right, my implementation will not work without the --privileged argument; I just tested it. The ubuntu-systemd container I'm using has this as a prerequisite.

RaphMad commented 2 years ago

I created an alpine/musl based docker container https://github.com/RaphMad/openvpn3_linux_docker, along with a sample docker-compose.yml that shows its basic usage.

Needing to run as root and requiring privileged: true are the biggest letdowns at the moment, and I think that most could be overcome by using capabilities.

Some observations:

There already seems to be some capability handling in place for the netcfg service: https://github.com/OpenVPN/openvpn3-linux/blob/master/src/netcfg/openvpn3-service-netcfg.cpp#L88
There is a hardcoded requirement to run as root for netcfg: https://github.com/OpenVPN/openvpn3-linux/blob/master/src/netcfg/openvpn3-service-netcfg.cpp#L119
The dbus service for netcfg wants to switch to the root user: https://github.com/OpenVPN/openvpn3-linux/blob/master/src/service-autostart/net.openvpn.v3.netcfg.service.in

In theory, having the CAP_NET_ADMIN permission on the netcfg executable as well as starting the container with cap_add: - NET_ADMIN should be enough to create tun devices without --privileged / root.

What stops me from progressing atm is that removing privileged and even adding cap_add: - ALL will result in permissions errors when attempting to create the tun device (I created a similar container for wireguard, and wg devices have no problem being created by a setcapped executable with NET_ADMIN, even when run as non-root and without --privileged).

A second pain point is dbus itself - the current services seem very tailored to being run on a privileged system dbus, which is hard to accomplish within an unprivileged container, potentially as non-root. I'm simply lacking dbus knowledge here, but I think again that it should be possible to run it containerized without the need for broad permissions.

dsommers commented 2 years ago

In theory, it should be possible to check if openvpn3-service-netcfg has the CAP_NET_ADMIN privilege (and possibly a few more, depending on a few of the options) before dropping root privileges. I have not spent time on that yet, but it's something far down on my list (unless someone else takes a dive at it).
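The kind of check described above can be tried out from inside a container before touching any service code. The sketch below is Python, not the C++ used by openvpn3-service-netcfg, and it is not how netcfg does it. It reads the effective capability mask from the CapEff line of /proc/<pid>/status and tests individual bits. The helper names are made up for illustration. The bit numbers are the ones defined in the Linux capability header.

```python
# Capability bit numbers as defined by Linux (linux/capability.h).
CAP_DAC_OVERRIDE = 1
CAP_NET_ADMIN = 12
CAP_NET_RAW = 13


def effective_caps(status_text):
    """Extract the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap):
    """Return True if capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def current_process_has(cap):
    """Check the calling process itself (Linux only)."""
    with open("/proc/self/status") as handle:
        return has_cap(effective_caps(handle.read()), cap)
```

Calling current_process_has(CAP_NET_ADMIN) as a non-root user inside the container shows whether the capability actually arrived in the effective set, which is the precondition for dropping root as discussed above.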

AFAIK, you can't tell the D-Bus daemon to auto-start a process with certain privileges. But it is possible to start openvpn3-service-netcfg manually and add the --idle-exit 0 argument; that ensures it will never stop running automatically when idle. If this process can be started as non-root with the proper privileges, this could work with the intended fix above.

Now, I'm not yet quite convinced that alone will be enough. And the container still runs with somewhat elevated privileges anyway. I want to avoid even that. So I'm wondering if there's a possibility to have the netcfg service running outside the container, while being accessed from inside the container. This way, the container can run entirely unprivileged. But it needs a few more tweaks too, as netcfg and the docker/podman network stack need to be aware of each other, to pass the traffic correctly in the end.

RaphMad commented 2 years ago

Wouldn't running netcfg on the host always create the tun adapter on the host, which is a quite different use-case than providing a tunnel within a namespaced network environment only to other containers?

A "part inside / part outside container" solution seems like it would defeat some core containerization principles, like host-independent deployments etc.

I think if the goal is getting VPN access on a host machine, a containerized solution gives little benefit in general, because the "holes to punch" would be pretty substantial and roughly equivalent with any solution: either joining the host network and getting privileges/capabilities to modify the host, or mounting the host's dbus socket and weakening apparmor rules so that they allow communication with a privileged netcfg host service.

RaphMad commented 2 years ago

One of the weirder things when trying to minimize capabilities for my docker container is that privileged: true works, but removing it and setting all of apparmor:unconfined, seccomp:unconfined and cap_add: -ALL still results in an error:

dbus-daemon[26]: [system] Successfully activated service 'net.openvpn.v3.netcfg'
Session path: /net/openvpn/v3/sessions/b35c24f4s3954s412as8c44s0a3b29a1e582
terminate called without an active exception
session-start: ** ERROR ** Failed to start session

This seems at odds with most documentation I could find (Link1, Link2), which usually seems to claim that the combination of apparmor:unconfined, seccomp:unconfined and cap_add: -ALL is equivalent to privileged: true.

Without that starting point it's kind of hard to even begin trimming down permissions...
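One difference worth ruling out (an assumption, not something confirmed in this thread) is device access. A privileged container also gets the host's device nodes, while cap_add on its own does not add any. Creating a TUN interface goes through the /dev/net/tun character device, whereas WireGuard interfaces are created over netlink without a device node, which would fit the wg observation above. A minimal sketch for checking this from inside the container follows. The function name and the returned strings are made up for illustration.

```python
import os
import stat


def tun_device_status(path="/dev/net/tun"):
    """Report whether the TUN control device is present and can be opened."""
    try:
        info = os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
    except PermissionError:
        return "permission denied"
    if not stat.S_ISCHR(info.st_mode):
        return "not a character device"
    try:
        # Opening read/write is the first step of creating a TUN interface.
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        return "cannot open: " + str(exc)
    os.close(fd)
    return "ok"
```

If this reports "missing" in the non-privileged container, passing only that device through (devices: - /dev/net/tun in docker-compose, or --device /dev/net/tun with docker run) would be the next thing to try alongside cap_add: - NET_ADMIN.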

dsommers commented 2 years ago

I see your point, @RaphMad. But the moment you start doing networking with containers, the container needs access to a network outside the container itself. Network namespaces help separate them from each other, and allow containers both to have their own independent cross-container network and to access "the world" from within the container.

My take is that netcfg needs to be allowed to do some limited network operations to be able to do the proper VPN configuration, regardless of whether it is inside or outside a container. At the same time, a core goal of OpenVPN 3 Linux is to run and operate with as few privileges as possible. If that goal means that the whole container needs to be given the same privileges as netcfg requires to operate, we've lost parts of the goal.

So netcfg and the container management need to be aware of each other, so that netcfg can create and configure a VPN inside a container while not requiring the container to have any elevated privileges. It's a high and challenging goal. But I refuse to accept it as impossible at this point :slightly_smiling_face: And this might mean that netcfg needs to learn how to configure a VPN network inside a specific namespace.
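To make "a specific namespace" a bit more concrete: on Linux, the network namespace of a process is visible as the symlink /proc/<pid>/ns/net, whose target looks like net:[4026531840]. Two processes share a network namespace exactly when those targets match. The sketch below only shows that lookup. It is not something netcfg does today, and the helper names are hypothetical.

```python
import os
import re


def netns_id(link_target):
    """Parse the namespace inode from a link target such as 'net:[4026531840]'."""
    match = re.fullmatch(r"net:\[(\d+)\]", link_target)
    if match is None:
        raise ValueError("unexpected namespace link: " + repr(link_target))
    return int(match.group(1))


def netns_of(pid):
    """Return the network namespace inode of a process (Linux only).

    Reading this link for a process owned by another user may require
    extra privileges.
    """
    return netns_id(os.readlink("/proc/{}/ns/net".format(pid)))


def same_netns(pid_a, pid_b):
    """Return True if both processes live in the same network namespace."""
    return netns_of(pid_a) == netns_of(pid_b)
```

A netcfg running outside the container would need some identifier like this (or a path to the namespace) to know where the VPN interface is supposed to end up.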

dsommers commented 2 years ago

> This seems at odds with most documentation I could find (Link1, Link2), which usually seems to claim that apparmor:unconfined, seccomp:unconfined and cap_add: -ALL and privileged: true are equivalent.
>
> Without that starting point its kind of hard even beginning to trim down permissions...

I don't know much about apparmor (I'm more of an SELinux person). If we start simple with openvpn3-service-netcfg, using the default settings without any --resolv-conf setting and without touching --redirect-method, all it needs is CAP_NET_ADMIN. That's what is required to create/destroy interfaces, configure IP addresses and set up routes.

RaphMad commented 2 years ago

I understand your points, and I already have some experience with mounting the dbus socket into containers.

The challenge for that use-case would be to create an apparmor profile that's not completely unconfined, but would just allow communication with netcfg (ideally even down to the messages exchanged).

Being able to specify the network namespace would be a great feature; the only thing I'm debating is that the solution would not be host-agnostic (swarm, kubernetes, etc...).

OpenVPN / openvpn3-linux

Running OpenVPN 3 Linux inside a container #86