Closed hairyhenderson closed 3 months ago
Not sure if this works or has the desired effect. Regular container ports are proxied by docker. So caddy would only be able to manipulate the one within the container?
(unless host networking is used)
@bt90 this was an experiment that probably won't work as-is (see the failing build)
Regular container ports are proxied by docker. So caddy would only be able to manipulate the one within the container?
Right, but ultimately they still require CAP_NET_ADMIN to make certain changes.
I'm not super clear on the reason behind this... it has something to do with improved UDP performance with the QUIC implementation, though it can work without it. From a discussion @francislavoie and I were having yesterday we'll probably just have to document that users should use --cap-add=net_admin
(or whatever) if they want better performance with QUIC 🤔
we'll probably just have to document that users should use
--cap-add=net_admin
(or whatever) if they want better performance with QUIC 🤔
That in combination with --network host
should yield the best performance.
That in combination with
--network host
Probably should be cautious?
CAP_NET_ADMIN
+ --network host
Q1. Can an attacker gain root on my host OS using only the NET_ADMIN capability? Yes (in some cases).
A local user (with CAP_NET_ADMIN capability) could use this flaw to crash the system or potentially escalate their privileges on the system.
CAP_NET_ADMIN allows the capability holder to modify the exposed network namespaces' firewall, routing tables, socket permissions, network interface configuration and other related settings on exposed network interfaces. It should be noted several privilege escalation vulnerabilities and other historical weaknesses have resulted from the ability to leverage this capability.
Support for --network host
varies by platform too IIRC. On Windows within WSL2, while you can run a container with this option, you cannot successfully bind ports that can be reached from other processes on the WSL2 host (I think this is due to the Docker daemon running via a separate VM instance in WSL2).
Regular container ports are proxied by docker.
IIRC this depends on network mode (not just host), and daemon configuration.
user-proxy: true
for some time, but that can be disabled which changes the networking a bit (somewhat inconsistent when iptables is managed and kernel features used to support localhost routing).user-proxy
and eventually remove the proxy in future though, along with a new replacement for managing the network instead of iptables (which fights with UFW/firewalld). This also differs on other platforms than linux, especially via the desktop GUI implementation Docker Desktop (which uses a VM even on linux IIRC).So caddy would only be able to manipulate the one within the container
From within it's network namespace I think? The first link I provided above notes that from kernel 5.0 a feature available via the capability was relaxed to be usable outside of the root
network namespace.
From a discussion @francislavoie and I were having yesterday we'll probably just have to document that users should use
--cap-add=net_admin
(or whatever) if they want better performance with QUIC 🤔
Since the image switched to non-root, this wouldn't be sufficient? You can only add caps for the root
user AFAIK, when the container switches to a non-root user (even if the image sets that via USER
instruction in Dockerfile
) the caps are dropped IIRC. Unless this was about just relaxing the bounding set for =ep
to work.
You can set them via file attributes as a "capability-dumb" binary (=ep
), or if Caddy has capability awareness, it'd check for the capability as permitted (=p
via setcap
needed in a container since ambient caps aren't available) where it can then raise the permitted capability into the effective set.
I'm not a security expert, and I know it's used for other software like Fail2Ban to do it's thing, just making note of it here since the image adopted non-root user, but granting capabilities that add risk may be worth noting (especially when not absolutely needed), rather than just configuring for such implicitly 😅 (granted this requires an explicit --cap-add
anyway..)
If you're going to document the --cap-add
though, you could consider instead documenting the sysctl
that's referenced as an alternative, assuming that's valid to set on the container like the equivalent for CAP_BIND_NET_SERVICE
is (not necessary in Docker since that's already set to 0
by default making the capability unnecessary unless altered).
As a (happy!) user of Caddy in Docker, I would prefer to manually update the UDP receive and send buffer settings rather than have Caddy handle this with the CAP_NET_ADMIN capability on my behalf. I appreciate the focus on usability and performance, but I share @polarathene's remarks that this capability could pose security risks. Perhaps it's possible instead to include some copy-pastable lines in the documentation that make this manual effort less of a burden?
Perhaps it's possible instead to include some copy-pastable lines in the documentation that make this manual effort less of a burden?
@childersd the NETADMIN capability is already documented [in the DockerHub docs](https://hub.docker.com//caddy/) (scroll to Linux capabilities
) - maybe that needs to be more clear? or something?
Given it's been almost a year, this has stalled, and given the potential security impact, I'm going to just close this.
(not necessary in Docker since that's already set to
0
by default making the capability unnecessary unless altered).
👆 For Docker the capability isn't required, as the actual need for it is better managed by a sysctl
.
caddy --help
and other commands for example wouldn't need NET_ADMIN
_).EDIT: I mixed up the capability NET_ADMIN
with BIND_NET_SERVICE
, but both have a sysctl
that can be tuned instead and NET_ADMIN
is required only for an enhancement (QUIC for HTTP/3), thus shouldn't be enforced given my earlier points as it's not mandatory to run Caddy.
maybe that needs to be more clear? or something?
It's clear, although an inline comment above the capability in the compose.yaml
example could inline context for why, or direct the user to the "Linux capabilities" section right above it.
The equivalent sysctl
is not namespaced, thus cannot be adjusted per container. I would still advise raising the needed cap at runtime via Go rather than making it mandatory via kernel check to run caddy
as this PR proposed.
Related to https://github.com/caddyserver/dist/issues/97