MichaelVoelkel opened this issue 10 months ago
Hm, having set the device explicitly to eth0, I now see some rules (yeah, I tried other containers too, but nothing works) and nftables seems to reflect it, but there is no change in behaviour. Apart from that, I see that journalctl, despite the log rules, no longer shows the incoming 9443 packets.
table inet filter {
    chain input {
        type filter hook input priority filter; policy drop;
        tcp dport 22 accept
        log
        log
    }
    chain forward {
        type filter hook forward priority filter; policy drop;
    }
    chain output {
        type filter hook output priority filter; policy accept;
    }
}
table inet dfw {
    chain input {
        type filter hook input priority filter - 5; policy accept;
        ct state invalid drop
        ct state { established, related } accept
        iifname "docker0" meta mark set 0x000000df accept
    }
    chain forward {
        type filter hook forward priority filter - 5; policy accept;
        ct state invalid drop
        ct state { established, related } accept
        iifname "docker0" oifname "eth0" meta mark set 0x000000df accept
        tcp dport 9443 ip daddr 172.17.0.2 iifname "eth0" oifname "br-22eb53281a80" meta mark set 0x000000df accept
        tcp dport 8000 ip daddr 172.17.0.2 iifname "eth0" oifname "br-22eb53281a80" meta mark set 0x000000df accept
        tcp dport 9115 ip daddr 172.20.0.3 iifname "eth0" oifname "br-d65ccc79fc1d" meta mark set 0x000000df accept
    }
}
table ip dfw {
    chain prerouting {
        type nat hook prerouting priority dstnat - 5; policy accept;
        tcp dport 9443 iifname "eth0" meta mark set 0x000000df dnat to 172.17.0.2:9443
        tcp dport 8000 iifname "eth0" meta mark set 0x000000df dnat to 172.17.0.2:8000
        tcp dport 9115 iifname "eth0" meta mark set 0x000000df dnat to 172.20.0.3:9115
    }
    chain postrouting {
        type nat hook postrouting priority srcnat - 5; policy accept;
        oifname "eth0" meta mark set 0x000000df masquerade
    }
}
table ip6 dfw {
    chain prerouting {
        type nat hook prerouting priority dstnat - 5; policy accept;
        tcp dport 9443 iifname "eth0" meta mark set 0x000000df
        tcp dport 8000 iifname "eth0" meta mark set 0x000000df
        tcp dport 9115 iifname "eth0" meta mark set 0x000000df
    }
    chain postrouting {
        type nat hook postrouting priority srcnat - 5; policy accept;
        oifname "eth0" meta mark set 0x000000df masquerade
    }
}
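A log rule can also be placed in the forward chain and given a prefix, so the entries are easy to spot in the journal; roughly something like this (just a sketch, the prefix is arbitrary):
$ nft add rule inet filter forward log prefix "dfw-debug: "
$ journalctl -k -f | grep dfw-debug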
Ok, my log was stupid, because of course I want to log FORWARD now. And there I see something:
Jan 01 13:25:18 v62887.php-friends.de kernel: IN=eth0 OUT=br-d65ccc79fc1d MAC=<filtered> SRC=<filtered> DST=172.20.0.2 LEN=64 TOS=0x00 PREC=0x00 TTL=50 ID=0 DF PROTO=TCP SPT=54694 DPT=9115 WINDOW=65535 RES=0x00 SYN URGP=0 MARK=0xdf
Well, this seems fine. The packet is forwarded towards the docker container, but for some reason nothing happens there, hmmm...
OUT is a bit strange though; this is some veth0 interface, because this whole thing runs on a KVM virtual machine (not managed by me, but by my provider where I buy the hosting solution; that has not been a problem so far though).
Ok, I needed to also add "backend_defaults"... I somehow thought this would not be needed, as it was the default anyway.
Also, in your docs you describe a sample nftables.conf file... This is a REALLY bad one because it will also not allow pinging out or working with established connections. Maybe replacing it with something with more sensible rules would make sense?
I suggest:
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        tcp dport 22 accept
        ct state invalid drop
        ct state { established, related } accept
        ip protocol icmp icmp type echo-request accept
        icmpv6 type echo-request accept
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
        ct state { established, related } accept
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
although it's certainly incomplete because ping6 does not work yet... anyways
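(For ping6 and IPv6 in general to work, at least the ICMPv6 neighbour discovery types also need to be accepted in the input chain; a rough sketch of what could be added:)
icmpv6 type { nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, echo-request, echo-reply } accept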
Hi @MichaelVoelkel, thanks for reaching out. I'll try to go through your various points one-by-one, although some might overlap with others. 🙂
I'm puzzled why I need to state a network. Before I just had "bridge" network, so I also tried putting that as name. Later, I created a new custom bridge network and connected the container to it (which I double-checked via docker inspect) that I called portainer_network and which you see in my rules-file now.
Every Docker container you run has to be attached to some kind of Linux network interface, at least assuming it should be able to connect to the network (which it does unless you specify --network none). When you run a container without specifying --network, Docker uses the default bridge network it creates for itself when it first starts up.
Given that a Docker container will thus always be associated with a virtual bridge network interface, for firewalling to work nftables has to know which network interface packets are destined for or coming from, and thus DFW has to know too, for it to be able to create rules with the correct constraints.
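For illustration, this is roughly how you can check which network, and therefore which bridge interface, a container is attached to (the container name is just an example; portainer_network is the network from your rules file):
$ docker inspect -f '{{range $net, $cfg := .NetworkSettings.Networks}}{{$net}} {{end}}' portainer
$ docker network inspect -f '{{.Id}}' portainer_network
$ ip link show | grep br-
For a user-defined bridge network, the host-side interface is named br- followed by the first characters of the network ID, which is why the br-... names show up in the generated nftables rules.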
I was unsure whether I needed to restart the container first or not, so I did that. I also stopped and started it (which retained the specific network)... nothing helped here.
Unless you run DFW with the --run-once flag (which you haven't, according to your first post), DFW will automatically update the nftables ruleset whenever anything surrounding Docker containers changes. So if you start a container after DFW is already running, and a rule you have defined applies to the container, DFW will automatically roll out this new rule.
If you have started your applications before you started DFW, DFW will still automatically apply all relevant rules, because it also applies all rules whenever it starts up.
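You can observe this yourself, for example like this (assuming the DFW table is named inet dfw, as in the ruleset you posted, and using an arbitrary container name):
$ nft list table inet dfw     # note the current rules
$ docker start portainer      # start a container one of your rules applies to
$ nft list table inet dfw     # the matching rules should show up shortly afterwards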
/etc/nftables exists but I don't use it. I hope it makes no trouble.
I am fairly certain that you are using it, even if you don't think you are: the nftables systemd service uses this file to apply rules when it launches. You can verify this using this command:
$ cat "$(systemctl show -P FragmentPath nftables.service)" | grep '^Exec'
ExecStart=/usr/sbin/nft -f /etc/nftables.conf
ExecReload=/usr/sbin/nft -f /etc/nftables.conf
ExecStop=/usr/sbin/nft flush ruleset
This means that, during start and reload, the systemd unit nftables.service just instructs nftables through the nft command to load the ruleset from the /etc/nftables.conf file. You can verify that the nftables service is used through this command:
$ systemctl show --property ActiveState --property UnitFileState nftables.service
ActiveState=active
UnitFileState=enabled
If the unit is active and enabled, it works as I have described above.
The reason I'm going into so much detail here: my personal suggestion for setting up nftables is to configure your base rules in /etc/nftables.conf, i.e. primarily rules that are not directly related to the Docker containers you are running, and then have DFW take care of the rest.
The following is the /etc/nftables.conf file that I'm using:
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        # Allow local traffic
        iif lo accept
        # Allow related traffic (-> stateful connection tracking)
        ct state { established, related } accept
        # Setup ICMP and ICMPv6
        icmp type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable } accept
        icmpv6 type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable, packet-too-big, nd-router-advert, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld-listener-query } accept
        # Configure SSH
        tcp dport 22 accept
        # reject traffic instead of just dropping it
        reject with icmpx type port-unreachable
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
A few things to note in the input hook:
- iif lo accept enables me to access services locally, which the drop policy would otherwise disallow.
- ct state { established, related } accept is added to allow stateful tracking of traffic, as you have also suggested.
  This is actually not really necessary, because DFW will by default add both ct state invalid drop and ct state { established, related } accept rules to the input hook. The reason for this is that DFW expects stateful tracking to take place, and thus forces the creation of these rules.
  I still add ct state { established, related } accept here though, because I want responses to connections to work on server startup, even before DFW has run.
- I allow ICMP.
- I have a final rule to reject any traffic that didn't match, rather than just dropping it (which I understand to be good hygiene if you aren't trying to hide your host).
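One way to see the difference the reject rule makes (the host is a placeholder; any port that no rule accepts will do):
$ nc -vz <your-host> 12345
With the reject rule the attempt should fail immediately, rather than hang until a timeout as a plain drop would cause.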
(it has the ports mapping of 9443:9443, so I should access it via localhost, e.g., nc -vv localhost 9443). As the connection times out both ways, I'd assume that the packet is dropped.
You are correct that the packet will be dropped. As shown above, I add the iif lo accept rule to enable this kind of traffic to work. I think adding that to the default documentation would likely make sense, because it is very confusing when local traffic doesn't work.
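If you want to try it on the running system before persisting it in /etc/nftables.conf, something like this should work (insert places the rule at the top of the chain):
$ nft insert rule inet filter input iif lo accept
$ nc -vv localhost 9443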
Ok, I needed to also add "backend_defaults"... I somehow thought this would not be needed as it was default anyways.
Do I understand correctly that things work now after you have added backend_defaults, but didn't before?
One thing I did notice in the rulesets you have posted is that DFW does not hook itself into the filter tables, which it will do if the backend_defaults are set up like this:
[backend_defaults]
custom_tables = { name = "filter", chains = ["input", "forward"] }
You can find more details on this field in the documentation here, but the gist of it is this: DFW has to be able to act on traffic when it traverses any one of the input or forward hooks. This can be achieved in one of three ways:
1. Have no other tables that hook input or forward, leaving only DFW's table.
   This is not really feasible, because it would leave you with an entirely open firewall, at least until DFW has run.
2. Ensure that any existing tables that hook input or forward don't drop the traffic before it reaches DFW's tables.
   This is not great if you want your input hook to have a drop policy, which I personally would always want, just to make sure I don't accidentally expose any port I didn't intend to expose.
3. Let DFW know about any existing tables and chains that hook input or forward.
   This is what the custom_tables setting does, and it gives us the best of both worlds: we can ensure that DFW correctly accepts the traffic it is responsible for, while still being able to default to drop in the input hook.
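Once the custom_tables setting is in place and DFW has run, you should be able to see the additional rules DFW created in your own filter table (the exact rules depend on your DFW version):
$ nft list table inet filter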
This is a REALLY bad one because it will also not allow pinging out or working with established connections.
Assuming DFW has run and is instructed to attach to the existing tables, it would work: the output hook does not deny the echo-request (ping), and the input hook would allow related packets to let the echo-response (pong) come through. Without DFW having run though, the default would disallow this from happening, yes.
Regarding established/related packets: I agree, this should be part of the default config. Regarding allowing incoming pings: I don't want to prescribe to a user of DFW whether they want their host to be pingable. I think a good middle ground would be to add it with a comment, i.e. indicating to the user that their host won't be pingable unless they enable that rule.
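In the documented sample configuration that could look roughly like this (just a sketch):
# Uncomment the following if you want your host to respond to incoming pings:
# icmp type echo-request accept
# icmpv6 type echo-request accept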
In summary, the final more-than-minimal configuration that works well for me is this:
/etc/nftables.conf:
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        # Allow local traffic
        iif lo accept
        # Allow related traffic (-> stateful connection tracking)
        ct state { established, related } accept
        # Setup ICMP and ICMPv6
        icmp type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable } accept
        icmpv6 type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable, packet-too-big, nd-router-advert, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld-listener-query } accept
        # Configure SSH
        tcp dport 22 accept
        # reject traffic instead of just dropping it
        reject with icmpx type port-unreachable
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
dfw/main.toml:
[global_defaults]
external_network_interfaces = [
"eno1",
]
[backend_defaults]
custom_tables = { name = "filter", chains = ["input", "forward"] }
dfw/wwtc.toml:
[[wider_world_to_container.rules]]
network = "reverseproxy"
dst_container = "traefik-traefik-1"
expose_port = 443
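As a rough sanity check once both files are in place and DFW has run (the hostname is a placeholder for your server):
$ nft list ruleset | grep 'dport 443'    # the DFW-generated rules for the exposed port should show up
$ nc -vz <your-host> 443                 # run from an external machine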
Tasks:
- Add iif lo accept to documentation for /etc/nftables.conf. https://github.com/pitkley/dfw/pull/711
- Add ct state { established, related } accept to the documentation for /etc/nftables.conf. https://github.com/pitkley/dfw/pull/711
- … /etc/nftables.conf. (… nft --json output and find out whether there are chains it has to hook into, making initial onboarding easier.) https://github.com/pitkley/dfw/issues/710
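(For illustration, the kind of chain information that could be read from the JSON output, assuming jq is available; base chains are the ones that have a hook:)
$ nft --json list ruleset | jq '.nftables[] | select(.chain) | .chain | select(.hook) | {family, table, name, hook}'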
Hi! Thanks for your great, long answer. Yeah, now everything works, maybe apart from locally accessing containers, BUT I will try out your iif lo filter because I don't have this one.
And everything you write sounds really interesting. Of course, it's true that nftables is used as the base. And yes to:
Do I understand correctly that things work now after you have added backend_defaults, but didn't before?
My default policy clearly is drop. And by the way, as for pings, I was just talking about pings going outside. I agree that incoming pings are a different story.
As for the network thingy, I was just thinking: if the docker container only has one network, dfw could theoretically read it from the container and use it, as a convenience idea. But yeah, that's not necessarily needed.
All in all, I need to say: your solution is great!!
Getting nftables running nicely with docker is normally not doable... And it should really be a firewall solution that sits in its own docker container to keep it maintainable. This seems like the best practice. And your repo offers exactly this solution. So thanks a lot!
Hi,
being on Debian 12, I switched to NFT now again. The basic configuration is just from the docs (/etc/nftables.conf).
/etc/nftables exists but I don't use it. I hope it makes no trouble.
I can connect to ssh and nothing else works, so far so good.
Now I have portainer running on 9443, where I could before have world access and host access (it has the ports mapping of 9443:9443, so I should access it via localhost, e.g., nc -vv localhost 9443). As the connection times out both ways, I'd assume that the packet is dropped.
My rules.toml is: (yeah, small, nothing else)
I run dfw currently like this to see the logs:
And yeah, stuff like nc etc. I do via a second ssh session, so I keep it open. :)
So, my problem clearly is that I cannot connect, but I would hope/expect to be able to.
Some more information / peculiarities / questions / comments:
- sudo nft list ruleset does not seem to show any rules that have been created; is this expected?
- I'm puzzled why I need to state a network. Before, I just had the "bridge" network, so I also tried putting that as the name. Later, I created a new custom bridge network, which I called portainer_network and which you see in my rules-file now, and connected the container to it (which I double-checked via docker inspect).
- I was unsure whether I needed to restart the container first or not, so I did that. I also stopped and started it (which retained the specific network)... nothing helped here.
- Probably I messed up something very basic. I tried also to make sure that the old iptables is disabled, but sudo systemctl disable iptables told me it did not even know iptables.
- Hm, when I reset the nft rules to /etc/nftables.conf, it's empty though; when I start dfw, it fills up to the config shown above, so it seems to do something.
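(Regarding the iptables check: Debian 12 does not normally ship an iptables service; if the iptables command is installed at all, it is usually just a frontend for nftables. One quick way to check which backend it would use, the output mentions either nf_tables or legacy:)
$ iptables --version
$ update-alternatives --display iptables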