Open stuartm opened 1 week ago
> I was forwarding port 53 (tcp/udp) on the host to port 53 on the container
Let's focus on this for a moment. Can you check what process port 53 is bound to? Run `fuser -n tcp 53` and `fuser -n udp 53`. I wonder if another process you don't expect is stealing your packets.
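For reference, this is all I mean; `ss` works too if `fuser` isn't installed (nothing here is specific to this bug, it's just standard socket inspection):

```
# Show which process has TCP/UDP port 53 bound (run as root to see PIDs)
sudo fuser -n tcp 53
sudo fuser -n udp 53

# Alternative: list listening sockets on port 53 with the owning process
sudo ss -tulpn | grep ':53 '
```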
Last night, after opening this ticket, I had a thought: it was just too great a coincidence that the only ports not working were exactly the two forwarded for this container.
Immediately after the upgrade, when I first started the container, it failed to start due to a conflict on port 53 with aardvark-dns. I changed the port forwarding to listen only on the external IP and restarted it, which fixed the binding issue, but then I found that port forwarding was not working. I formed a theory that forwarding rules had been created on that first start but not removed when it failed to bind to port 53 on all addresses. Rather than hunting for where those rules lived (in hindsight, probably just iptables?), I restarted the host, which fixed the issue.
So there is a bug here, but not the one I thought: under a certain failure scenario, podman creates port forwarding rules but then does not clean them up correctly, with the result that all traffic sent to those ports is presumably forwarded to a stale container IP.
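For anyone who hits this before rebooting, this is roughly where I'd now go looking for the leaked rules. The commands are standard; the grep pattern is just the affected ports, since I never confirmed the actual rule or chain names:

```
# Dump the NAT table and look for DNAT rules still targeting the broken ports
sudo iptables -t nat -S | grep -E 'dport (53|8888)'

# Same check on hosts where the rules were written via nftables
sudo nft list ruleset | grep -E 'dport (53|8888)'
```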
This might also explain why so many people were having issues specifically with pihole: the standard pihole instructions enable port forwarding on port 53 for all addresses by default. With the introduction of aardvark-dns on the container interface, the first start of such a pihole container would always fail. Like me, many people would have restricted pihole to listen on an external interface and recreated the container, only to find that things were still not working. Most of them would also have restarted the host at some point and seen the issue disappear, which explains why the original ticket was abandoned.
I don't know whether you want to leave this ticket open for the orphaned port forwarding rule bug; I'll leave that decision to you.
I have no idea how that part works, but if there are stale nftables port forwarding rules (I'm not sure what component would add them?) then there's an actual issue somewhere...
> I formed a theory that forwarding rules had been created on this first start but then not removed when it subsequently failed to bind to port 53 on all addresses.
Yeah, looking at the code, it seems that if we fail to start aardvark-dns we forget to tear down the driver again here: https://github.com/containers/netavark/blob/d3769ed70ce02497e715fe4a3c5c2ea62938c113/src/commands/setup.rs#L152-L156
At least that is what I assume from your description, but we definitely do not clean up on failure there, so leaked iptables rules and interfaces are to be expected in that case.
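Until that is fixed, it should be possible to recover without a full reboot; something along these lines (untested, and `podman network reload` only recreates rules for running containers, so rules leaked by a container that never came up may still need removing by hand):

```
# Tear down and recreate the firewall rules for all running containers
sudo podman network reload --all

# Check whether any leaked DNAT rules remain afterwards
sudo iptables -t nat -S | grep 'dport 53'
```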
I'll move the issue to netavark.
Issue Description
Podman version 5.2.3
The issue I'm seeing is identical to containers/podman#14365, which was closed and locked due to inactivity, but it seems it was never resolved and was affecting at least a few people.
I recently updated my server from Fedora 39 to Fedora 40, after which a pihole container that had been working perfectly stopped functioning. Or rather, as it turns out, port forwarding for that container stopped working, and in a rather interesting way.
I was forwarding port 53 (tcp/udp) on the host to port 53 on the container, and port 8888 on the host to port 80 on the container for pihole's admin interface. After the upgrade, port forwarding broke for both ports.
I've played around with different caps, disabled SELinux enforcement on the host, and disabled the firewall (although it was correctly configured). I've checked and re-checked the container configuration, and even managed to prove that it was working as expected apart from the port forwarding issue (see below).
To cut a very long story short, here's what I discovered after hours of trying to get things working again. I was able to reach both ports through the container IP, demonstrating that the container itself was functioning correctly. When I changed the host ports used for forwarding (8888 > 8765 and 53 > 54), port forwarding worked! The issue is therefore specific to certain ports: in my experience 53, as in the original ticket, but also others, including 8888.
A half dozen other containers, all with port forwards, are unaffected by this issue.
I can't see an obvious connection between ports 53 and 8888, but maybe those two ports share something in common that triggers a thought for someone.
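For concreteness, the tests looked roughly like this; 10.89.0.5 stands in for the container IP and 192.0.2.10 for my external host IP:

```
# Direct to the container IP: both services answer
dig @10.89.0.5 example.com              # DNS on 53 works
curl -s http://10.89.0.5/admin/         # pihole admin UI on 80 works

# Through the host's forwarded ports: traffic just vanishes
dig @192.0.2.10 example.com             # times out
curl -s http://192.0.2.10:8888/admin/   # no response

# After remapping to host ports 54 and 8765, forwarding works again
dig -p 54 @192.0.2.10 example.com
curl -s http://192.0.2.10:8765/admin/
```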
Steps to reproduce the issue
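I didn't capture the exact commands at the time, but the sequence was essentially this (image name and addresses are illustrative, not my real setup):

```
# 1. After upgrading the host to Fedora 40, start a container publishing
#    port 53 on all addresses -- this now conflicts with aardvark-dns and fails
sudo podman run -d --name pihole -p 53:53/tcp -p 53:53/udp -p 8888:80 \
    docker.io/pihole/pihole

# 2. Recreate it binding port 53 only to the external IP; this starts cleanly
sudo podman rm -f pihole
sudo podman run -d --name pihole -p 192.0.2.10:53:53/tcp \
    -p 192.0.2.10:53:53/udp -p 8888:80 docker.io/pihole/pihole

# 3. Traffic to the published ports now disappears until the host is rebooted
```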
Describe the results you received
Port forwarding for some ports is resulting in traffic just disappearing into the void.
Describe the results you expected
Traffic forwarded from the host on mapped ports should reach the container.
podman info output
Podman version 5.2.3 Fedora 40 x86_64
Podman in a container
No
Privileged Or Rootless
Privileged
Upstream Latest Release
No
Additional environment details
Additional information