Closed by luckylinux 2 months ago
A friendly reminder that this issue had no activity for 30 days.
Yep, difficult to replicate unfortunately :(.
@luckylinux What podman version are you using currently? Do you still see this issue?
@Luap99: Podman 4.9.4 on Debian/Ubuntu, Podman 5.1.1 on Fedora
As for the Issue, well, the moment I say "No" is the moment where the Issue will occur, so I'm NOT going to say anything :laughing:.
I made some changes around the auto-restart networking behaviour recently (https://github.com/containers/podman/commit/15b8bb72a8e984c56a5f9a38986b651971182e84, in v5.1), so I wonder if it has anything to do with that.
I am going to close this, but feel free to ping me when it happens again.
Issue Description
It seems that, in some cases (maybe after killing a `podman create` or `podman run`, or running `podman-compose up -d` before running `podman-compose down` first), some network configuration is never removed. This leads to inconsistent behavior when running "client" containers.
In this specific case, I am running a PostgreSQL database server first (`migration-postgresql-testing`). Then I run a PGLoader image as well as several other PostgreSQL database server containers with client-only functions (overkill, but I did not find an image which only contained `psql`, `pg_dump` and `pg_restore`). The client containers then cannot consistently resolve the DNS name of `migration-postgresql-testing`.

I lost approximately 2 days trying to sort this out, because sometimes some container instances seemed to work; adding a `sleep 5-60` prior to executing the command seemed to help in some cases, but never really fixed the issue. Statistically, about 50% of the client containers would fail to reach the database server, the message being either of:
The faulty configuration can be found simply by issuing a `grep -r {{DANGLING_IP_ADDRESS}}`:
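As a sketch, that search can be wrapped in a small helper (the function name and the default directory are my own illustration, assuming rootless podman keeps its per-network DNS files under `/run/user/$UID/networks`):

```shell
# Sketch: search every file under the rootless podman runtime networks
# directory for a stale IP. Function name and defaults are illustrative,
# not part of podman itself.
find_stale_ip() {
    local stale_ip="$1"
    local netdir="${2:-/run/user/$(id -u)/networks}"
    # -r: recurse, -l: list matching file names only
    grep -rl -- "$stale_ip" "$netdir" 2>/dev/null
}
```

For example, `find_stale_ip 10.89.0.21` would print the offending `aardvark-dns` file(s).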
Steps to reproduce the issue
When I was having this issue I tried a lot of different steps, including:

- `podman system reset`
- deleting the `graphRoot` folder (as `root`, to ensure a good "fresh" start)
- `podman stop` / `podman rm` before any `podman run` / `podman create`
- `podman-compose down` before any `podman-compose up -d`
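For reference, the destructive part of those steps can be combined into one helper. This is only a sketch under my own naming: it echoes the commands instead of running them unless `RUN_FOR_REAL=1`, and the default `graphRoot` path is the usual rootless location rather than anything queried from podman:

```shell
# Sketch of the reset steps listed above, combined into one helper.
# The graphroot default is the common rootless storage path; check your
# actual value with `podman info` before running any of this for real.
full_reset() {
    local graphroot="${1:-$HOME/.local/share/containers/storage}"
    local runner="echo"   # dry-run by default: print instead of execute
    if [ "${RUN_FOR_REAL:-0}" = "1" ]; then
        runner=""
    fi
    $runner podman stop --all          # stop everything first
    $runner podman rm --all            # then remove the containers
    $runner podman system reset --force
    $runner rm -rf "$graphroot"        # the "fresh start" step from the list
}
```

Called without `RUN_FOR_REAL=1`, it only prints the four commands it would run.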
It is possible that the issue occurs, for instance, when doing `podman-compose up`, then sending `SIGTERM` with `CTRL+C`, then re-running `podman-compose up -d`. Or a similar situation with `podman` directly. Not sure.

Another option, which caused me many issues yesterday, is related to how `podman` takes arguments, as well as (my very basic :upside_down_face:) knowledge of bash variable expansion inside functions. You can have a look at https://github.com/luckylinux/migrate-sql-database/blob/main/functions.sh, but right now I settled on a bash array for every argument, then expand that using `"${ARRAY[*]}"`, and finally use bash `eval` to run the command.

I say this because many times my script `migrate.sh` would not correctly quote the argument to `bash -c`, resulting in `Command not found` errors. Other times, it seemed that `podman run` (which I was using previously instead of the `podman create` + `podman start` that `podman-compose up -d` uses) didn't like the arguments I was passing (either due to double quotes or the lack thereof, variable expansion not working well with items containing spaces, ...). To try to replicate, it might be possible to look at previous versions of `functions.sh` as well as old versions of `migrate.sh`. I was using code such as https://github.com/luckylinux/migrate-sql-database/commit/b88e6876b0bcd7ae420e3a9f8bd297cfada380e1 or earlier commits, before switching to bash arrays for argument expansion like I'm doing now.

Unsure. Maybe something there left podman in an unclean state?
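On the quoting point, a minimal sketch (the function name is mine, not from `migrate.sh`) of why a bash array helps: expanding it with `"${args[@]}"` yields exactly one word per element, so arguments containing spaces survive without `eval`, whereas joining with `"${ARRAY[*]}"` and re-parsing via `eval` is where `Command not found`-style quoting bugs creep in:

```shell
# Sketch (names are illustrative): pass arguments through a bash array.
# "${args[@]}" expands to exactly one word per element, so values with
# spaces survive with no eval and no re-parsing of a joined string.
run_with_args() {
    local -a args=("$@")
    # Show the word boundaries the command would actually receive.
    printf '<%s>\n' "${args[@]}"
}
```

For example, `run_with_args psql "host=db port=5432"` prints `<psql>` and `<host=db port=5432>`: the space-containing argument stays a single word.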
Describe the results you received
Example of result BEFORE removing the offending `/run/user/1000/networks/aardvark-dns/homeassistant` file. Please note that the DNS hostname resolution keeps jumping between `10.89.0.109` (correct) and `10.89.0.21` (a dangling configuration for the same container `name`, but NOT the same `id`):

Example AFTER removing the offending `/run/user/1000/networks/aardvark-dns/homeassistant` file:

Finding the culprit by running `grep -r "10.89.0.21" /run/user/1000/`:

Even if a "dangling network" was previously causing issues, the offending containers (which have since been deleted) still reside inside the file. `cat /run/user/1000/networks/aardvark-dns/homeassistant_internal` yields, for instance:

This also occurs if, to try to replicate the issue, I use the same network name
`homeassistant_internal`. Newly created containers get correctly added to the file, and removed again, as long as they are removed and the network is no longer listed in the updated `--net` section of `podman run` or `podman create`.

But if I now use `homeassistant_internal` for my "loop" script's "normal" container network, then I can replicate the issue, provided the dangling file is around. `cat /run/user/1000/networks/aardvark-dns/homeassistant_internal` yields:

With the script content changed slightly:

And the result:
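Based on the `IP name1,name2,...` line format visible in the `cat` output quoted in this issue, the dangling state can be detected, and stripped by hand as a stopgap, with something like the sketch below. Both helper names are mine, nothing here is podman or aardvark-dns tooling, and it is not guaranteed that aardvark-dns picks up a hand edit without a restart:

```shell
# Sketch, assuming one "IP name1,name2" line per container, as seen in
# the grep/cat output quoted in this issue.

# Print hostnames mapped to more than one IP (the dangling case).
find_duplicate_names() {
    awk '{
        n = split($2, names, ",")
        for (i = 1; i <= n; i++) seen[names[i]] = seen[names[i]] " " $1
    }
    END {
        for (name in seen)
            if (split(seen[name], ips, " ") > 1)
                print name ":" seen[name]
    }' "$1"
}

# Stopgap cleanup: rewrite the file without lines whose IP field is stale.
remove_stale_entry() {
    local file="$1" stale_ip="$2"
    awk -v ip="$stale_ip" '$1 != ip' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, `find_duplicate_names /run/user/1000/networks/aardvark-dns/homeassistant_internal` would flag a hostname that resolves to both the live and the dangling IP.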
But why is there a dangling file in the first place??? And why isn't `podman rm` deleting OLD entries of that container from `/run/user/1000/networks/aardvark-dns/{{DANGLING_NETWORK}}`?

Describe the results you expected
DNS resolution working correctly. `podman stop` & `podman rm` removing the deprecated hostname-to-IP-address association from the temporary `/run/user/1000/networks/aardvark-dns/{{NETWORK_NAME}}` file.

For instance, why doesn't `podman rm migration-postgresql-testing` change the file `/run/user/1000/networks/aardvark-dns/homeassistant_internal` to:

```
10.89.1.1 954d063412cbf12ddcb453931a3b980d41a5ddef53e88cbc665c624dcd041429
10.89.1.243 network-debug-utils,954d063412cb
```

Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
No
Additional environment details
Baremetal Ubuntu AMD64 Host.
Ubuntu Mantic 23.10 with Podman 4.9.3 pinned from Ubuntu Testing/Noble.
Additional information