Open jpalus opened 1 month ago
Did you run out of inotify instances for your user by chance? Can you check if you see "...use a timer" in your journal, written by pasta?
Fundamentally the netns quit code is racy, not just for pasta but for slirp4netns and rootlessport as well. pasta exits when we unmount the netns path (detected either via inotify or a 1 s poll interval), so when ports are bound and we restart, we may launch the new pasta before the old pasta has exited. Falling back to the timer of course makes this much easier to reproduce.
With slirp4netns and rootlessport the situation is better because we pass a pipe down into conmon, and once conmon closes the pipe they exit. The pipe is closed directly after the container dies, so a race is much less likely: the window until the restart is much longer than with the netns unmount step. However, it is still racy.
Overall I think the right fix would be to track the pids of these processes and SIGKILL them during container teardown. But SIGKILL is not synchronous either, so we would still have to wait for the exit somehow. There is also the problem of potential pid reuse: when the network process exits, podman cannot notice it because we are not a daemon, so another process might get the same pid and we would end up killing the wrong process. We have such problems elsewhere, so I would say it is already an accepted risk.
Issue Description
`podman restart <container>` fails when using `pasta` with a published port. Works fine with `slirp4netns`, though.

Steps to reproduce the issue
Describe the results you received
Restart fails:
Describe the results you expected
Restart succeeds.
podman info output
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Additional information