containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Intel macOS qemu_podman-machine-default.sock: connect: no such file or directory #13609

Closed fithisux closed 1 year ago

fithisux commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I upgraded podman to 4.0.2 with brew. Now podman machine start fails, and podman commands cannot connect to the VM.

Steps to reproduce the issue:

1. brew uninstall podman
   brew install podman
2. podman machine rm
   podman machine init
   podman machine start

Describe the results you received:

Starting machine "podman-machine-default"
INFO[0000] waiting for clients...
ERRO[0000] Error listening on socket: /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default/podman.sock: listen unix /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default/podman.sock: bind: invalid argument
Error: dial unix /var/folders/70/tpb1l8jd45s8zj742ljrmbs80000gp/T/podman/qemu_podman-machine-default.sock: connect: no such file or directory

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: Connection to bastion host (ssh://core@localhost:51528/run/user/502/podman/podman.sock) failed.: dial tcp [::1]:51528: connect: connection refused

Output of podman info --debug:

Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: Connection to bastion host (ssh://core@localhost:51528/run/user/502/podman/podman.sock) failed.: dial tcp [::1]:51528: connect: connection refused

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.): Darwin C02CH2W2MD6R.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64 x86_64

Luap99 commented 2 years ago

On Darwin and other BSD operating systems the maximum length of a socket path seems to be 104 chars. On Linux it is 108 chars. Your path is 105 chars long.

Can you try to create a user with a shorter name and see if this works?
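
A rough sketch of that length check, in POSIX shell. The function names are made up for illustration, and it assumes the terminating NUL byte has to fit inside sun_path as well, so that a 104-character path needs 105 bytes:

```shell
# Size of sun_path as quoted above: 104 on Darwin/BSD, 108 on Linux.
sun_path_size() {
  case "$1" in
    Darwin|*BSD) echo 104 ;;
    *) echo 108 ;;
  esac
}

# Bytes needed for the path plus its terminating NUL
# (assumption: the NUL has to fit inside sun_path too).
socket_path_bytes() {
  echo $(( $(printf '%s' "$1" | wc -c) + 1 ))
}

# Succeeds if the path fits on the given OS (defaults to the current one).
socket_path_fits() {
  [ "$(socket_path_bytes "$1")" -le "$(sun_path_size "${2:-$(uname -s)}")" ]
}
```

For the podman.sock path in the error output, socket_path_bytes prints 105, so socket_path_fits fails for Darwin, while the same path would still pass for Linux.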

baude commented 2 years ago

consider the following:

  1. podman system connection ls: how many listings are there? Are you certain the default listing is correct for the default machine? If you don't know and don't care, rm the machine, then double-check the system connections. If some still exist, remove them and start over at init.
  2. Do the dirs being cited exist? Do you have write permissions to them?
  3. Are there any security policies on your Mac that are blocking things?
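
The first two checks can be walked through with a couple of small helpers. This is only a sketch: check_dir and purge_connections are hypothetical names, and the directories to check are the ones cited in the error output.

```shell
# Report whether a directory exists and is writable by the current user.
check_dir() {
  if [ ! -d "$1" ]; then
    echo missing
  elif [ ! -w "$1" ]; then
    echo read-only
  else
    echo ok
  fi
}

# Remove the named system connections, one by one.
# The names come from `podman system connection ls`.
purge_connections() {
  for name in "$@"; do
    podman system connection rm "$name"
  done
}
```

Typical use would be to run podman system connection ls first, then check_dir "$HOME/.local/share/containers/podman/machine", and after podman machine rm pass any leftover connection names to purge_connections before starting over at podman machine init.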

baude commented 2 years ago

wooh, @Luap99 nice find.

Luap99 commented 2 years ago

@baude Maybe this workaround helps: https://github.com/containers/podman/blob/f049cba47c31d31a4a8ed9a9180f0e847be3411c/cmd/rootlessport/main.go#L201-L217

But I have no idea if this would work on darwin.

baude commented 2 years ago

@Luap99 or @fithisux you could also create a shortened machine name. i.e. podman init --now foo

baude commented 2 years ago

Verified on my M1: the long name indeed throws the same errors.

baude commented 2 years ago

i got this one ...

fithisux commented 2 years ago

> On darwin and other BSD operating systems the maximum path length for the socket path seems to be 104 chars. On linux it is 108 chars. Your path is 105 chars long.
>
> Can you try to create a user with a shorter name and see if this works?

Unfortunately this is not possible. But my issue is that the files / sockets are not there at all:

Last login: Wed Mar 23 15:14:37 on ttys001
➜ ~ ls /var/folders/70/tpb1l8jd45s8zj742ljrmbs80000gp/T/podman
➜ ~ ls -l /Users/vassilisanagnostopoulos/.local/share/containers/podman/machine/podman-machine-default
➜ ~

fithisux commented 2 years ago

> podman init --now foo

This does not seem to work:

➜ ~ podman init --now foo
Error: unknown flag: --now
See 'podman init --help'
➜ ~

Luap99 commented 2 years ago

use podman machine init ...
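
Why a shorter machine name helps can be seen from the byte budget of the socket path. A sketch under assumptions: max_machine_name_len is a hypothetical helper, the path layout is the one from the error message in this issue ($HOME/.local/share/containers/podman/machine/NAME/podman.sock), and one byte is reserved for the terminating NUL.

```shell
# Longest machine name that still fits, given a home directory and an OS.
max_machine_name_len() {
  home=$1
  os=${2:-$(uname -s)}
  case "$os" in
    Darwin|*BSD) size=104 ;;   # sun_path size on Darwin/BSD
    *) size=108 ;;             # sun_path size on Linux
  esac
  # Everything in the socket path except the machine name itself.
  fixed="$home/.local/share/containers/podman/machine//podman.sock"
  echo $(( size - 1 - $(printf '%s' "$fixed" | wc -c) ))
}
```

For /Users/vassilisanagnostopoulos on Darwin this prints 21, one short of the 22 characters in podman-machine-default, so a name such as foo (podman machine init --now foo) fits comfortably.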

fithisux commented 2 years ago

> use podman machine init ...

Works now. Thank you.

afbjorklund commented 2 years ago

You should still be able to use ssh sockets just fine, it's just the legacy unix sockets that have this limit.
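
As an illustration of the ssh route, the connection URL has the shape shown in the error output earlier in this issue. The helper below is hypothetical; the port and the user id differ per machine, and the real values are listed by podman system connection list.

```shell
# Build the ssh URL for a machine, given its forwarded ssh port and
# the uid of the user inside the VM (both vary per machine).
machine_ssh_url() {
  echo "ssh://core@localhost:$1/run/user/$2/podman/podman.sock"
}
```

Once the VM is running, something like podman --url "$(machine_ssh_url 51528 502)" --identity ~/.ssh/podman-machine-default info should then go over ssh instead of the local unix socket. The port, uid and identity path here are the ones that appear in this thread, not general defaults.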

pigping88 commented 1 year ago

mac apple A1, podman 4.3.1

podman machine init
podman machine start

Starting machine "podman-machine-default"
Waiting for VM ...
Error: dial unix /var/folders/f3/v585c0593yg1v4qqs5tjnw600000gn/T/podman/podman-machine-default_ready.sock: connect: no such file or directory

podman machine ssh

Connecting to vm podman-machine-default. To close connection, use `~.` or `exit`
Fedora CoreOS 37.20221211.2.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos

ssbarnea commented 1 year ago

Reopening because this issue is still happening with GHA runners as of today, see https://github.com/ansible/vscode-ansible/actions/runs/4005037985/jobs/6874912745

2023-01-25T10:34:30.5842600Z Downloading VM image: fedora-coreos-37.20230110.2.0-qemu.x8…
2023-01-25T10:34:30.7628400Z Downloading VM image: fedora-coreos-37.20230110.2.0-qemu.x8…
2023-01-25T10:34:37.9151470Z Extracting compressed file
2023-01-25T10:35:09.9899520Z Image resized.
2023-01-25T10:35:09.9923940Z Machine init complete
2023-01-25T10:35:09.9935290Z Starting machine "podman-machine-default"
2023-01-25T10:35:10.5067840Z Waiting for VM ...
2023-01-25T10:35:13.5424550Z Error: dial unix /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/podman/podman-machine-default_ready.sock: connect: no such file or directory
2023-01-25T10:35:13.5717740Z task: Failed to run task "setup": exit status 125
2023-01-25T10:35:13.5751490Z ##[error]Process completed with exit code 1.

I think that the command that caused it to fail was podman machine init --now.

ssbarnea commented 1 year ago

@baude Any idea what could have caused this regression? What can we do to make the podman initialization reliable on GHA?

baude commented 1 year ago

This is sort of a general error for "something went wrong". It is entirely possible that the cause is different from this issue. Does removing the machine and recreating it help?

ssbarnea commented 1 year ago

While OK locally, turn-it-off-and-on does not really work with GHA. Still, I observed that while it does reproduce, it does not always reproduce, so there is a level of randomness in it.

If you could provide some hints regarding how we can get extra logs when it happens, I might be able to alter the GHA pipelines to collect the extra information.

If we manage to get podman to be reliable on GHA, we might have a chance of convincing GitHub to add it to the default runner image.

ctrought commented 1 year ago

My system (M1 Mac) hit an out-of-memory condition while podman was running, which led to issue #16945 on startup. I rebooted, and now podman won't start up, hitting the error in this issue :/ Is there any workaround that does not involve wiping the original podman machine?

$ podman machine start  --log-level debug
INFO[0000] podman filtering at log level debug
Starting machine "podman-machine-default"
[/opt/podman/qemu/bin/gvproxy -listen-qemu unix:///var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/qmp_podman-machine-default.sock -pid-file /var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/podman-machine-default_proxy.pid -ssh-port 50044 -forward-sock /Users/XXXXXXXXXXXX/.local/share/containers/podman/machine/podman-machine-default/podman.sock -forward-dest /run/user/502/podman/podman.sock -forward-user core -forward-identity /Users/XXXXXXXXXXXX/.ssh/podman-machine-default --debug
Error: dial unix /var/folders/5k/865_x9vd2_3f3k7bw36k87vc0000gp/T/podman/qmp_podman-machine-default.sock: connect: no such file or directory

spencerrung commented 1 year ago

Agreed, this should be fixed, but here is an easy workaround for M1 Mac users:

podman machine stop
podman machine rm
podman machine init
podman machine start
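
For CI, where nobody is around to rerun these commands by hand, the same workaround could be wrapped in a retry. A sketch only: start_machine_with_retry is a hypothetical helper, and recreating the machine throws away anything stored in it. --log-level debug is the flag used in the comment above, and --force skips the confirmation prompt of podman machine rm.

```shell
# Try to start the default machine; on failure, recreate it and try again.
# $1 is the maximum number of start attempts (default 3).
start_machine_with_retry() {
  attempts=${1:-3}
  i=1
  while :; do
    podman machine start --log-level debug && return 0
    # Give up once the last allowed attempt has failed.
    [ "$i" -ge "$attempts" ] && return 1
    echo "start attempt $i failed; recreating the machine" >&2
    podman machine stop || true
    podman machine rm --force || true
    podman machine init || return 1
    i=$((i + 1))
  done
}
```

A pipeline step would then call start_machine_with_retry 3 and keep the debug output as a build artifact if it still fails.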

vrothberg commented 1 year ago

That has been fixed in the meantime, closing.