containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

`podman machine list` unable to list the available machines on windows #18466

Closed anjannath closed 1 year ago

anjannath commented 1 year ago

Issue Description

podman machine list doesn't show the available podman machines; instead, it exits with the following error:

Error: listing vms: could not enable linger for remote user on guest OS: exit status 0xffffffff
PS C:\Users\anath> podman machine list --log-level debug
time="2023-05-04T20:50:34+05:30" level=info msg="C:\\Program Files\\RedHat\\Podman\\podman.exe filtering at log level debug"
time="2023-05-04T20:50:34+05:30" level=debug msg="Running command: wsl [-u root -d podman-my-machine sh -c mkdir -p /var/lib/systemd/linger; touch /var/lib/systemd/linger/user]"
The system cannot find the path specified.
Error: listing vms: could not enable linger for remote user on guest OS: exit status 0xffffffff
time="2023-05-04T20:50:34+05:30" level=debug msg="Shutting down engines"

Steps to reproduce the issue

  1. Install podman using the installer for windows
  2. podman machine init then podman machine start
  3. podman machine list

Describe the results you received

Got an error saying:

Error: listing vms: could not enable linger for remote user on guest OS: exit status 0xffffffff

Describe the results you expected

Expected to get a list of the available podman machines.

podman info output

PS C:\Users\anath> podman info
host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.7-2.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 98.03
    systemPercent: 1.23
    userPercent: 0.74
  cpus: 8
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: container
    version: "37"
  eventLogger: journald
  hostname: DESKTOP-J8H86J0
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 5.10.102.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 26544537600
  memTotal: 26808979456
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.4-1.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.4
      commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 7516192768
  swapTotal: 7516192768
  uptime: 0h 1m 35.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 269490393088
  graphRootUsed: 700530688
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.0
  Built: 1681486976
  BuiltTime: Fri Apr 14 21:12:56 2023
  GitCommit: ""
  GoVersion: go1.19.7
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.0


Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Luap99 commented 1 year ago

@n1hility PTAL

gbraad commented 1 year ago

@anjannath can you describe more about the lead-up? Was this a clean installation, or did a podman WSL distro already exist?

Same for me; every command shows an error for Podman 4.5.0:

Error: could not enable linger for remote user on guest OS: exit status 0xffffffff

Solution

I had to forcibly run:

PS> wsl --unregister podman-machine-default

to continue, as

PS> wsl --list
Windows Subsystem for Linux Distributions:
fedorawsl (Default)
podman-machine-default

showed the presence of an unusable WSL environment. This is why the error was returned: the WSL distro was not able to start, or was otherwise in an unusable state.
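To make the check above repeatable, here is a minimal sketch that parses `wsl --list` output and picks out podman-related distros. It is not part of podman. The function names are hypothetical, the `podman-` prefix is an assumption based on the two distro names seen in this thread (`podman-my-machine` and `podman-machine-default`), and the UTF-16LE handling assumes the encoding `wsl.exe` commonly uses for its console output.

```python
def decode_wsl_output(raw: bytes) -> str:
    """Decode bytes captured from wsl.exe (hypothetical helper).

    Assumption: wsl.exe usually writes UTF-16LE, which shows up as NUL
    bytes between ASCII characters. Otherwise treat the bytes as UTF-8.
    """
    if b"\x00" in raw:
        return raw.decode("utf-16-le", errors="replace").lstrip("\ufeff")
    return raw.decode("utf-8", errors="replace")


def parse_wsl_list(text: str) -> list[str]:
    """Return the distro names from decoded `wsl --list` output."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the header line, which ends with a colon.
        if not line or line.endswith(":"):
            continue
        # The default distro carries a trailing "(Default)" marker.
        if line.endswith("(Default)"):
            line = line[: -len("(Default)")].strip()
        names.append(line)
    return names


def podman_distros(names: list[str], prefix: str = "podman-") -> list[str]:
    """Filter distro names by prefix (the prefix is an assumption)."""
    return [name for name in names if name.startswith(prefix)]
```

On Windows, the raw bytes could come from something like `subprocess.run(["wsl", "--list"], capture_output=True).stdout`. A podman distro that is listed here but cannot start is the situation described above.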

After this I can run:

PS> podman machine init
There is no distribution with the supplied name.
Error code: Wsl/Service/WSL_E_DISTRO_NOT_FOUND
Downloading VM image: fedora-podman-amd64-v37.0.16.tar.xz: done
Extracting compressed file
Importing operating system into WSL (this may take a few minutes on a new WSL install)...
Import in progress, this may take a few minutes.
The operation completed successfully.
Configuring system...
Keys already exist, reusing
Error: cannot overwrite connection

with some errors, since this action did not remove everything.

PS> podman machine list
NAME                    VM TYPE     CREATED             LAST UP             CPUS        MEMORY      DISK SIZE
podman-machine-default  wsl         About a minute ago  About a minute ago  0           0B          717.2MB

Suggestion

I would suggest something like a podman machine clean (or rm -f) to remove ANY reference to a WSL environment, the keys, etc., as the user might otherwise end up with a non-functional environment.

Motivation

PS> podman ps
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.5.0/libpod/_ping": dial unix ///run/podman/podman.sock: connect: An invalid argument was supplied.

The suggested init (already performed) and start do not give the expected result:

PS> podman machine start
Starting machine "podman-machine-default"

This machine is currently configured in rootless mode. If your containers
require root permissions (e.g. ports < 1024), or if you run into compatibility
issues with non-podman clients, you can switch using the following command:

        podman machine set --rootful

API forwarding listening on: npipe:////./pipe/docker_engine

Docker API clients default to this address. You do not need to set DOCKER_HOST.
Machine "podman-machine-default" started successfully

PS> podman ps
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.5.0/libpod/_ping": dial unix ///run/podman/podman.sock: connect: An invalid argument was supplied.

Actions needed

To perform a clean, something similar to a stop and rm is necessary:

PS> podman machine stop
Machine "podman-machine-default" stopped successfully

PS> podman machine rm

The following files will be deleted:

C:\Users\gbraad\.ssh\podman-machine-default
C:\Users\gbraad\.ssh\podman-machine-default.pub
C:\Users\gbraad\.local\share\containers\podman\machine\wsl\podman-machine-default_fedora-podman-amd64-v37.0.16.tar
C:\Users\gbraad\.config\containers\podman\machine\wsl\podman-machine-default.json
C:\Users\gbraad\.local\share\containers\podman\machine\wsl\wsldist\podman-machine-default

and the previously mentioned wsl --unregister. Without a command that forces the wsl --unregister, the user needs to know how wsl operates. The removal of the keys/machine definition should be forced, to prevent a (re)init from ending up in an unusable state for the podman command.

Result

PS> podman machine init
Extracting compressed file
Importing operating system into WSL (this may take a few minutes on a new WSL install)...
Import in progress, this may take a few minutes.
The operation completed successfully.
Configuring system...
Generating public/private ed25519 key pair.
Your identification has been saved in podman-machine-default
Your public key has been saved in podman-machine-default.pub
The key fingerprint is:
SHA256:qAtpR6uPYqVLnEp3yOF2IGg4xaCZyelQjxv6lY+wp2o root@WinT14
The key's randomart image is:
+--[ED25519 256]--+
|. .              |
|oB.o             |
|*o= .            |
|=o o . .         |
|=+oo+ . S        |
|oo=B+=           |
| =O**o.          |
|=EoBo.           |
|=+=.o            |
+----[SHA256]-----+
Machine init complete
To start your machine run:

        podman machine start

PS> podman machine start
Starting machine "podman-machine-default"

This machine is currently configured in rootless mode. If your containers
require root permissions (e.g. ports < 1024), or if you run into compatibility
issues with non-podman clients, you can switch using the following command:

        podman machine set --rootful

API forwarding listening on: npipe:////./pipe/docker_engine

Docker API clients default to this address. You do not need to set DOCKER_HOST.
Machine "podman-machine-default" started successfully

PS> podman ps
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES

vrothberg commented 1 year ago

@n1hility WDYT?

n1hility commented 1 year ago

My leading theory as to what caused this is that the vhdx file backing the WSL distro was manually deleted (e.g. by deleting files under .local\share\containers\podman\machine\wsl\wsldist).

The code being triggered is a migration path for an old version file, and it's trying to fix up the WSL distro with a command that should not fail. The error returned, "The system cannot find the path specified.", would happen if it can't find the ext4 vhdx file to mount.

The fact that an old version file is involved makes me think this was an old machine instance that was partially cleaned up.

The only way around this situation is to podman machine rm as @gbraad shows. I am going to close this assuming this is the case, but feel free to reopen if this isn't and we can perform additional diagnostics.