containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman 5.3.0 win-sshproxy.tid: The system cannot find the file specified. (install via scoop) #24557

Open johnnykang opened 1 week ago

johnnykang commented 1 week ago

Issue Description

After upgrading to Podman 5.3.0 (via scoop), starting the podman machine prints this message:

API forwarding for Docker API clients is not available due to the following startup failures. The system cannot find the path specified.

It turned out that c:\.local\share\containers\podman\machine\wsl\podman-machine-default\win-sshproxy.tid is not created.

After downgrading to Podman 5.2.5, the message is gone.

Podman machine runs in WSL2.

Steps to reproduce the issue

As above in the description

Describe the results you received

As above in the description

Describe the results you expected

podman machine starts and API forwarding for Docker API clients on Windows machine works as expected without error message.

podman info output

Podman 5.3.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Additional environment details

Additional information

Additional information like issue happens only occasionally or issue happens with a particular architecture or on a particular setting

glektarssza commented 1 week ago

Calling podman machine stop will also produce the following warning/error in this situation:

Could not stop API forwarding service (win-sshproxy.exe): open C:\Users\<username>\.local\share\containers\podman\machine\wsl\podman-machine-default\win-sshproxy.tid: The system cannot find the file specified.

baude commented 1 week ago

It helps when you provide podman info like the template recommends. In both cases, is it safe to assume WSL is being used?

baude commented 1 week ago

@l0rd thoughts ?

glektarssza commented 1 week ago

In my case, yes. My podman machine instance is running inside of WSL.

glektarssza commented 1 week ago

podman version output:

Client:       Podman Engine
Version:      5.3.0
API Version:  5.3.0
Go Version:   go1.23.3
Git Commit:   874bf2c301ecf0ba645f1bb45f81966cc755b7da
Built:        Wed Nov 13 06:19:59 2024
OS/Arch:      windows/amd64

Server:       Podman Engine
Version:      5.2.5
API Version:  5.2.5
Go Version:   go1.22.7
Built:        Thu Oct 24 18:00:00 2024
OS/Arch:      linux/amd64

podman info output:

host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.12-2.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 99.64
    systemPercent: 0.25
    userPercent: 0.11
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: container
    version: "40"
  eventLogger: journald
  freeLocks: 2048
  hostname: GlekPC
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 5.15.167.4-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 16078487552
  memTotal: 16776019968
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-2.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240906.g6b38f07-1.fc40.x86_64
    version: |
      pasta 0^20240906.g6b38f07-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4294967296
  swapTotal: 4294967296
  uptime: 0h 5m 17.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 1081101176832
  graphRootUsed: 914518016
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.5
  Built: 1729814400
  BuiltTime: Thu Oct 24 18:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.7
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.5

podman machine info output:

host:
    arch: amd64
    currentmachine: podman-machine-default
    defaultmachine: podman-machine-default
    eventsdir: C:\Users\roblo\.local\share\containers\podman\podman
    machineconfigdir: C:\Users\roblo\.config\containers\podman\machine\wsl
    machineimagedir: C:\Users\roblo\.local\share\containers\podman\machine\wsl
    machinestate: Running
    numberofmachines: 1
    os: windows
    vmtype: wsl
version:
    apiversion: 5.3.0
    version: 5.3.0
    goversion: go1.23.3
    gitcommit: 874bf2c301ecf0ba645f1bb45f81966cc755b7da
    builttime: Wed Nov 13 06:19:59 2024
    built: 1731503999
    osarch: windows/amd64
    os: windows

podman machine inspect output:

[
     {
          "ConfigDir": {
               "Path": "C:\\Users\\roblo\\.config\\containers\\podman\\machine\\wsl"
          },
          "ConnectionInfo": {
               "PodmanSocket": null,
               "PodmanPipe": {
                    "Path": "\\\\.\\pipe\\podman-machine-default"
               }
          },
          "Created": "2024-11-13T14:54:20.3775309-07:00",
          "LastUp": "2024-11-13T14:55:18.3645367-07:00",
          "Name": "podman-machine-default",
          "Resources": {
               "CPUs": 16,
               "DiskSize": 100,
               "Memory": 2048,
               "USBs": []
          },
          "SSHConfig": {
               "IdentityPath": "C:\\Users\\roblo\\.local\\share\\containers\\podman\\machine\\machine",
               "Port": 50275,
               "RemoteUsername": "user"
          },
          "State": "running",
          "UserModeNetworking": false,
          "Rootful": false,
          "Rosetta": false
     }
]

Edit: Apologies, forgot I had downgraded back to 5.2.5 when I dumped all this information. It is now updated with the info dump from 5.3.0 when I was having this issue.

johnnykang commented 1 week ago

> It helps when you provide podman info like the template recommends. In both cases, is it safe to assume WSL is being used?

Yup. Podman machine runs in the WSL2.

glektarssza commented 1 week ago

Also might be worth mentioning that my podman setup was installed via scoop. Shouldn't impact anything (I think) but you never know.

johnnykang commented 1 week ago

> Also might be worth mentioning that my podman setup was installed via scoop. Shouldn't impact anything (I think) but you never know.

Great point. And that's my installation method (via scoop) as well.

l0rd commented 1 week ago

We were able to reproduce this issue with @jeffmaury. The problem was related to the %USERPROFILE%/.ssh/config. As a workaround we renamed the config file.

Can you please provide the output of the command podman info --log-level debug.

glektarssza commented 1 week ago

@l0rd Here you go:

time="2024-11-13T19:30:32-07:00" level=info msg="C:\\Users\\roblo\\scoop\\apps\\podman\\current\\podman.exe filtering at log level debug"
time="2024-11-13T19:30:32-07:00" level=debug msg="Called info.PersistentPreRunE(C:\\Users\\roblo\\scoop\\apps\\podman\\current\\podman.exe info --log-level debug)"
time="2024-11-13T19:30:32-07:00" level=debug msg="SSH Ident Key \"C:\\\\Users\\\\roblo\\\\.local\\\\share\\\\containers\\\\podman\\\\machine\\\\machine\" SHA256:DW4NTzZN7HgVBWBnF8rA6cmGd9LchILrC9GP0vnMU7Q ssh-ed25519"
time="2024-11-13T19:30:32-07:00" level=debug msg="DoRequest Method: GET URI: http://d/v5.3.0/libpod/_ping"
time="2024-11-13T19:30:32-07:00" level=debug msg="DoRequest Method: GET URI: http://d/v5.3.0/libpod/info"
host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.12-2.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 99.47
    systemPercent: 0.35
    userPercent: 0.18
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: container
    version: "40"
  eventLogger: journald
  freeLocks: 2048
  hostname: GlekPC
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 5.15.167.4-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 16092475392
  memTotal: 16776024064
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-2.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240906.g6b38f07-1.fc40.x86_64
    version: |
      pasta 0^20240906.g6b38f07-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4294967296
  swapTotal: 4294967296
  uptime: 0h 6m 30.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 1081101176832
  graphRootUsed: 961994752
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.5
  Built: 1729814400
  BuiltTime: Thu Oct 24 18:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.7
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.5

time="2024-11-13T19:30:32-07:00" level=debug msg="Called info.PersistentPostRunE(C:\\Users\\roblo\\scoop\\apps\\podman\\current\\podman.exe info --log-level debug)"
time="2024-11-13T19:30:32-07:00" level=debug msg="Shutting down engines"

l0rd commented 1 week ago

Thank you. No, we were not able to reproduce your error then. The problem we hit was a different one, related to the SSH config.

We will try to reproduce it using scoop.

johnnykang commented 1 week ago

It looks like if Podman is installed via winget, win-sshproxy works perfectly fine.

I suspect this issue is specific to podman installation via scoop.

glektarssza commented 1 week ago

I'll try installing via winget after work today to confirm this as well. If I cannot reproduce the issue in that environment I think it's a scoop-specific issue, yeah.

l0rd commented 1 week ago

Thank you @johnnykang for checking. And you are right, that should be the problem.

The version of gvproxy has been updated in 5.3 but the scoop installer extracts podman.exe, not gvproxy from the installer.

@baude any idea who maintains the podman scoop bucket? From the git history it looks like @niheaven is the last one to have updated it.

l0rd commented 1 week ago

@glektarssza @johnnykang out of curiosity: why are you installing via scoop rather than winget?

glektarssza commented 1 week ago

I've been a long time user of scoop so mainly because it's an ecosystem I'm familiar with and, at the time I started using it, had the software I was using.

I'm not opposed to changing but it would be a bit of a pain to migrate everything over at this point depending on what is and is not available in the winget ecosystem.

baude commented 1 week ago

i do not know who maintains scoop.

glektarssza commented 1 week ago

I believe it's maintained by the community who uses it via a collection of GitHub repositories under the https://github.com/orgs/ScoopInstaller organization. At least that's what their website (https://scoop.sh) seems to indicate down at the bottom.

glektarssza commented 1 week ago

Actually, I stand corrected. Their website links specifically to https://github.com/orgs/ScoopInstaller/people as the maintainers.

jeffmaury commented 1 week ago

I can reproduce after having installed Podman through scoop

johnnykang commented 1 week ago

due to the fact that this issue only occurs when Podman is installed via scoop, i would like to close this issue as i don't believe the team is responsible for fixing it.

closing.

glektarssza commented 1 week ago

I opened ScoopInstaller/Main#6327. Hopefully they can figure this out on their end.

niheaven commented 1 week ago

Hi all, Scoop uses bundled version of gvproxy.exe and win-sshproxy, which version comes with podman v5.3.0?

And in my podman installation, podman machine start doesn't output any error, but yes, podman machine stop does, even after I replaced existing gvproxy and win-sshproxy in the podman folder with the latest ones (v0.8.0).

johnnykang commented 1 week ago

> And in my podman installation, podman machine start doesn't output any error

Mine output "The system cannot find the path specified." in the startup message. It is not an obvious message, but it warned me that the Docker API stopped working.

niheaven commented 1 week ago

Oh yes, same output, really not obvious :)

So which one should podman use? I'll try to install podman via winget and put all the files under the scoop installed folder and see what happens.

johnnykang commented 1 week ago

> Oh yes, same output, really not obvious :)
>
> So which one should podman use? I'll try to install podman via winget and put all the files under the scoop installed folder and see what happens.

Just do scoop uninstall podman and winget install podman.

All configurations should still be in effect and ready to use with all your existing containers.

Try that at your own risk. It worked that way on my machine ™️

niheaven commented 1 week ago

An error occurs when Podman is initiated from a Windows junction folder, as with Scoop's setup (where Podman starts from podman\current, a junction to podman\5.3.0). However, initiating Podman directly from the actual folder (podman\5.3.0) or from other locations like C:\, D:\, etc., works without issues.

Are there recent commits that change this behavior?

l0rd commented 1 week ago

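@niheaven, we came to the same conclusion after doing some tests with @jeffmaury today. The workarounds we found when the provider is Hyper-V:

  • Run Podman from <scoop-app-dir>\podman\5.3.0 as you mentioned above
  • Create a %APPDATA%\containers\containers.conf with the helper_binaries_dir that points to <scoop-app-dir>\podman\5.3.0:

[engine]
helper_binaries_dir=["<scoop-app-dir>\podman\5.3.0"]

  • Copy the *.exe files in %PROGRAMFILES%/RedHat/Podman (however, this requires admin privileges and may not be ideal)

When Podman starts from a junction it fails to find the win-sshproxy.exe but it doesn't look like anything changed between 5.2 and 5.3 🤷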

niheaven commented 1 week ago

> Copy the *.exe files in %PROGRAMFILES%/RedHat/Podman (however, this requires admin privileges and may not be ideal)

Is it necessary? I've noticed that the executables are identical between Winget and Scoop.

> Create a %APPDATA%\containers\containers.conf with the helper_binaries_dir that points to <scoop-app-dir>\podman\5.3.0:

I'll try this and it should be done during Scoop installation.

l0rd commented 1 week ago

> Is it necessary? I've noticed that the executables are identical between Winget and Scoop.

No, I don't think that's necessary. If you create the configuration file, you don't need to move the files there.

> I'll try this and it should be done during Scoop installation.

Right, I think that's the best solution at the moment. We should figure out the root cause of the problem and eventually provide a fix (in Podman), but in the meantime, I think it's better to create the configuration file.

niheaven commented 6 days ago

> We should figure out the root cause of the problem and eventually provide a fix (in Podman)

Okay, let's wait for the fix, but for now, a hotfix in Scoop's manifest works quite well.

niheaven commented 6 days ago

Another question. If my scoop installation is at D:\Scoop\apps\podman\5.3.0, what format should the config entry helper_binaries_dir=["<scoop-app-dir>\podman\5.3.0"] use?

Ping @l0rd

cpfeiffer commented 5 days ago

@johnnykang I think scoop is just the messenger here. There was a change in Go 1.23 to the behavior of filepath.EvalSymlinks() that may be related (Podman 5.2 still used Go 1.22).

specter119 commented 5 days ago

> @niheaven, we came to the same conclusion after doing some tests with @jeffmaury today. The workarounds we found:
>
>   • Run Podman from <scoop-app-dir>\podman\5.3.0 as you mentioned above
>   • Create a %APPDATA%\containers\containers.conf with the helper_binaries_dir that points to <scoop-app-dir>\podman\5.3.0:
>
> [engine]
> helper_binaries_dir=["<scoop-app-dir>\podman\5.3.0"]
>
>   • Copy the *.exe files in %PROGRAMFILES%/RedHat/Podman (however, this requires admin privileges and may not be ideal)
>
> When Podman starts from a junction it fails to find the win-sshproxy.exe but it doesn't look like anything changed between 5.2 and 5.3 🤷

I have a self-maintained scoop bucket, with podman

https://github.com/specter119/scoop-dsms/blob/main/bucket/podman.yml

and tried the following 2 solutions, neither of which works:

I have fixed the issue in the official repo (such as only extracting podman.exe) and tried the above approaches.

It seems that there are still some issues on the Podman side, or at least that the proper config differs from the previous one.

l0rd commented 2 days ago

Some more information.

Indeed, after a change introduced in Go v1.23, filepath.EvalSymlinks(path) returns an error if path is within a junction (as in the case of podman.exe installed via scoop). The fact that we are not checking if path is a symlink before calling EvalSymlinks is a bug in Podman.

The good news is that setting the env $env:GODEBUG="winsymlink=0" when running Podman is sufficient to make it work as before. So that looks like the simplest workaround @niheaven.

Otherwise, the workarounds I mentioned above, in particular configuring helper_binaries_dir, work if the provider is Hyper-V. They don't when it's WSL. WSL is the default provider, and switching from WSL to Hyper-V can be problematic (a WSL machine cannot be converted to a Hyper-V machine). Anyway, this is the %APPDATA%\containers\containers.conf I have tested with.

[engine]
helper_binaries_dir=["C:\\Users\\<username>\\scoop\\apps\\podman\\5.3.0\\"]

[machine]
provider="hyperv"

For WSL, a workaround is to change PATH so that C:\Users\<username>\scoop\apps\podman\5.3.0\ comes first:

$env:PATH="C:\Users\<username>\scoop\apps\podman\5.3.0\;$env:PATH"

but setting $env:GODEBUG="winsymlink=0" as mentioned above is probably simpler and works for both WSL and HyperV.

l0rd commented 2 days ago

I am re-opening this issue and assigning it to myself. I have a fix here (this works too) but there are still other uses of filepath.EvalSymlinks in the Podman codebase that should be reviewed and likely updated.

niheaven commented 2 days ago

Since I'm unsure whether other applications utilize winsymlink in Go, I implemented a temporary fix in the Scoop manifest by adding the versioned directory to the PATH, which seems to be working well.