containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman machine on macOS (M1) causes corrupt ~/.config/containers/auth.json #19215

Closed: BlaineEXE closed this issue 5 months ago

BlaineEXE commented 1 year ago

### Issue Description

When running podman machine on macOS (M1), I regularly get a corrupt ~/.config/containers/auth.json file.

I notice that there are several directories mounted into the machine VM by default, including /Users. If I remove the mount for /Users, the corruption no longer occurs. However, when doing so I can then no longer mount content from /Users/... into my containers.
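
For context, recreating the machine without the /Users mount looks roughly like the sketch below (the remaining mount list is illustrative, and it assumes podman machine init's --volume flag overrides the default mounts):

```sh
# Recreate the machine with an explicit mount list that omits /Users.
# NOTE: the volume paths below are illustrative; adjust to what you actually need.
podman machine stop
podman machine rm -f podman-machine-default
podman machine init -v /private:/private -v /var/folders:/var/folders
podman machine start
```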

I suspect that the podman process running in the VM may be modifying the auth file at the same time as the podman process on my local machine.

### Steps to reproduce the issue

  1. Create a podman machine on macOS with one or more logged-in registries
  2. Stop/start the machine a number of times until corruption occurs
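
A rough reproduction loop, as a sketch (it assumes jq is available to check JSON validity; any JSON parser would do):

```sh
#!/usr/bin/env bash
# Repeatedly stop/start the machine and check whether the client-side
# auth file still parses as JSON.
auth="$HOME/.config/containers/auth.json"
for i in $(seq 1 50); do
    podman machine stop
    podman machine start
    if ! jq empty "$auth" >/dev/null 2>&1; then
        echo "auth.json appears corrupted after iteration $i"
        break
    fi
done
```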

### Describe the results you received

I get corruption of the ~/.config/containers/auth.json file. The corruption appears to be misplaced copies of login info from other parts of the file. For example, the last 5 lines of the file may be duplicated.
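
If the duplicated lines leave the file as invalid JSON, a quick check looks something like this (a sketch using python3's json.tool):

```sh
# Validate the auth file and, on failure, show its tail, which is where
# the duplicated lines tend to appear.
python3 -m json.tool ~/.config/containers/auth.json >/dev/null \
  || tail -n 10 ~/.config/containers/auth.json
```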

### Describe the results you expected

I expect that the auth file should not be corrupted and should keep a stable record of my logged-in registries.

### podman info output

```
❯ podman info
host:
  arch: arm64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 97.44
    systemPercent: 1.52
    userPercent: 1.04
  cpus: 1
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: coreos
    version: "38"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.3.11-200.fc38.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 1199640576
  memTotal: 2048483328
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.5-1.fc38.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.aarch64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 18m 57.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 2561400832
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.1
  Built: 1685123899
  BuiltTime: Fri May 26 11:58:19 2023
  GitCommit: ""
  GoVersion: go1.20.4
  Os: linux
  OsArch: linux/arm64
  Version: 4.5.1
```


### Podman in a container

No

### Privileged Or Rootless

Rootless

### Upstream Latest Release

Yes

### Additional environment details

Podman machine on macOS (M1)

### Additional information

I have been experiencing this issue for months and work around it by keeping a backup of the file that I revert to every time corruption occurs, which is very regular during my normal development.
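
The workaround looks roughly like this (a sketch; the backup filename is arbitrary):

```sh
# Keep a known-good copy after a successful login...
cp ~/.config/containers/auth.json ~/.config/containers/auth.json.good
# ...and restore it whenever the live file gets corrupted.
cp ~/.config/containers/auth.json.good ~/.config/containers/auth.json
```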

Luap99 commented 1 year ago

@vrothberg PTAL

vrothberg commented 1 year ago

Thanks for filing the issue, @BlaineEXE!

I think mounting ~/.config/containers from the client into the machine is a general problem not limited to auth files. I vote for completely masking this path.

@ashley-cui @baude WDYT?

Luap99 commented 1 year ago

I don't think that is the problem. Keep in mind that /Users should not be accessed by the VM user at all; ~/.config in the VM should resolve to /var/home/core/.config, as this is the actual $HOME for the core user AFAICT. So there should not be a conflict. If this really is a write conflict between the VM and the host, then I find it more likely that the remote client leaked the path to the server somehow.
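
(As a quick sanity check, something like the sketch below should show which paths each side actually uses.)

```sh
# Inside the VM: what HOME does the core user have, and is there an auth file under it?
podman machine ssh 'echo $HOME; ls -l ~/.config/containers/ 2>/dev/null'
# On the macOS host: the client-side auth file and its modification time.
stat -f '%Sm %N' ~/.config/containers/auth.json
```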

vrothberg commented 1 year ago

> If this really is a write conflict between the VM and the host, then I find it more likely that the remote client leaked the path to the server somehow.

That is pretty much what I meant. How could the path be leaked other than by mounting into the VM?

ashley-cui commented 1 year ago

I'd be okay with masking the path, but Paul is correct in that the Podman inside the machine should be using /var/home/core/.config.

BlaineEXE commented 1 year ago

I haven't seen this issue reported by anyone else, which I find strange. It's possible that my machine just hits the perfect timing to expose a race condition. After the latest conversation here, I looked through the podman configs I'm aware of to check whether I had set the config location manually, in case that corner case is why I'm the only one experiencing this, but I don't see anything there. Should I look into any specific files?

vrothberg commented 1 year ago

Thanks, @BlaineEXE ! I think we need to investigate a bit further.

vrothberg commented 1 year ago

@ashley-cui do you have cycles to look into it?

ashley-cui commented 1 year ago

I'll try to take a look today or tomorrow and will update on whether or not I can reproduce.

ashley-cui commented 1 year ago

Finally got around to it. I ran the machine for a couple of days and also started/stopped the machine in a loop for an hour, but unfortunately wasn't able to reproduce.

BlaineEXE commented 1 year ago

@ashley-cui I have noticed that the CLI podman machine init seems to create the volumes by default, but Podman Desktop does not. That could affect your test if you don't use the CLI.

Also, if this is a race condition, it's possible something about my system is more prone to the issue than others. I also have the machine use 2 CPUs and 3072 MB of memory, but that seems unlikely to be relevant.
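
For completeness, those machine settings correspond roughly to the following init flags (a sketch; the CLI init should also keep the default mounts):

```sh
# Recreate a machine with the same resources as mine (2 CPUs, 3072 MB).
podman machine init --cpus 2 --memory 3072
podman machine start
```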

ashley-cui commented 1 year ago

I tried reproducing with the CLI; unfortunately, nothing came up.

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

Since we were not able to reproduce and no additional info came, I am closing. Reopen if we can get a reproducer.

vrothberg commented 1 year ago

I am reopening as a corrupted credential file is serious. At the very least, we should do a code audit in case we're unable to reproduce. For reproducing, please make sure that ~/.config/containers/auth.json is populated (e.g., from a podman login) and has multiple entries.
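
For example, something along these lines (a sketch; the registry names are illustrative):

```sh
# Populate the client-side auth file with multiple registry entries before testing.
podman login docker.io
podman login quay.io
podman login registry.example.com
```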

Luap99 commented 5 months ago

Given we have not heard anything back, I assume this works. Also, podman 5.0 uses applehv with virtiofs instead, so maybe that fixed the issue with 9p or whatever caused it.