containers / image

Work with containers' images

Multiple simultaneous podman login commands can lose entries #1365

Open LHCGreg opened 3 years ago

LHCGreg commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

If multiple podman login commands are executed simultaneously, you can get the error

* reading JSON file "/run/containers/0/auth.json": unmarshaling JSON at "/run/containers/0/auth.json": unexpected end of JSON input

Steps to reproduce the issue:

  1. Run a bunch of podman login commands at once. Sorry, I don't have a nice script to reproduce this; it's just something I'm seeing in our system.

Describe the results you received:

Out of three login commands, two succeeded and one failed with the error

* reading JSON file "/run/containers/0/auth.json": unmarshaling JSON at "/run/containers/0/auth.json": unexpected end of JSON input

Describe the results you expected: I expected multiple simultaneous logins to work, with one of them ultimately "winning" and having its login info stored in the auth file.

Additional information you deem important (e.g. issue happens only occasionally): This isn't a big deal for me. I can change our code so it only issues one login command. I just figured I'd file an issue.

Output of podman version:

Version:      3.3.0
API Version:  3.3.0
Go Version:   go1.16.7
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.22.3
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 2
  distribution:
    distribution: debian
    version: "10"
  eventLogger: file
  hostname: d23290797c41
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.25-linuxkit
  linkmode: dynamic
  memFree: 1877671936
  memTotal: 3137761280
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1.5-925d-dirty
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/local/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.1.11
      commit: 368e69ccc074628d17a9bb9a35b8f4b9f74db4c6
      libslirp: 4.6.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.3.3
  swapFree: 370733056
  swapTotal: 536866816
  uptime: 9h 54m 55.35s (Approximately 0.38 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'
      Version: |-
        fusermount3 version: 3.4.1
        fuse-overlayfs: version 1.5
        FUSE library version 3.4.1
        using FUSE kernel interface version 7.27
    overlay.mountopt: nodev
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: overlayfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 0
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.3.0
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.16.7
  OsArch: linux/amd64
  Version: 3.3.0

Package info (e.g. output of rpm -q podman or apt list podman):

root@d23290797c41:/opt/krispr# apt list podman
Listing... Done
podman/now 100:3.3.0-1 amd64 [installed,local]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.): Running podman inside another container in kubernetes.

rhatdan commented 3 years ago

@mtrmac looks like we need to lock the auth.json file while we are updating it.

mtrmac commented 3 years ago

Yeah, ioutil.WriteFile is not atomic (it’s O_TRUNC of the existing file + write). At least being atomic so that clients don’t see empty/partial files definitely makes sense; truly locking so that we don’t lose data on concurrent writes might be harder.

mtrmac commented 2 years ago

As of #1515 concurrent logins should not corrupt the file, but it’s possible that some of the concurrent updates will be lost.

So, if there are multiple podman login attempts with the same credentials for the same scope, it doesn’t matter which one wins and the credentials will be available; but with concurrent logins for different scopes, some of the login commands might report success even though their credentials are not, ultimately, present in auth.json.

See #1506 for implementation discussion about locking that could fix the multiple-scope issue.