containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

--health-start-period option does not seem to do anything, or is underdocumented #19272

Closed by martinpitt 10 months ago

martinpitt commented 1 year ago

Issue Description

This was originally part of #19237, but split out by @mheon's request.

When running a container with a successful health check:

podman run --rm --health-cmd=true --health-start-period=5s --health-startup-timeout=1 --health-retries=1  quay.io/prometheus/busybox sleep infinity

and monitoring it with podman events and watch -n1 podman ps -a, it seems to run the health check immediately:

2023-07-18 08:05:40.819456922 +0000 UTC container start 508ea01a5b8b988160c5f9ca33f29c81a5d8f9cf2cbf86a57b99868edd309ff2 (image=quay.io/prometheus/busybox:latest, name=focused_albattani)
2023-07-18 08:05:40.819810711 +0000 UTC container attach 508ea01a5b8b988160c5f9ca33f29c81a5d8f9cf2cbf86a57b99868edd309ff2 (image=quay.io/prometheus/busybox:latest, name=focused_albattani)
2023-07-18 08:05:40.95639024 +0000 UTC container exec_died 508ea01a5b8b988160c5f9ca33f29c81a5d8f9cf2cbf86a57b99868edd309ff2 (image=quay.io/prometheus/busybox:latest, name=focused_albattani)
2023-07-18 08:05:40.966479589 +0000 UTC container health_status 508ea01a5b8b988160c5f9ca33f29c81a5d8f9cf2cbf86a57b99868edd309ff2 (image=quay.io/prometheus/busybox:latest, name=focused_albattani, health_status=healthy)

(note the time stamps), and podman ps also shows "Up 1 seconds (healthy)".

With a failing health check (--health-cmd=false) it looks a little different. It also immediately runs the health check command (exec_died events), but claims "starting" after the first run:

2023-07-18 08:07:07.838615245 +0000 UTC container start 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley)
2023-07-18 08:07:07.838828519 +0000 UTC container attach 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley)
2023-07-18 08:07:07.765782475 +0000 UTC image pull d5b913ef2ba287545f5972e823a9703425ae7675781362d2906d705ad63eff68 quay.io/prometheus/busybox
2023-07-18 08:07:07.932257195 +0000 UTC container exec_died 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley)
2023-07-18 08:07:07.938765361 +0000 UTC container health_status 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley, health_status=starting)

and only 30s later it moves to "unhealthy":

2023-07-18 08:07:38.374269989 +0000 UTC container exec_died 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley)
2023-07-18 08:07:38.387015929 +0000 UTC container health_status 11ac0127f2dc3e499b201c48c3f314631bfe9f1bead9273dad1c6c115639ae90 (image=quay.io/prometheus/busybox:latest, name=laughing_easley, health_status=unhealthy)

It looks exactly the same without --health-start-period=5s, so it seems that option has no observable effect.

Steps to reproduce the issue

see description
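In addition to the podman run command above, the health-check activity can be watched in two terminals (same tools as mentioned in the description; nothing beyond stock podman and watch is assumed):

podman events
watch -n1 podman ps -a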

Describe the results you received

Health check command runs immediately after container starts

Describe the results you expected

Health check command runs after the given time span with --health-start-period.

Also, this option seems to be unfortunately named. "Period" suggests something repeating, but it should only have an effect once, after container startup. The period is --health-startup-interval.

The manpage says

The initialization time needed for a container to bootstrap. The value can be expressed in time format like 2m3s. The default value is 0s.

It does not directly say that it will delay the first health check, but what else would it do?

podman info output

host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 93.55
    systemPercent: 2.93
    userPercent: 3.53
  cpus: 1
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: cloud
    version: "38"
  eventLogger: journald
  hostname: fedora-38-127-0-0-2-2201
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.3.12-200.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 326119424
  memTotal: 1123942400
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.5-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 1078194176
  swapTotal: 1123020800
  uptime: 1h 46m 33.00s (Approximately 0.04 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/admin/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/admin/.local/share/containers/storage
  graphRootAllocated: 12798898176
  graphRootUsed: 2206924800
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /tmp/containers-user-1000/containers
  transientStore: false
  volumePath: /home/admin/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.1
  Built: 1685123928
  BuiltTime: Fri May 26 17:58:48 2023
  GitCommit: ""
  GoVersion: go1.20.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Happens on bare metal and in VM, it also happens as root and user.

Additional information

No response

martinpitt commented 1 year ago

bump

Luap99 commented 12 months ago

I think @mheon looked at this recently? Did you come to any conclusion?

The fact that we trigger the healthcheck immediately seems wrong to me; I would expect it to wait for at least the interval, and I don't know how we can fix that. Now, if I read the Docker docs about --health-start-period, they seem to suggest that the healthcheck is not actually delayed, but rather that all failures before this time are ignored.
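For comparison, the equivalent Docker invocation would be something like the following (a sketch for side-by-side testing, not output from this thread; image and sleep duration are arbitrary):

docker run -d --rm --health-cmd=false --health-start-period=5s --health-interval=2s --health-retries=1 alpine sleep 1000
watch -n1 docker ps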

mheon commented 12 months ago

I confirmed that our handling of start period is correct (we do ignore failures within the accepted number of seconds), but I am still not 100% sure that we're matching Docker here - I think they may be displaying that the HC is starting, even if it's successful? Need to double-check the code over there.

umohnani8 commented 10 months ago

Hi @martinpitt, I looked into this, and the reason you are only seeing an unhealthy status update after 30 seconds is that if the user doesn't set --health-interval, we default it to 30 seconds. So what is happening in your case is that the healthcheck command runs as soon as the container comes up and sees that it is unhealthy, but the container is still within the start-period, so the status is not updated, because the start-period is the grace period for the container to fully bootstrap. And since you haven't specified --health-interval, the next time the healthcheck command runs is after 30 seconds, because that is the default value.

I would recommend setting --health-interval to something like 2 seconds so that the healthcheck command is fired every 2 seconds for your use case. I tested it out and it works as expected:

podman run -d --rm --health-cmd=false --health-start-period=5s --health-retries=1 --health-interval=2s alpine sleep 1000

➜  podman git:(healthcheck) ✗ podman ps
CONTAINER ID  IMAGE                            COMMAND     CREATED        STATUS                   PORTS       NAMES
bbcfad6b4fc4  docker.io/library/alpine:latest  sleep 1000  4 seconds ago  Up 4 seconds (starting)              relaxed_boyd
➜  podman git:(healthcheck) ✗ podman ps
CONTAINER ID  IMAGE                            COMMAND     CREATED        STATUS                   PORTS       NAMES
bbcfad6b4fc4  docker.io/library/alpine:latest  sleep 1000  5 seconds ago  Up 5 seconds (starting)              relaxed_boyd
➜  podman git:(healthcheck) ✗ podman ps
CONTAINER ID  IMAGE                            COMMAND     CREATED        STATUS                    PORTS       NAMES
bbcfad6b4fc4  docker.io/library/alpine:latest  sleep 1000  6 seconds ago  Up 6 seconds (unhealthy)              relaxed_boyd

If the healthcheck command returns healthy, then we update the health status to healthy immediately, ignoring the start-period, so that service managers don't have to wait out the full grace period given to a container to bootstrap. That is why you are seeing a healthy state almost immediately for a health check that succeeds. This matches Docker's behavior as well.
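For example, with a succeeding health command the status flips to healthy right away, well within the start period (a quick sketch; the container name hc-ok is arbitrary):

podman run -d --rm --name hc-ok --health-cmd=true --health-start-period=5s --health-interval=2s alpine sleep 1000
podman ps --filter name=hc-ok
# STATUS shows "Up ... (healthy)" almost immediately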

This probably needs to be clarified in the docs; I will open a PR to update them.

martinpitt commented 10 months ago

@umohnani8 I'm aware of the 30s default of --health-interval. That seems to work fine (see the part that covers an unhealthy run). The bug report is about --health-start-period: it doesn't have an observable effect if the health-cmd succeeds; both podman ps and podman events immediately claim "healthy" -- it should either be "starting", or preferably the health-check command should not run at all until the start-period has passed. For a failing health check, it still runs the command immediately (instead of postponing it), but at least it results in "starting" instead of an immediate "unhealthy".

So it's inconsistent, it's not clear what it actually does, and IMHO it's also badly named (of course this depends on what it's actually supposed to do, but I can't see how it is a period).

Luap99 commented 10 months ago

@martinpitt As mentioned above, within that period all healthcheck failures are ignored, so it does seem to work, and this is what Docker does as well. It does not replace the interval at which healthchecks are executed. Given that your interval is still 30 seconds, the second healthcheck is executed after 30s, and only then can it turn into the failed state.

We should expand the docs on that option but I think the code works as intended.
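In other words, to get a bootstrap grace period plus quick failure detection afterwards, set both options explicitly (a sketch based on the behavior described above; the values are arbitrary):

podman run -d --rm --health-cmd=false --health-start-period=10s --health-interval=2s --health-retries=1 alpine sleep 1000
# failing checks during the first ~10s are ignored and podman ps keeps reporting "starting";
# the first failing check after the start period turns the status to "unhealthy"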