hashicorp / nomad-driver-podman

A nomad task driver plugin for sandboxing workloads in podman containers
https://developer.hashicorp.com/nomad/plugins/drivers/podman
Mozilla Public License 2.0

Jobs are not correctly restarted on Podman 4.6.0 #277

Closed jonasdemoor closed 10 months ago

jonasdemoor commented 11 months ago

Description

We're running Nomad 1.5.3 with Podman 4.6.0, compiled from source, on Debian 11 (bullseye). I've attached more details about our environment below.

This is what the state of Nomad and Podman looks like after starting the job:

$ nomad job status
ID            Type    Priority  Status   Submit Date
csi-nfs-node  system  50        running  2023-08-04T12:28:24+02:00

$ podman ps
CONTAINER ID  IMAGE                                         COMMAND               CREATED        STATUS        PORTS       NAMES
452b8b3dc9d0  registry.k8s.io/sig-storage/nfsplugin:v4.1.0  -v=5 --nodeid=nom...  6 seconds ago  Up 6 seconds              plugin-928c60c6-2c62-5e13-f10d-054ac74f16f4

$ journalctl -u podman
Aug 04 12:28:54 <redacted> podman[70819]: @ - - [04/Aug/2023:12:28:54 +0200] "GET /v1.0.0/libpod/containers/452b8b3dc9d04cfb4a41ee11f3590f46c75ce00d60fef9826340e6e577637ca3/stats?stream=false HTTP/1.1" 200 1112 "" "Go-http-client/1.1"
Aug 04 12:28:55 <redacted> podman[70819]: @ - - [04/Aug/2023:12:28:55 +0200] "GET /v1.0.0/libpod/containers/452b8b3dc9d04cfb4a41ee11f3590f46c75ce00d60fef9826340e6e577637ca3/stats?stream=false HTTP/1.1" 200 1112 "" "Go-http-client/1.1"

After stopping the job, Podman correctly stops the container, but Nomad doesn't GC/remove it. Even though the job is dead according to Nomad, the driver keeps trying to get stats for that container.

$ nomad job stop csi-nfs-node
==> 2023-08-04T12:30:15+02:00: Monitoring evaluation "21d85be1"
    2023-08-04T12:30:15+02:00: Evaluation triggered by job "csi-nfs-node"
    2023-08-04T12:30:16+02:00: Evaluation status changed: "pending" -> "complete"
==> 2023-08-04T12:30:16+02:00: Evaluation "21d85be1" finished with status "complete"
$ nomad job status           
ID            Type    Priority  Status          Submit Date
csi-nfs-node  system  50        dead (stopped)  2023-08-04T12:28:24+02:00

$ podman ps -a
CONTAINER ID  IMAGE                                         COMMAND               CREATED        STATUS                     PORTS       NAMES
452b8b3dc9d0  registry.k8s.io/sig-storage/nfsplugin:v4.1.0  -v=5 --nodeid=nom...  2 minutes ago  Exited (2) 32 seconds ago              plugin-928c60c6-2c62-5e13-f10d-054ac74f16f4

$ journalctl -u podman
Aug 04 12:30:55 <redacted> podman[70819]: time="2023-08-04T12:30:55+02:00" level=info msg="Request Failed(Internal Server Error): container is stopped"
Aug 04 12:30:55 <redacted> podman[70819]: http: superfluous response.WriteHeader call from github.com/gorilla/handlers.(*responseLogger).WriteHeader (handlers.go:61)
Aug 04 12:30:55 <redacted> podman[70819]: @ - - [04/Aug/2023:12:30:55 +0200] "GET /v1.0.0/libpod/containers/452b8b3dc9d04cfb4a41ee11f3590f46c75ce00d60fef9826340e6e577637ca3/stats?stream=false HTTP/1.1" 500 81 "" "Go-http-client/1.1"

When I start the Nomad job again, Nomad thinks everything is fine and reports the job as running, but the old Podman container is still there in a stopped state. By default, the old container should have been removed when the job was stopped, and Podman should have started a new one when the job was started again.

$ nomad job status 
ID            Type    Priority  Status   Submit Date
csi-nfs-node  system  50        running  2023-08-04T12:32:58+02:00

$ podman ps -a
CONTAINER ID  IMAGE                                         COMMAND               CREATED        STATUS                    PORTS       NAMES
452b8b3dc9d0  registry.k8s.io/sig-storage/nfsplugin:v4.1.0  -v=5 --nodeid=nom...  5 minutes ago  Exited (2) 3 minutes ago              plugin-928c60c6-2c62-5e13-f10d-054ac74f16f4

$ journalctl -u podman
Aug 04 12:33:59 <redacted> podman[70819]: @ - - [04/Aug/2023:12:33:59 +0200] "GET /v1.0.0/libpod/containers/452b8b3dc9d04cfb4a41ee11f3590f46c75ce00d60fef9826340e6e577637ca3/stats?stream=false HTTP/1.1" 500 81 "" "Go-http-client/1.1"
Aug 04 12:34:00 <redacted> podman[70819]: time="2023-08-04T12:34:00+02:00" level=info msg="Request Failed(Internal Server Error): container is stopped"
Aug 04 12:34:00 <redacted> podman[70819]: http: superfluous response.WriteHeader call from github.com/gorilla/handlers.(*responseLogger).WriteHeader (handlers.go:61)

It takes a manual podman rm <stopped_container_id> to get Nomad back into a correct state. After that, Podman starts a new container and the job runs fine.

Downgrading to Podman 4.5.1 solves the issue for us. With that version, the containers are successfully stopped and deleted, as illustrated by the journalctl output below:

Aug 04 12:47:04 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:04 +0200] "POST /v1.0.0/libpod/containers/create HTTP/1.1" 201 88 "" "Go-http-client/1.1"
Aug 04 12:47:04 <redacted> podman[73176]: [INFO  netavark::firewall] Using iptables firewall driver
Aug 04 12:47:04 <redacted> podman[73176]: [INFO  netavark::network::netlink] Adding route (dest: 0.0.0.0/0 ,gw: 10.250.0.1, metric 100)
Aug 04 12:47:04 <redacted> podman[73147]: time="2023-08-04T12:47:04+02:00" level=info msg="Running conmon under slice nomad.slice and unitName libpod-conmon-3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7.scope"
Aug 04 12:47:04 <redacted> podman[73147]: time="2023-08-04T12:47:04+02:00" level=info msg="Got Conmon PID as 73220"
Aug 04 12:47:04 <redacted> podman[73147]: 2023-08-04 12:47:04.940996242 +0200 CEST m=+2.194329935 container init 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 (image=registry.k8s.io/sig-storage/nfsplugin:v4.1.0, name=plugin-12fa1a0f-94d1-aca5-111b-a5cdc99ec49d, revision=v4.1.0)
Aug 04 12:47:04 <redacted> podman[73147]: 2023-08-04 12:47:04.945554312 +0200 CEST m=+2.198888005 container start 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 (image=registry.k8s.io/sig-storage/nfsplugin:v4.1.0, name=plugin-12fa1a0f-94d1-aca5-111b-a5cdc99ec49d, revision=v4.1.0)
Aug 04 12:47:04 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:04 +0200] "POST /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/start HTTP/1.1" 204 0 "" "Go-http-client/1.1"
Aug 04 12:47:04 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:04 +0200] "POST /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/wait?condition=running&condition=exited HTTP/1.1" 200 3 "" "Go-http-client/1.1"
Aug 04 12:47:04 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:04 +0200] "GET /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/json HTTP/1.1" 200 9418 "" "Go-http-client/1.1"
Aug 04 12:47:04 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:04 +0200] "GET /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/stats?stream=false HTTP/1.1" 200 1109 "" "Go-http-client/1.1"
(...)
Aug 04 12:47:13 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:13 +0200] "GET /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/stats?stream=false HTTP/1.1" 200 1096 "" "Go-http-client/1.1"
Aug 04 12:47:14 <redacted> podman[73147]: 2023-08-04 12:47:14.188771585 +0200 CEST m=+11.442105289 container died 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 (image=registry.k8s.io/sig-storage/nfsplugin:v4.1.0, name=plugin-12fa1a0f-94d1-aca5-111b-a5cdc99ec49d, revision=v4.1.0)
Aug 04 12:47:14 <redacted> podman[73240]: [INFO  netavark::firewall] Using iptables firewall driver
Aug 04 12:47:14 <redacted> podman[73240]: [INFO  netavark::network::bridge] removing bridge podman0
Aug 04 12:47:14 <redacted> podman[73147]: 2023-08-04 12:47:14.316331562 +0200 CEST m=+11.569665261 container cleanup 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 (image=registry.k8s.io/sig-storage/nfsplugin:v4.1.0, name=plugin-12fa1a0f-94d1-aca5-111b-a5cdc99ec49d, revision=v4.1.0)
Aug 04 12:47:14 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:14 +0200] "POST /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/stop?timeout=5&ignore=true HTTP/1.1" 204 0 "" "Go-http-client/1.1"
Aug 04 12:47:14 <redacted> podman[73147]: time="2023-08-04T12:47:14+02:00" level=error msg="Unable to get cgroup path of container: cannot get cgroup path unless container 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 is running: container is stopped"
Aug 04 12:47:14 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:14 +0200] "GET /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/stats?stream=false HTTP/1.1" 200 0 "" "Go-http-client/1.1"
Aug 04 12:47:14 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:14 +0200] "GET /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7/json HTTP/1.1" 200 9054 "" "Go-http-client/1.1"
Aug 04 12:47:15 <redacted> podman[73147]: 2023-08-04 12:47:15.963654811 +0200 CEST m=+13.216988506 container remove 3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7 (image=registry.k8s.io/sig-storage/nfsplugin:v4.1.0, name=plugin-12fa1a0f-94d1-aca5-111b-a5cdc99ec49d, revision=v4.1.0)
Aug 04 12:47:15 <redacted> podman[73147]: @ - - [04/Aug/2023:12:47:15 +0200] "DELETE /v1.0.0/libpod/containers/3fc62d4459a09a46b11fd21c6555fde6347fa691d23a55390e4487c3db40b4e7?force=true&v=true HTTP/1.1" 200 76 "" "Go-http-client/1.1"

After some digging, I found that the stats API behaviour changed between Podman 4.5.1 and 4.6.0: it used to return HTTP 200 with an empty body for stopped containers, but now it returns HTTP 200 with an error message in the body:

$ curl -v --unix-socket /var/run/podman/podman.sock http://localhost/v1.0.0/libpod/containers/902c4d2a9b050ad614f5e2d5651d42e37ca5d0ca8c2914e1877279dc2dd542e9/stats?stream=false
*   Trying /var/run/podman/podman.sock:0...
* Connected to localhost (/run/podman/podman.sock) port 80 (#0)
> GET /v1.0.0/libpod/containers/902c4d2a9b050ad614f5e2d5651d42e37ca5d0ca8c2914e1877279dc2dd542e9/stats?stream=false HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Api-Version: 1.41
< Libpod-Api-Version: 4.6.0
< Server: Libpod/4.6.0 (linux)
< X-Reference-Id: 0xc0000cd190
< Date: Thu, 03 Aug 2023 10:22:32 GMT
< Transfer-Encoding: chunked
< 
{"cause":"container is stopped","message":"container is stopped","response":500}

I think the upstream Podman PR that introduced those changes is this one: https://github.com/containers/podman/pull/15867. What I think is happening is that the Nomad driver assumes the container is still running fine: it doesn't handle the new stats API response and only takes the old API behaviour into account: https://github.com/hashicorp/nomad-driver-podman/blob/main/api/container_stats.go#L46

I tried poking around in the api/container_stats.go code to add a new case for this new Podman behaviour, but my Go skills are unfortunately not very good. I'd still like to assist in any way I can, so feel free to let me know if I can provide additional information and/or testing.
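
To make the idea concrete, this is roughly the kind of check I had in mind, as a standalone sketch rather than a patch against the driver. The statsErrorEnvelope type, the checkStatsBody helper and the ErrContainerNotRunning sentinel are names I made up for this example, not the driver's actual identifiers; the helper just decodes the JSON body shown above and maps both the old (empty body) and the new ("container is stopped") responses to a single "not running" error instead of treating them as valid stats.

package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// statsErrorEnvelope mirrors the error body that Podman 4.6.0 returns from
// /libpod/containers/{id}/stats for a stopped container, e.g.
// {"cause":"container is stopped","message":"container is stopped","response":500}
type statsErrorEnvelope struct {
	Cause    string `json:"cause"`
	Message  string `json:"message"`
	Response int    `json:"response"`
}

// ErrContainerNotRunning stands in for whatever sentinel error the driver
// already uses for a container that is no longer running.
var ErrContainerNotRunning = errors.New("container is not running")

// checkStatsBody inspects a stats response body that came back with HTTP 200.
// Podman up to 4.5.1 returned an empty body for stopped containers; 4.6.0
// returns the error envelope above instead, so both cases are mapped to
// ErrContainerNotRunning here.
func checkStatsBody(body []byte) error {
	if strings.TrimSpace(string(body)) == "" {
		return ErrContainerNotRunning // pre-4.6.0 behaviour
	}

	var envelope statsErrorEnvelope
	if err := json.Unmarshal(body, &envelope); err == nil && envelope.Cause == "container is stopped" {
		return ErrContainerNotRunning // 4.6.0 behaviour
	}

	// Anything else is assumed to be a real stats payload.
	return nil
}

func main() {
	newStyle := []byte(`{"cause":"container is stopped","message":"container is stopped","response":500}`)
	fmt.Println(checkStatsBody(newStyle)) // container is not running
	fmt.Println(checkStatsBody(nil))      // container is not running
}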

Environment

inflatador commented 11 months ago

Thanks for the bug report! I'm running into the same issue on Fedora 38. I upgraded to podman 4.6.0-1.fc38.x86_64 on August 1st, and that's when I started seeing the issue.

sushain97 commented 11 months ago

I also ran into this! Thanks for taking the time to investigate and dropping a clear explanation of the issue @jonasdemoor. I've got a small fix that works for me. PR incoming.

Procsiab commented 11 months ago

I can report that I am experiencing the same problem in the following environment:

Moreover, on a job version update, the previous allocation's container(s) cannot be stopped, and the new allocation stays pending forever.

For this specific instance, this is what I see in the status log of the allocation that will not stop:

Aug 07, '23 10:15:37 +0000 | Killing | Sent interrupt. Waiting 5s before force killing
Aug 07, '23 10:13:13 +0000 | Started | Task started by client

Procsiab commented 11 months ago

I compiled the nomad-driver-podman plugin from the latest commit on @sushain97's PR branch, and I can report that it fixes the issue I described in my previous message.