docker-library / postgres

Docker Official Image packaging for Postgres
http://www.postgresql.org
MIT License

Kill a postgres container immediately #1261

Open drewwells opened 1 month ago

drewwells commented 1 month ago

I'm spinning up postgres containers for unit tests, so I don't care at all if they are corrupted on exit. I have been unable to find any way to kill these in less than 10 seconds.

Here is what I have tried that has not worked:

docker rm -f {pid}: 10 seconds.
docker kill --signal SIGQUIT {pid}: 10 seconds. Actually, SIGQUIT gives a permission error here.
docker stop -t 1 --signal SIGQUIT {pid}: 10 seconds. Is this timeout even respected?

I see these errors in the docker daemon:

level=error msg="Container failed to exit within 10s of kill - trying direct SIGKILL" error="context deadline exceeded"

How do I terminate these containers immediately?

LaurentGoderre commented 1 month ago

A simple docker kill should work

% docker run -d --rm --name postgres-kill -e POSTGRES_PASSWORD=1234 postgres 
bffde2d477d889715f4344d90e62b0bad023ac94b6bd1ea6a262814340d86dfd
% docker kill postgres-kill                        
postgres-kill

or

% docker run -d --rm -e POSTGRES_PASSWORD=1234 postgres                        
5eb35e90b2cdd5c5245765032351a4ab3c21c40c96cea150d0c4c3178ea6f280
% docker kill 5eb35e90b2cdd5c5245765032351a4ab3c21c40c96cea150d0c4c3178ea6f280

drewwells commented 1 month ago

Did you time it? It takes 10 seconds, and that error appears in the Docker daemon log. It seems like the image is trapping the signal and not passing it along to postgres.
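For anyone who wants to check the timing on their own machine, here is a minimal sketch of a timing helper. The function name elapsed_seconds is hypothetical, and it relies on `date +%s`, which is widely supported but not guaranteed on every platform. It only measures wall-clock seconds and says nothing about why a kill is slow.

```shell
# Print how many whole seconds a command takes to run.
# The command's own stdout is redirected to stderr so that only the number
# is printed on stdout; the command's exit status is deliberately ignored.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >&2 || true
  end=$(date +%s)
  echo $(( end - start ))
}

# Hypothetical usage against a throwaway container (needs a running Docker daemon):
#   cid=$(docker run -d --rm -e POSTGRES_PASSWORD=1234 postgres)
#   elapsed_seconds docker kill "$cid"
```

A result of around 10 would match the delay reported in this thread; a result of 0 or 1 would match the fast kill shown in the other replies.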

LaurentGoderre commented 1 month ago

I am not using a signal, just forcing the container to be destroyed

% docker run -d --rm --name postgres-kill -e POSTGRES_PASSWORD=1234 postgres
aee271f8ce85eff0b80bdb64ba4db83a1571c17bd963b664ca6365dc8296cf8f
% time docker kill postgres-kill
postgres-kill
docker kill postgres-kill  0.01s user 0.01s system 17% cpu 0.091 total

drewwells commented 1 month ago

Something else is going on here, because that takes 10 seconds for me and for Jenkins:

time docker kill $(docker run -d --rm -e POSTGRES_PASSWORD=1234 postgres)                                
da2e3db6490ee63454d3f92862c4ef6548467db2adab9ff76ad9e517735a5257
docker kill $(docker run -d --rm -e POSTGRES_PASSWORD=1234 postgres)  0.00s user 0.01s system 0% cpu 10.223 total

tianon commented 1 month ago

My best guess would be the recent AppArmor snafu where Docker itself got denied from sending signals to runc.

Guiorgy commented 1 day ago

10 seconds is the default --stop-timeout, maybe try setting it to a low value? For example:

docker run -d --rm --stop-timeout 1 --name postgres -e POSTGRES_PASSWORD=1234 postgres

drewwells commented 1 day ago

The stop-timeout flag is ignored here

/usr/bin/docker run --stop-timeout 1 --stop-signal SIGQUIT -e POSTGRES_PASSWORD=passyBAucNQ -e POSTGRES_USER=userNFX2wAo -e POSTGRES_DB=testdb bitnami/postgresql:latest
time docker kill 7c0a6
7c0a6
docker kill 7c0a6  0.01s user 0.00s system 0% cpu 10.225 total

Daemon logs

Sep 06 06:55:05 amazon dockerd[3878]: time="2024-09-06T06:55:05.568721579-05:00" level=error msg="Container failed to exit within 10s of kill - trying direct SIGKILL" container=7c0a6b98107f4cc9ff59f5132c584fa9947bdb11ea0efdd82019e7726dca3e12 error="context deadline exceeded"

Guiorgy commented 1 day ago

> The stop-timeout flag is ignored here

Unless you have an ancient Docker engine, I doubt that it doesn't support it:

Option           Default   Description
--stop-timeout             API 1.25+. Timeout (in seconds) to stop a container

If that's being ignored, how do we know that the rest, for example --stop-signal, is not being ignored?

But most importantly, you seem to be using the bitnami/postgresql image, whereas this is the postgres image repo. Maybe try asking there for help?

drewwells commented 1 day ago

Yeah, stop timeout has been around for a long time, so I don't think it's my daemon being out of date. I'm running the latest Ubuntu. Here's the same issue with this repo's postgres image.

I wouldn't be opposed to some daemon flags to fix this; I'm just not clear why the kill command is being ignored.

/usr/bin/docker run --stop-timeout 1 --stop-signal SIGQUIT -e POSTGRES_PASSWORD=passyBAucNQ -e POSTGRES_USER=userNFX2wAo -e POSTGRES_DB=testdb postgres:latest
time docker kill b84f8db782bc
b84f8db782bc
docker kill b84f8db782bc  0.01s user 0.00s system 0% cpu 10.264 total

Here's my server version. I've been able to reproduce this on both OS X and Linux.

Server:
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.22.2
  Git commit:       24.0.7-0ubuntu4.1
  Built:            Fri Aug  9 02:33:20 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.12
  GitCommit:
 runc:
  Version:          1.1.12-0ubuntu3.1
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:

tianon commented 1 day ago

--stop-timeout only applies to docker stop -- docker kill bypasses that completely, and what you're seeing is something else entirely (again, my best guess is AppArmor blocking sending the signal to runc, but it's hard to say for sure -- perhaps dmesg has more useful clues?)
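As a rough sketch of the dmesg check suggested above, the helper below filters kernel log text down to AppArmor denials that involve signal delivery. The function name apparmor_signal_denials is hypothetical, and the two match strings (apparmor="DENIED" and operation="signal") are an assumption about how AppArmor audit messages usually look; they are not taken from this thread, so adjust them if your kernel log uses a different format.

```shell
# Read kernel log text on stdin and keep only the lines that look like
# AppArmor denials of signal delivery.
# Assumption: audit lines carry apparmor="DENIED" and operation="signal".
apparmor_signal_denials() {
  grep 'apparmor="DENIED"' | grep 'operation="signal"'
}

# Hypothetical usage (reading the kernel log usually needs root):
#   sudo dmesg | apparmor_signal_denials
```

If this prints lines that mention the docker-default profile around the time of a slow docker kill, that would be consistent with the AppArmor guess; empty output does not rule it out, since the log format may differ.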

tianon commented 1 day ago

See https://github.com/moby/moby/blob/96898c8be6103ba65f3096782cb9bca65701bf1c/daemon/kill.go#L161 for where that 10s docker kill timeout is hard-coded into the daemon itself (specifically because this is an operation that shouldn't fail unless something is going horribly wrong, like AppArmor blocking the signal from dockerd/containerd to runc completely).
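To see which containers ran into that hard-coded 10s timeout, one option is to pull the container IDs out of the daemon log. This is only a sketch: the function name slow_kill_containers is hypothetical, the message text and container= field are copied from the daemon log line quoted earlier in this thread, and the journalctl unit name in the usage comment is an assumption about a systemd-based install.

```shell
# Read Docker daemon log text on stdin and print the unique IDs of containers
# for which the daemon logged the "failed to exit within 10s of kill" error.
# Lines without a container=<hex id> field are skipped.
slow_kill_containers() {
  grep 'Container failed to exit within 10s of kill' |
    sed -n 's/.*container=\([0-9a-f][0-9a-f]*\).*/\1/p' |
    sort -u
}

# Hypothetical usage on a systemd host (the unit name is an assumption):
#   journalctl -u docker.service --no-pager | slow_kill_containers
```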

drewwells commented 23 hours ago

I think you're right about AppArmor; I see a bunch of references to postgres getting a docker-default AppArmor policy. I'll leave this open until I figure out how to stop AppArmor from blocking SIGINT, SIGKILL, etc., then close this ticket.

tianon commented 23 hours ago

See https://github.com/moby/moby/pull/47749 for links to the specific issue I'm talking about