temp server maybe become a zombie process

docker-library / mysql

Docker Official Image packaging for MySQL Community Server

https://dev.mysql.com/

GNU General Public License v2.0


temp server maybe become a zombie process #985

Open Lxuty opened 1 year ago

Lxuty commented 1 year ago

In my environment, I found that the temporary mysql server had become a zombie process, like this:

UID    PID  PPID  C  STIME  TTY  TIME      CMD
mysql  1    0     0  Jul03  ?    00:05:47  mysqld --defaults-file=/etc/my.cnf
mysql  15   1     0  Jul03  ?    00:00:00  [mysqld]

I suspect that when mysqladmin shuts down mysqld over the socket, the shutdown command returns before mysqld has fully exited.

When the second mysqld is then started, the first mysqld process becomes a zombie process.

Is this possible?

tianon commented 1 year ago

I've definitely not seen this before, because at the time the temporary server is stopped, PID1 is the shell, which should happily reap this child process. :sweat_smile:

Do you have a simple reproducer we can use to verify? I can't reproduce:

$ docker run -dit --name test --rm --env MYSQL_ROOT_PASSWORD=bad-example mysql
Unable to find image 'mysql:latest' locally
latest: Pulling from library/mysql
e2c03c89dcad: Pull complete 
68eb43837bf8: Pull complete 
796892ddf5ac: Pull complete 
6bca45eb31e1: Pull complete 
ebb53bc0dcca: Pull complete 
2e2c6bdc7a40: Pull complete 
6f27b5c76970: Pull complete 
438533a24810: Pull complete 
e5bdf19985e0: Pull complete 
667fa148337b: Pull complete 
5baa702110e4: Pull complete 
Digest: sha256:232936eb036d444045da2b87a90d48241c60b68b376caf509051cb6cffea6fdc
Status: Downloaded newer image for mysql:latest
14e43f1a4fc1f5c05ca53e845ed0973aa1131ae3649fee914b9af638310a7001

$ docker logs --tail=2 test
2023-07-11T22:28:35.503922Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2023-07-11T22:28:35.503942Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.33'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server - GPL.

$ docker top test
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
systemd+            1848134             1848113             3                   15:28               pts/0               00:00:00            mysqld

Lxuty commented 1 year ago

This is also the first time I have seen this problem, and it is very difficult to reproduce.

I looked at the mysqladmin shutdown source code and found that it returns once the pidfile has disappeared.

I also looked at the shutdown path in the mysql source code: it first deletes the pidfile, then releases other resources, and only then exits the process and frees the mysql PID.

So there is a time gap between mysqladmin shutdown returning and mysqld fully exiting.
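The gap can be illustrated without MySQL at all. The following is a minimal sketch, assuming only the ordering described above (pidfile removed first, process exit later); `fake_server` and the polling loop are hypothetical stand-ins, not mysqld or mysqladmin code.

```shell
#!/usr/bin/env bash
# Hypothetical simulation: fake_server stands in for mysqld, the polling
# loop stands in for a client that treats "pidfile gone" as "server stopped".
pidfile="$(mktemp)"

fake_server() {
	# On TERM: delete the pidfile first, keep "cleaning up", then exit.
	trap 'rm -f "$pidfile"; sleep 2; exit 0' TERM
	while :; do sleep 0.1; done
}

fake_server &
server_pid=$!
sleep 0.3  # give the child time to install its trap

kill -TERM "$server_pid"

# Return as soon as the pidfile disappears.
while [ -e "$pidfile" ]; do sleep 0.1; done

# kill -0 sends no signal; it only checks that the PID still exists.
if kill -0 "$server_pid" 2>/dev/null; then
	gap_observed=yes
else
	gap_observed=no
fi
echo "pidfile gone; server PID still exists: $gap_observed"

wait "$server_pid"  # reap the child so the demo itself leaves no zombie
```

Anything started during that window runs while the old server process still exists.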

To reproduce it 100% of the time, I added sleep(5) to the mysql source code between deleting the pidfile and the mysql exit, like this:

When I start the container with this newly compiled mysql build, the zombie process appears.

tianon commented 9 months ago

Oh, interesting, I think I understand! We start mysqld, then ask mysqladmin to shut it down, and it for some reason uses one of the things mysqld does during clean_up as a signal that mysqld is dead (even if the PID is still running), and we manage to invoke the "real" mysqld before the temporary mysqld exits, and thus it becomes a zombie child of the new PID1. :sob:
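The "zombie child of the new PID1" effect can be shown in isolation. This is a Linux-only sketch (it reads `/proc`), and the `sleep` commands are stand-ins chosen for illustration: the inner shell starts a short-lived child and then `exec`s a program that never calls `wait()`, so the exited child has nobody to reap it.

```shell
#!/usr/bin/env bash
# Linux-only sketch (reads /proc); the sleep commands are stand-ins, not mysqld.

# The inner shell backgrounds a short-lived child, then replaces itself via
# exec with a program that never reaps children.
bash -c 'sleep 0.2 & exec sleep 2' &
parent_pid=$!

sleep 1  # the child has exited by now, but its parent is not reaping it

zombie_count=0
for stat_file in /proc/[0-9]*/stat; do
	# Fields: pid (comm) state ppid ... (assumes comm has no spaces,
	# which holds for the "sleep" processes this demo cares about)
	read -r _ _ state ppid _ 2>/dev/null < "$stat_file" || continue
	if [ "$ppid" = "$parent_pid" ] && [ "$state" = "Z" ]; then
		zombie_count=$((zombie_count + 1))
	fi
done
echo "zombie children of PID $parent_pid: $zombie_count"

wait "$parent_pid"
```

Once the `exec`'d parent exits, the zombie is handed to whatever process adopts orphans and is reaped there. A long-running PID 1 never exits, which would explain why the entry in the report stays around.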

I guess this is probably a bug in mysqladmin, right? It should probably be verifying the process from the pidfile actually stopped? Maybe we could add a basic workaround by reading the PID before we try to stop and just adding something basic like wait "$pid" after we ask mysqladmin to shut down?
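A rough sketch of that workaround, under stated assumptions: `fake_mysqld`, `fake_mysqladmin_shutdown`, and `temp_server_stop` are hypothetical names for illustration, not functions from the image's entrypoint, and the PID is captured from `$!` at start time rather than read from the pidfile.

```shell
#!/usr/bin/env bash
# Sketch only: fake_mysqld and fake_mysqladmin_shutdown are hypothetical
# stand-ins that mimic the behavior described in this thread.
pidfile="$(mktemp)"

fake_mysqld() {
	# Delete the pidfile early in shutdown, exit noticeably later.
	trap 'rm -f "$pidfile"; sleep 1; exit 0' TERM
	while :; do sleep 0.1; done
}

fake_mysqladmin_shutdown() {
	# Returns as soon as the pidfile disappears, not when the process exits.
	kill -TERM "$1"
	while [ -e "$pidfile" ]; do sleep 0.1; done
}

temp_server_stop() {
	local pid="$1"
	fake_mysqladmin_shutdown "$pid"
	# The workaround: block until the child has really exited, so this
	# shell reaps it before anything else is started.
	wait "$pid"
}

fake_mysqld &
temp_pid=$!
sleep 0.3  # give the child time to install its trap

temp_server_stop "$temp_pid"
stop_status=$?

if kill -0 "$temp_pid" 2>/dev/null; then
	still_exists=yes
else
	still_exists=no
fi
echo "exit status: $stop_status; PID still exists after stop: $still_exists"
```

Note that `wait "$pid"` only works for children of the calling shell, so the stop logic would have to run in the same shell that launched the temporary server, not in a subshell.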