kata-containers / agent

Kata Containers version 1.x agent (for version 2.x see https://github.com/kata-containers/kata-containers). Virtual Machine agent for hardware virtualized containers
https://katacontainers.io/
Apache License 2.0

Running OCI hook sometimes fails with "wait: no child processes" error when stopping a container #886

Closed wxx213 closed 3 years ago

wxx213 commented 3 years ago

Description of problem

Stopping a container sometimes fails because running an OCI hook returns an error. The relevant log line is:

Mar 12 19:34:09 kata[153264]: time="2021-03-12T19:34:09.214607876+08:00" level=warning msg="stop container failed" container=84d36a06934f4219af010c808bda2ff05d01dabc1885891e7afefe6b24062d0f error="rpc error: code = Unknown desc = error running hook: wait: no child processes, stdout: , stderr: "

Looking at the code, the error "wait: no child processes" comes from the cmd.Wait() call in the kata agent when it runs the poststop hook. This is usually caused by a wait race: something else has already reaped the child process.
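To illustrate how this error can arise, here is a minimal Go sketch (not the agent's code) in which a stand-in "reaper" collects the child's exit status before cmd.Wait() is called. The helper names waitAfterReap and isNoChild are hypothetical. Depending on the Go version and platform, the message prefix may differ from "wait:", so the sketch checks for ECHILD rather than the exact text.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
)

// waitAfterReap starts a short-lived child, lets a stand-in reaper collect
// its exit status first, and then returns whatever cmd.Wait reports.
func waitAfterReap() error {
	cmd := exec.Command("true") // assumes a `true` binary is on PATH
	if err := cmd.Start(); err != nil {
		return fmt.Errorf("start: %w", err)
	}

	// Stand-in for the reaper: block until the child exits and reap it.
	for {
		var ws syscall.WaitStatus
		_, err := syscall.Wait4(cmd.Process.Pid, &ws, 0, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return fmt.Errorf("wait4: %w", err)
		}
		break
	}

	// The child is already gone, so this wait has nothing left to collect.
	return cmd.Wait()
}

// isNoChild reports whether err wraps ECHILD ("no child processes").
func isNoChild(err error) bool {
	return errors.Is(err, syscall.ECHILD)
}

func main() {
	err := waitAfterReap()
	fmt.Println("cmd.Wait error:", err)
	fmt.Println("is ECHILD:", isNoChild(err))
}
```

In the real agent the ordering is not fixed like this; the failure only shows up when the reaper happens to win, which would explain why it is intermittent.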

Expected result

Running a hook and waiting for it should not fail.

Actual result

Waiting for the hook returns an error.

Further information

We use Kata Containers version 1.12.0.

My guess is that this is caused by a race between the agent reaper routine (the agentReaper.reap function called from sandbox.signalHandlerLoop) and the container stopping routine.

The agent reaper routine listens for SIGCHLD in sandbox.signalHandlerLoop and reaps exited children: https://github.com/kata-containers/agent/blob/stable-1.12/reaper.go#L137 At the same time, after starting the hook, the container stopping routine waits for the hook's child process in the cmd.Wait function: https://github.com/kata-containers/agent/blob/stable-1.12/vendor/github.com/opencontainers/runc/libcontainer/configs/config.go#L334 If the reaper collects the hook's exit status first, cmd.Wait has nothing left to wait for.
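One possible way to avoid this kind of race, sketched below under assumptions, is to serialize the two waiters with a read/write lock: the hook holds the read lock from start to wait, and the reaper takes the write lock before reaping. The type hookReaper and the functions reapExited, runHook and countHookFailures are hypothetical names for illustration, not the agent's API, and the polling loop stands in for a SIGCHLD-driven handler.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
	"syscall"
	"time"
)

// hookReaper is a hypothetical type, not the agent's actual reaper.
type hookReaper struct {
	mu sync.RWMutex
}

// reapExited reaps every already-exited child without blocking. It takes
// the write lock first, so it never runs while a hook is between Start
// and Wait.
func (r *hookReaper) reapExited() {
	r.mu.Lock()
	defer r.mu.Unlock()
	for {
		var ws syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &ws, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue
		}
		if err != nil || pid <= 0 {
			return // no children at all, or none have exited yet
		}
	}
}

// runHook starts the command and waits for it while holding the read lock.
func (r *hookReaper) runHook(name string, args ...string) error {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return exec.Command(name, args...).Run()
}

// countHookFailures runs n hooks while a reaper goroutine polls in the
// background, and returns how many hooks reported an error.
func countHookFailures(n int) int {
	r := &hookReaper{}
	done := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			select {
			case <-done:
				return
			default:
				r.reapExited()
				time.Sleep(time.Millisecond)
			}
		}
	}()

	failures := 0
	for i := 0; i < n; i++ {
		// assumes a `true` binary is on PATH
		if err := r.runHook("true"); err != nil {
			failures++
		}
	}
	close(done)
	wg.Wait()
	return failures
}

func main() {
	fmt.Println("failed hooks:", countHookFailures(20))
}
```

The trade-off is that a long-running hook delays reaping of unrelated children until it finishes, so this is only a sketch of the idea and not necessarily the right fix for the agent.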

fidencio commented 3 years ago

This issue is being automatically closed as Kata Containers 1.x has now reached EOL (End of Life). This means it is no longer being maintained.

Important:

All users should switch to the latest Kata Containers 2.x release to ensure they are using a maintained release that contains the latest security fixes, performance improvements and new features.

This decision was discussed by the @kata-containers/architecture-committee and has been announced via the Kata Containers mailing list.

If you believe this issue still applies to Kata Containers 2.x, please open an issue against the Kata Containers 2.x repository, pointing to this one, providing details to allow us to migrate it.