kata-containers / runtime

Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
https://katacontainers.io/
Apache License 2.0

kill: does kata container support "terminationGracePeriodSeconds" in pod definition of kubernetes? #3160

Closed abel-von closed 3 years ago

abel-von commented 3 years ago

Hi, recently a customer reported an issue with our container service: the "terminationGracePeriodSeconds" setting does not work properly with Kata Containers.

I checked the Kata code and found that when kata-runtime kill is called with SIGTERM, Kata signals the user process and then waits 10 seconds before sending SIGKILL. Because this wait holds the sandbox lock, terminationGracePeriodSeconds has no effect at all (even if it is smaller than 10 seconds). The whole graceful termination process is controlled by Kata, with a 10-second timeout.

    if err := waitForShim(c.process.Pid); err != nil {
        // Force the container to be killed.
        if err := c.kill(syscall.SIGKILL, true); err != nil && !force {
            return err
        }

        // Wait for the end of container process. We expect this call
        // to succeed. Indeed, we have already given a second chance
        // to the container by trying to kill it with SIGKILL, there
        // is no reason to try to go further if we got an error.
        if err := waitForShim(c.process.Pid); err != nil && !force {
            return err
        }
    }

    // Force the container to be killed. For most of the cases, this
    // should not matter and it should return an error that will be
    // ignored.
    // But for the specific case where the shim has been SIGKILL'ed,
    // the container is still running inside the VM. And this is why
    // this signal will ensure the container will get killed to match
    // the state of the shim. This will allow the following call to
    // stopContainer() to succeed in such particular case.
    c.kill(syscall.SIGKILL, true)

My question is: does Kata support graceful termination with a configurable timeout?

devimc commented 3 years ago

@abel-von this timeout is not configurable

abel-von commented 3 years ago

Do we have a plan to support it, or is it already supported in shimv2? @devimc

egernst commented 3 years ago

We should support this. Can you try to reproduce with shim-v2?

abel-von commented 3 years ago

I think shimv2 supports this because the kill request is sent to the agent directly. I tested with our own shimv2 (reimplemented in Rust), and it is supported there.

fidencio commented 3 years ago

This issue is being automatically closed as Kata Containers 1.x has now reached EOL (End of Life). This means it is no longer being maintained.

Important:

All users should switch to the latest Kata Containers 2.x release to ensure they are using a maintained release that contains the latest security fixes, performance improvements and new features.

This decision was discussed by the @kata-containers/architecture-committee and has been announced via the Kata Containers mailing list:

If you believe this issue still applies to Kata Containers 2.x, please open an issue against the Kata Containers 2.x repository, pointing to this one, providing details to allow us to migrate it.