knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

pod.TerminationGracePeriodSeconds setting #15599

Open V2arK opened 3 weeks ago

V2arK commented 3 weeks ago

Describe the feature

Right now, TerminationGracePeriodSeconds is set to rev.Spec.TimeoutSeconds:

https://github.com/knative/serving/blob/main/pkg/reconciler/revision/resources/deploy.go#L304

However, rev.Spec.TimeoutSeconds also specifies the timeout for in-flight requests.

I think these two values should be separated. In my project, I want to terminate the deployment without a graceful exit, but I want the timeout for in-flight requests to be as long as possible.
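
To make the request concrete, here is a minimal sketch of the separation, not the actual Knative code. `RevisionSpec` below is a simplified stand-in that models only the field discussed here, and the `TerminationGracePeriodSeconds` override, along with both helper functions, is hypothetical and does not exist in Knative today.

```go
package main

// RevisionSpec is a simplified, hypothetical stand-in for the Knative
// revision spec. Only the field discussed in this issue is modeled.
type RevisionSpec struct {
	// TimeoutSeconds bounds how long an in-flight request may run.
	TimeoutSeconds *int64

	// TerminationGracePeriodSeconds is a HYPOTHETICAL override proposed
	// in this issue; it is not an existing Knative field.
	TerminationGracePeriodSeconds *int64
}

// currentGracePeriod mirrors the coupling described above: the pod's
// termination grace period is taken directly from the request timeout.
func currentGracePeriod(spec RevisionSpec) *int64 {
	return spec.TimeoutSeconds
}

// proposedGracePeriod sketches the requested behavior: use the dedicated
// override when it is set, otherwise fall back to the request timeout so
// that existing revisions keep their current behavior.
func proposedGracePeriod(spec RevisionSpec) *int64 {
	if spec.TerminationGracePeriodSeconds != nil {
		return spec.TerminationGracePeriodSeconds
	}
	return spec.TimeoutSeconds
}
```

With a request timeout of 600 seconds and an override of 5 seconds, `proposedGracePeriod` would return 5 while the request timeout stays at 600. Falling back to `TimeoutSeconds` when the override is unset is one possible design choice that would preserve the current default.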

skonto commented 3 weeks ago

I want to terminate the deployment without a graceful exit, but I want the timeout for in-flight requests to be as long as possible.

Hi @V2arK, this was added years ago to guarantee that connections are not dropped during autoscaling. The Knative autoscaler continuously makes decisions about the deployment scale, and that may interrupt connections during pod shutdown. Could you elaborate on your use case? Do you not care about failing requests?

V2arK commented 3 weeks ago

Hi @skonto, in my use case I just want to terminate the pods ASAP (maybe within 3 to 5 seconds) when I trigger the termination, without changing the timeout for requests (e.g., an LLM that takes minutes to produce a response).