knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

scale-to-zero-pod-retention-period max value not documented #13725

Open bjornrydahl opened 1 year ago

bjornrydahl commented 1 year ago

Ask your question here:

I'm trying to set the scale-to-zero-pod-retention-period documented here: https://knative.dev/docs/serving/autoscaling/scale-to-zero/#scale-to-zero-grace-period to something higher than 1h. It appears that there is a max value of 1h, but the docs say "Possible values: Non-negative duration string":

admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: expected 0s <= 2h <= 1h0m0s: spec.template.metadata.annotations.autoscaling.knative.dev/scale-to-zero-pod-retention-period
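
For reference, the annotation is set on the Service's revision template. A minimal manifest sketch (service name and image are illustrative) that reproduces the rejection above:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service                # illustrative name
spec:
  template:
    metadata:
      annotations:
        # Rejected by validation.webhook.serving.knative.dev: 2h exceeds the 1h maximum
        autoscaling.knative.dev/scale-to-zero-pod-retention-period: "2h"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go   # illustrative image
```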

I would like to set something higher than 1h, and would like to know if there is a reason why max is 1h or if there is a workaround I can use.

Thank you!

ReToCode commented 1 year ago

I don't have the answer to your question, maybe @dprotaso or @psschwei have an idea?

dprotaso commented 1 year ago

Looks like the 1h is hard-coded, and we bound some of these annotations to this max window: https://github.com/knative/serving/blob/f031fd4e16e23c404ba2531b71f23d289a74889a/pkg/apis/autoscaling/register.go#L144-L146

Oddly, though, the bound only applies when it's set as an annotation on the revision, but not to the global setting in the config map.

I'm not sure what's right and wrong here, so this requires a deeper dive.
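
The linked lines amount to a bounded-duration check on the annotation value. A simplified Go sketch of that shape (names and error text approximated, not the exact source):

```go
package main

import (
	"fmt"
	"time"
)

// WindowMax mirrors the hard-coded 1h ceiling discussed above
// (the actual constants live in pkg/apis/autoscaling/register.go).
const WindowMax = 1 * time.Hour

// validateBounded reproduces the shape of the webhook error from this issue:
// the duration must fall in [0, WindowMax].
func validateBounded(key string, d time.Duration) error {
	if d < 0 || d > WindowMax {
		return fmt.Errorf("expected 0s <= %v <= %v: %s", d, WindowMax, key)
	}
	return nil
}

func main() {
	err := validateBounded(
		"autoscaling.knative.dev/scale-to-zero-pod-retention-period",
		2*time.Hour,
	)
	// Prints: expected 0s <= 2h0m0s <= 1h0m0s: autoscaling.knative.dev/scale-to-zero-pod-retention-period
	fmt.Println(err)
}
```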

/triage accepted

dprotaso commented 1 year ago

Paul found the original feature track document and it mentions

There is no initial maximum possible value for the flag, but in the end we might want to clamp it, say with one hour limit (as we do for stable window flag).

dprotaso commented 1 year ago

So @bjornrydahl, you sort of have a workaround by setting a global value for all revisions.

Otherwise, we can relax the max duration constraint, but I'm wondering: what durations have you tried to use in the past?
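
For anyone looking for the concrete steps: the global value goes in the config-autoscaler ConfigMap in the knative-serving namespace, which (per the comment above) is not clamped to 1h. A minimal sketch, with 2h as an example value:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Cluster-wide default; unlike the per-revision annotation,
  # this path is not bounded to 1h by the validation webhook.
  scale-to-zero-pod-retention-period: "2h"
```

Keep in mind this is a cluster-wide default, so it affects every revision that doesn't override it with the (still-capped) annotation.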

bjornrydahl commented 1 year ago

@dprotaso Thank you for the input here, and I apologize for not getting back to this earlier. I will try using the config instead.

We intended to set it longer to see if we could scale to zero after 24h.

If that is a bad idea, feel free to let me know.

dprotaso commented 1 year ago

Curious: do you prefer to have this as an operator (global) setting for all revisions, or is this something you'd rather customize on a per-revision basis?

desimonemike123 commented 2 weeks ago

@dprotaso @psschwei In our cluster deployments I have the same requirement that @bjornrydahl has/had. For some workloads (models) I need to specify a scale-to-zero-pod-retention-period of greater than 1 hour.
Since this issue is still open, am I right to assume that (1) the 1h cap on the per-revision annotation is still in place, and (2) setting the global value in the config-autoscaler ConfigMap is the way to go beyond it?

UPDATE: After doing some testing, my assumptions were correct :-)

Thanks for the guidance.