knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0
5.54k stars 1.15k forks source link

Investigate the use of CRD sub-resources. #643

Closed mattmoor closed 5 years ago

mattmoor commented 6 years ago

IIRC K8s 1.10 added support for "status" sub-resources in CRDs.

There are a variety of places this is useful to us, including updateStatus in each of the controllers, and (likely) the validation logic in the webhook.

Let's scout this functionality in M4, and if useful adopt.

See also this issue, which tracks a 1.10 update.

grantr commented 6 years ago

I also think the Scale sub-resource would be a useful addition to revisions as a standard way for autoscalers to effect scaling. This gives the revision controller the option of modifying or ignoring the request, unlike the alternative of autoscaling the deployment directly.

mattmoor commented 6 years ago

Yes perhaps, although I'm not sure the form that takes is appropriate for us?

I think the payload is a single integer (foggy recollection of things people told me), which feels like the kind of thing we have been avoiding in the spec, since that feels like "how many servers" and we're after "serverless".

That said, I could see this being useful to force Reserve -> Active with some minimum "concurrent request" capacity in anticipation of load spikes beyond our capacity to hyperscale up (e.g. if more cluster capacity is needed).

cc @josephburnett for this discussion, but let's track that with a separate issue since it involves a bit of design and specification.

grantr commented 6 years ago

It's less about users and more about implementors of autoscaling strategies. But you're right, off-topic, plus @josephburnett is out until Tuesday and I don't want to start any official discussions without him. :smiley: This document in the team drive has more details if anyone is interested.

mattmoor commented 6 years ago

/assign @grantr

mattmoor commented 6 years ago

Moving to M5, since the GKE 1.10 alpha clusters have been showing some issues. I don't think the importance of this has diminished, but I'm less optimistic that we'll be able to make the switch to 1.10 smoothly in M4.

mattmoor commented 6 years ago

/assign @dprotaso @bsnchan

Since they have a PR. I believe this is blocked on 1.10, which is coming in hot for M5 (it is GA in GKE, but some of our monitoring stuff has issues with it). I'm not particularly optimistic that this will land.

sslavic commented 6 years ago

If I understood correctly, CRD with subresources is in v1beta1 in k8s 1.11 https://kubernetes.io/blog/2018/06/27/kubernetes-1.11-release-announcement/

mattmoor commented 5 years ago

@dprotaso @bsnchan This should be unblocked. GKE has 1.11, so we can actually test with sub-resources 🎉

dprotaso commented 5 years ago

Here's a backwards compatible strategy for adopting CRD status subresource with the aim of eventually dropping the generation bumping that's occurring in the webhook.

Phase 1 - Serving 0.3

Changes

Enable status subresource on all CRDs

Drop our functional dependency on spec.generation

Implications

Phase 2 - Serving 0.4

Changes

Fully drop our dependency on spec.generation

Remove Spec.Generation from the API (under consideration)

Change the label applied to revisions from configurationMetadataGeneration back to configurationGeneration

Implications

Phase 3 - Serving 0.5

Changes

Drop functional dependency on configurationMetadataGeneration

Phase 4 - Serving 0.6

Changes

No longer apply the configurationMetadataGeneration label to revisions

Phase X - when conversion webhooks are beta

Changes

Implications

mattmoor commented 5 years ago

Moving to 0.4 for the next phase of the work.

/milestone Serving 0.4

knative-prow-robot commented 5 years ago

@mattmoor: The provided milestone is not valid for this repository. Milestones in this repository: [Ice Box, Needs Triage, Serving "v1" (ready for production), Serving 0.3, Serving 0.4]

Use /milestone clear to clear the milestone.

In response to [this](https://github.com/knative/serving/issues/643#issuecomment-451501362): >Moving to 0.4 for the next phase of the work. > >/milestone 0.4 Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
mattmoor commented 5 years ago

/milestone Serving 0.4

mattmoor commented 5 years ago

0.4 pieces have landed, moving out.

mattmoor commented 5 years ago

/milestone Serving 0.6 PRs for 0.5 scope are in 0.5 now, so moving this to 0.6 to track remaining work.

mattmoor commented 5 years ago

@dprotaso Do you plan to do the 0.6 portion of this?

dprotaso commented 5 years ago

@dprotaso Do you plan to do the 0.6 portion of this?

Oh yeah

dprotaso commented 5 years ago

/close

knative-prow-robot commented 5 years ago

@dprotaso: Closing this issue.

In response to [this](https://github.com/knative/serving/issues/643#issuecomment-489761610): >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
dprotaso commented 5 years ago

it is done.