knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0
5.57k stars 1.16k forks source link

Moving "routing mode" to the route (from service) #1865

Closed cppforlife closed 6 years ago

cppforlife commented 6 years ago

/area API /kind feature

Currently Service object has two fields runLatest and pinned [1]. They control how service manages associated Route object:

As a side effect, if a user is using Service object to manage Configuration and Route objects, there is no way to do traffic splitting, because if route is reconfigured, service controller will just reconcile it based on two above configurations. (To do traffic splitting, user would actually have to stop using service object and managed configuration and route objects directly.)

Proposed change:

type Service struct {
    Spec ServiceSpec
}

type ServiceSpec struct {
    Configuration ConfigurationSpec
}

^ removes runLatest and pinned types from Service.Spec

// part of Route spec

type TrafficTarget struct {
    Name string
    RevisionName string
    ConfigurationName string
    Percent int
}

^ TrafficTarget (and Route) does not change, as it already supports running latest ready revision of particular Configuration object (via ConfigurationName) or pointing to particular revision (via RevisionName) (pins traffic to particular revision).

Few cases:

Benefits:


[1] https://github.com/knative/serving/blob/b1d14bd5c8cfadd6d8eb15ab2ed94324a1c13c29/pkg/apis/serving/v1alpha1/service_types.go#L72-L81

mattmoor commented 6 years ago

Since there's no way to control the Route spec via Service, this effectively eliminates the ability to express "pinned".

mattmoor commented 6 years ago

cc @mikehelmick @cooperneil

mattmoor commented 6 years ago

This is similar to a sentiment I expressed on one of the earlier proposals for "Steve" (what's now "Service"), which is that some of the rollout options were probably most relevant on TrafficTarget with a ConfigurationName to control how LatestReadyRevisionName is rolled out.

cppforlife commented 6 years ago

@mattmoor

Since there's no way to control the Route spec via Service, this effectively eliminates the ability to express "pinned".

i imagine user will just simply update route object to use particular RevisionName for pinning. not a big difference between configuring service object imho.

sixolet commented 6 years ago

The purpose of Service is to act as a clearinghouse for rollout/rollback policies; if the user wants to control these in a fine-grained way using Route, there's no reason for that user to use Service at all.

Service is likely to grow more options than runLatest and pinned, and become more useful in that way.

mattmoor commented 6 years ago

To clarify, I don't think that these are mutually exclusive either. We may want to build the extra sophistication at the "Steve" level via a rollout option, that exposes richer functionality that's implemented at the TrafficTarget level. However, I'm not sure this will always be the case.

cppforlife commented 6 years ago

related: https://github.com/knative/serving/issues/1870

voor commented 6 years ago

Isn't the intention that if you were to push a new revision using the same containers, you would then have the option to rollback to an earlier traffic split pattern if something were to go wrong? So instead of editing a route to change traffic to 50/50, you should push a new revision with a 50/50 split, and if something were to go wrong you would then have the option to rollback to an earlier traffic pattern?

steren commented 6 years ago

@sixolet said:

The purpose of Service is to act as a clearinghouse for rollout/rollback policies; if the user wants to control these in a fine-grained way using Route, there's no reason for that user to use Service at all.

This is not correct. When we introduced the Service object, the intent was not to prevent users from controlling the Route. It was introduced to solve the problems stated in #412.

And as @mikehelmick commented in https://github.com/knative/serving/issues/412#issuecomment-377076191, the agreement was to provide a "manual mode". I am not sure why this was not implemented.

we will also specify a new manual mode where the user is free to interact with the resources created by the service object directly

mikehelmick commented 6 years ago

Hello. Here I have a solution to the issues raised here. I want to start by referring back to the original Service proposal, https://github.com/knative/serving/issues/412#issuecomment-377076191

In this, we discussed a future manual mode for service that allows the user to manipulate their associated Route and Configuration objects directly, without the Service controller continuing to reconcile those objects.

Objective

The Service object ("Steve") originally implemented two modes, RunLatest and Pinned. These modes allow a user to always have the latest Revision running at 100% of traffic or to pin a specific revision to 100% of traffic.

Here, we formally propose a new mode for controlled releases (called simply “Release” mode, that also replaces the functionality of pinned mode. (Pinned mode will be marked as deprecated)

The opinionated release process here, provides a way for a user to automatically promote new releases, gradually roll out new releases, and rollback to a known good release.

We also formally introduce the long rumored “manual mode” that allows for full control over the route object.

This closes out the modes of Service that will be created unless user feedback tells us that other scenarios are needed.

Deprecation of Pinned Mode

Pinned mode of service exists in v1alpha1. We propose to drop pinned mode from future releases, and replace it with release mode (below) that provides the same functionality and extends it further.

The existing functionality of Pinned mode can be enacted by using release mode and setting the revision list to size of 1.

Introduction of Release Mode

The principle of release mode is that it provides an easy way for a user to gradually promote a new release, and also contains the existing functionality of pinned mode. Release mode gives an opinionated way of performing release management, giving well-known subroute names (current, next, and latest).

The specification for release mode is:

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: Counter 
spec:
  Release:
    # An ordered list of 1 or 2 revisions. The first have a TrafficTarget
    # with a name of "current" and the second will have a name of "next"
    revisions: ["Counter.1", "Counter.4"]
    # Percent of traffic that should be sent to the next revision. Valid values
    # are between 0 and 99 inclusive. 
    rolloutPercent: 50   
    # All revisions referenced must come from this configuration.
    # The latest revision from head, will be available at a 0 percent TrafficTarget
    # with a name of 'latest'
    configuration:
      ...
status:
  ... 
  # The TrafficTarget array will reflect the effective traffic splits

This provides a simple, intuitive way for users to have a disciplined release process, gradually moving from one release to another.

It provides standardized naming for current, next, and latest and an easy way to manage your release process.

Manual Mode

RunLatest and Release modes cover common user scenarios. If a user starts in Service mode, but later needs more advanced functionality, they need to be able to directly access the route and configuration objects, but may still want to enforce grouping under the Service.

When in manual mode, the service controller does not attempt to reconcile the configuration or the route.

manual mode simply references the ``Route and Configuration by name. The implementation of the Knative API may prevent a service from being created in manual mode, requiring it to be created in RunLatest or Release and then transitioned to manual.

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: Counter
spec: 
  # The user is in complete control of the route and configuration that
  #  are part of this service. The service controller will not attempt
  #  reconcile changes.
  manual: {}
status:
  ...

Implementation Considerations

The proposed changes here are fully in the spirit of Knative Service object, providing an “easy mode” for developers to manage their service. It also provides an exit to manual mode if more powerful functionality is desired.

scothis commented 6 years ago

One thing that bugs me with the current Service is that the ConfigurationSpec can exist in multiple places. This forces tool makers to hunt for where the configuration actually lives. It looks like that dichotomy lives on between the runLatest and release modes.

Is there a reason to keep the configuration in multiple places (aside from backwards compatibility) rather than pull the configuration up to the top level? That way the mode becomes more clearly a routing concern.

mikehelmick commented 6 years ago

This was originally included in RunLatest and Pinned to ensure that a service can be created and start serving with a single API call. Since for those modes, the Service is responsible for ensuring that the route and configuration stay in the correct state, we determined that the Service should have all necessary information to materialize the dependent objects since it is that controller's job to keep them in sync (for those guided modes)

scothis commented 6 years ago

@mikehelmick I agree with including the ConfigurationSpec in the Service, and only needing to create a single resource. What I'd like to see is the configuration be a sibling to the route modes so that its location in the resource is unambiguous. Based on the release example above:

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: Counter 
spec:
  # All revisions referenced must come from this configuration.
  # The latest revision from head, will be available at a 0 percent TrafficTarget
  # with a name of 'latest'
  configuration:
    ...
  release:
    # An ordered list of 1 or 2 revisions. The first have a TrafficTarget
    # with a name of "current" and the second will have a name of "next"
    revisions: ["Counter.1", "Counter.4"]
    # Percent of traffic that should be sent to the next revision. Valid values
    # are between 0 and 99 inclusive. 
    rolloutPercent: 50
status:
  ... 
mikehelmick commented 6 years ago

That makes sense. This was originally bundled within the specific service mode, since we had planned ahead for service modes that might have more than one configuration, but as of now, I don't anticipate that we will need those modes.

I'm fine with moving the configuration out to the top level of the spec. We'll need to think about how to do this in a non-breaking way.

kameshsampath commented 6 years ago

can the revisions array as revisionName and routeName( this can be mapped to name in traffic which in turn used to generate the URLS), thereby when the routes are created via the service object can be like

apiVersion: serving.knative.dev/v1alpha1
kind: Route
metadata:
  name: demo
  namespace: default
spec:
  traffic:
  - revisionName: demo-00001
     percent: 60 
     name: v1
  - revisionName: demo-00002
    percent: 40
    name: v2 
mikehelmick commented 6 years ago

Hi @kameshsampath - I think what you're suggesting is the same as what I'm suggesting, I just didn't fall it out with an example route object.

In the example Release mode that I posed the resulting Route object would look like this:

apiVersion: serving.knative.dev/v1alpha1
kind: Route
metadata:
  name: Counter 
spec:
  traffic:
  - revisionName: Counter.1
     percent: 50
     name: current
  - revisionName: Counter.4
     percent: 50
     name: next
  - configurationName: Counter
     percent: 0
     name: latest
status:
  ... 
mikehelmick commented 6 years ago

@sixolet - Please update normative examples to reflect this.

steren commented 6 years ago

We performed user studies with this proposal. One interesting suggestion from a participant was to use "candidate" instead of "next". I find the idea interesting, because indeed, we often talk about a "release candidate".

tliberman commented 6 years ago

@carolynknight87 and I conducted users studies with 7 developer participants who read through an explanation of release mode to better understand current, latest, next. Of those three concepts, current and latest were well understood and participants grasped them immediately. next was not as easily understood and two participants suggested “release candidate” would’ve been clearer.

User quote: “next threw me off to think it was different and would get ahead of latest