kubernetes / community

Kubernetes community content
Apache License 2.0

Refresh our approach to conformance testing and create a distribution certification program #432

Closed bgrant0607 closed 7 years ago

bgrant0607 commented 7 years ago

Forked from the system layers draft.

There are more than 30 Kubernetes distributions and services today. The portability of applications and tools across distributions is crucial to the health of the Kubernetes ecosystem. It is therefore desirable to foster maximal portability and to discourage fragmentation.

At least one component of such a strategy could be a certification program that would need to be followed in order to use the Kubernetes trademark. We may also want to reinforce other desirable behaviors, such as:

Programs from other OSS communities, for possible inspiration:

Our current cluster conformance tests are described here: https://github.com/kubernetes/community/blob/master/contributors/devel/e2e-tests.md#conformance-tests

The current set of tests was the result of an effort to exclude non-portable tests more than to ensure we had adequate/appropriate coverage of the API and functionality surface to ensure a Kubernetes distribution or service conformed.

Our current node "conformance" (validation) tests are described here: https://kubernetes.io/docs/admin/node-conformance/

These were designed to verify that a node was ready to join a cluster rather than to ensure full coverage of Kubelet functionality.

Somewhat independent of distribution conformance, it would be useful to have more such component-level validation tests, especially for all pluggable components, such as the pod network, scheduler, ingress controller, service proxy, and state store (etcd).

There are a number of questions that we need to answer, such as how to deal with optional features and components. Clearly, required features need to be tested, but it would also be useful to verify correct operation of all features that are present and enabled, using available feature discovery mechanisms.
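As a rough illustration of the "test what is present and enabled" idea, here is a minimal sketch. Everything in it is hypothetical: the `ConformanceTest` type, the feature names, and the `select_tests` helper are made up for illustration and do not correspond to any existing Kubernetes test tooling. It assumes each test declares the optional features it depends on, and that some discovery mechanism has already produced the set of features a cluster exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformanceTest:
    """Hypothetical test descriptor (not a real Kubernetes type)."""
    name: str
    # Optional features this test depends on. An empty set means the test
    # covers required functionality and must always run.
    requires: frozenset = frozenset()


def select_tests(tests, discovered_features):
    """Split tests into those to run and those to skip.

    A test runs when every feature it requires was discovered on the
    cluster. Tests with no requirements always run.
    """
    discovered = set(discovered_features)
    to_run, skipped = [], []
    for test in tests:
        if test.requires <= discovered:
            to_run.append(test)
        else:
            skipped.append(test)
    return to_run, skipped


# Illustrative data only; the feature names are invented.
TESTS = [
    ConformanceTest("pods-basic-lifecycle"),
    ConformanceTest("ingress-routing", frozenset({"ingress"})),
    ConformanceTest("network-policy-deny", frozenset({"network-policy"})),
]

to_run, skipped = select_tests(TESTS, {"ingress"})
print([t.name for t in to_run])
print([t.name for t in skipped])
```

Under this sketch, required tests can never be skipped, while optional features are verified only where a distribution actually ships them. How features would be discovered is left open here, as it is in the question above.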

We likely need to define multiple distribution profiles, such as:

cc @thockin @smarterclayton @WilliamDenniss

bgrant0607 commented 7 years ago

Copying a comment from @justinsb:

IMO we cannot require upstreaming, but we should make it the optimal behaviour.

In particular: The pace of progress in core must continue to outpace the forks, and we should ensure that forks do not block core. So when people are objecting to a PR because it makes maintaining their non-upstreamed code harder, we should probably say "this is why you probably don't want to maintain a fork".

bgrant0607 commented 7 years ago

cc @timothysc

timothysc commented 7 years ago

Building out e2e "Conformance" tests to be a fully featured checklist seems like the most tenable path towards this goal, and focuses on client<>api<>behavior vs. the details under the hood, which change from vendor to vendor.

Ideally we could have a shared CNCF testing infrastructure that provides an incentive for vendors to hook in their distribution to get "approved" for a release.

/cc @jbeda

bgrant0607 commented 7 years ago

Another possible strategy for non-pluggable components, which some other projects use, is to mandate usage of upstream code somehow.

WilliamDenniss commented 7 years ago

Certification is a great way to promote interop. It would be fantastic if moving workloads between certified distributions was predictable and effortless. I like the idea of an image mark to denote compliance, and shared CNCF testing infrastructure to validate conformance.

It would be great if pluggable components were included in certification. It's possible to have very popular components that a lot of developers rely on, and those impact portability.

Another example of a successful community-run certification program: OpenID Connect certification. This program consists of foundation-hosted test infrastructure open to all to use, and uses a self-certification process with test output as evidence.
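To make "test output as evidence" a little more concrete, here is a minimal sketch of how submitted results could be checked mechanically. It assumes, purely for illustration, that the evidence is a JUnit-style XML report; the sample report, the test names, and the `summarize_junit` and `is_conformant` helpers are all hypothetical and not part of any existing certification tooling.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Group test case names from a JUnit-style report by outcome."""
    root = ET.fromstring(xml_text)
    summary = {"passed": [], "failed": [], "skipped": []}
    # iter() also matches nested elements, so this works for both a
    # <testsuite> root and a <testsuites> wrapper.
    for case in root.iter("testcase"):
        name = case.get("name", "")
        if case.find("failure") is not None or case.find("error") is not None:
            summary["failed"].append(name)
        elif case.find("skipped") is not None:
            summary["skipped"].append(name)
        else:
            summary["passed"].append(name)
    return summary


def is_conformant(summary):
    """Assumed rule: at least one test passed and none failed."""
    return bool(summary["passed"]) and not summary["failed"]


# Invented sample report for illustration.
SAMPLE_REPORT = """
<testsuite name="conformance">
  <testcase name="pods-basic-lifecycle"/>
  <testcase name="services-cluster-ip"/>
  <testcase name="ingress-routing"><skipped/></testcase>
</testsuite>
"""

summary = summarize_junit(SAMPLE_REPORT)
print(summary)
print("conformant:", is_conformant(summary))
```

The pass rule is deliberately simplistic. A real program would also need to decide which skipped tests are acceptable and how to tie a report to a specific distribution and release.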

bgrant0607 commented 7 years ago

cc @philips

bgrant0607 commented 7 years ago

cc @dankohn

dankohn commented 7 years ago

CNCF stands ready to kick off a working group to organize a conformance and labeling program.

bgrant0607 commented 7 years ago

Some examples of Kubernetes distributions, services, appliances, etc., in no particular order:

bgrant0607 commented 7 years ago

cc @erictune

smarterclayton commented 7 years ago

In practice (at least with regard to forking), Kubernetes moves glacially slowly w.r.t. features and thorny issues, which incentivizes forkers to create more forks. So if kube actually moved faster (or split components out), more of the need for forks would diminish.

On Mon, Mar 6, 2017 at 7:23 PM, Brian Grant notifications@github.com wrote:

cc @erictune https://github.com/erictune


timothysc commented 7 years ago

@dankohn I believe this must belong to the CNCF, to support the shared testing infrastructure that enables both tests and reporting.

luxas commented 7 years ago

/cc

timothysc commented 7 years ago

@kubernetes/sig-testing-feature-requests

timothysc commented 7 years ago

I'll be deep in a multi-pronged cleaning of the e2es and will try to rationalize a doc with the testing SIGs and communicate that back over the course of 1.7. Before we even pontificate on what the meaning of "is" ...is ;-) we need to clean up what is there.

josephjacks commented 7 years ago

@bgrant0607 I have merged your list above with mine in a living sheet along with some more dimensional data that might prove useful. Happy to work with @dankohn in the new WG on next steps. The list is now up to 45+ and growing. Some pruning and corrections might be needed, but I think it is a decent start. https://docs.google.com/spreadsheets/d/1LxSqBzjOxfGx3cmtZ4EbB_BGCxT_wlxW_xgHVVa23es

dankohn commented 7 years ago

Thanks, JJ, very helpful. I bless your spreadsheet as the official one as long as you promise to keep it up to date. Cc @WilliamDenniss

josephjacks commented 7 years ago

@dankohn happy to, with support from the community, which has been great so far, mostly thanks to Twitter. :)

bgrant0607 commented 7 years ago

Initial certification program has launched!

WilliamDenniss commented 7 years ago

ICYMI: https://www.cncf.io/announcement/2017/11/13/cloud-native-computing-foundation-launches-certified-kubernetes-program-32-conformant-distributions-platforms/