Automated releases

iwankgb commented 3 years ago

To increase the frequency of releases and let us fix monitoring bugs without affecting Kubernetes, we need an automated way of releasing code. A release must consist of:

- amd64 and aarch64 binaries
- amd64 and aarch64 OCI images
- some sort of release notes (I guess a list of PRs will be enough for the time being).
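
A minimal sketch of what one such release could enumerate, assuming hypothetical artifact names, a placeholder registry (`example.invalid`) and a placeholder version string. None of these come from cAdvisor's actual release process; only the two architectures and the "list of PRs" idea are taken from the list above.

```python
from dataclasses import dataclass

# Assumption: aarch64 is spelled "arm64" here, which is the name Go (GOARCH)
# and OCI image platforms use for that architecture.
ARCHES = ("amd64", "arm64")


@dataclass(frozen=True)
class Artifact:
    kind: str  # "binary" or "image"
    arch: str
    name: str


def plan_release(version, merged_prs):
    """Return (artifacts, release_notes) for one release.

    merged_prs is a list of (number, title) pairs. File names and the
    registry path below are hypothetical placeholders.
    """
    artifacts = []
    for arch in ARCHES:
        artifacts.append(Artifact("binary", arch, f"cadvisor-{version}-linux-{arch}"))
        artifacts.append(Artifact("image", arch, f"example.invalid/cadvisor:{version}-{arch}"))
    # Release notes are just the list of merged PRs, one per line.
    notes = "\n".join(f"- #{number}: {title}" for number, title in merged_prs)
    return artifacts, notes


if __name__ == "__main__":
    # Placeholder PR numbers and titles, for illustration only.
    artifacts, notes = plan_release("v0.0.0", [(1, "Fix a"), (2, "Fix b")])
    for artifact in artifacts:
        print(artifact.kind, artifact.arch, artifact.name)
    print(notes)
```

Keeping the plan as plain data means the same list could drive the build, the upload and the release notes steps, whatever tooling ends up doing the actual work.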

CC: @bobbypage

dims commented 3 years ago

Could we explore a CI job that would update k8s (locally) and run a bunch of things to get some confidence that we don't totally break k8s?

iwankgb commented 3 years ago

@dims sure, can you suggest what "bunch of things" could be?

dims commented 3 years ago

let's look here? https://cs.k8s.io/?q=cadvisor&i=nope&files=%5Etest%2Fe2e.*&excludeFiles=&repos=kubernetes/kubernetes

iwankgb commented 3 years ago

I had a chat with @bobbypage some time ago about cAdvisor affecting Kubernetes stability, and I suggested the following approach:

1. A release is prepared for a Kubernetes release, as usual.
2. Bugfixes affecting Kubernetes keep being ported to that release branch.
3. Once a bugfix that does not affect Kubernetes arrives, a new release is created so that the cAdvisor community can use it.
4. Any bugfixes affecting Kubernetes will still be ported to the original branch.
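
The routing rule in the steps above can be sketched roughly as follows. The line names are hypothetical, since no branch names are given here, and the sketch assumes that a fix ported to the Kubernetes branch also ends up in later community releases.

```python
# Hypothetical names for the two "active" release lines.
K8S_LINE = "kubernetes-line"
COMMUNITY_LINE = "community-line"


def release_lines_for(affects_kubernetes):
    """Return the release lines a bugfix should land in."""
    if affects_kubernetes:
        # Ported to the branch Kubernetes uses; assumed to reach the
        # community release line as well.
        return [K8S_LINE, COMMUNITY_LINE]
    # Fixes that do not affect Kubernetes only trigger a community release.
    return [COMMUNITY_LINE]


if __name__ == "__main__":
    print(release_lines_for(True))
    print(release_lines_for(False))
```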

With decent release automation:

- it should not take that much time to handle all these releases
- fixes to cAdvisor will not affect the stable version of Kubernetes
- necessary fixes will still hit Kubernetes

I don't like the idea of making cAdvisor builds dependent on Kubernetes tests because of the substantial number of flakes that we face there.

iwankgb commented 3 years ago

Excuse my mad photoshop skillz.

bobbypage commented 3 years ago

Thanks @iwankgb for putting together a sketch of the proposal. I agree, having a more automated and easier way to release will definitely help streamline the process, especially for cherry-picking changes to fix issues in an existing branch.

I think there are a few things here, so it's worth separating them:

1. Generally making releases more automated, replacing the current manual steps defined in https://github.com/google/cadvisor/blob/master/docs/development/releasing.md
2. Change of release cadence, i.e. changing the existing model of keeping cAdvisor releases in sync with k8s
3. aarch64 images / binaries

Overall, #1 and #3 above clearly will help, so no questions there :)

Regarding #2, as I understand it, the main change will be to the existing release schedule of a single release timed in sync with the k8s release. Instead we'll have two "active" releases, one used for k8s and one that can be used standalone, so that we can cherry-pick changes to the appropriate version as needed. I think that makes sense, especially with easier release automation as you mentioned, which should hopefully keep things straightforward.

Regarding automated k8s tests as @dims mentioned, I agree it would be nice to have, just for confidence that cAdvisor is not causing some obvious kubelet breakage... We currently do have the prow cAdvisor e2e test; perhaps something like the summary test (https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/summary_test.go) would be a good candidate, as most of those metrics originate from cAdvisor. I'm not clear on how we can easily hook something up to run that test (say on PRs or releases) though... any ideas?

iwankgb commented 3 years ago

Perhaps we could try to run these tests when we merge a PR to a branch that is used for a Kubernetes-focused release and as a part of any release process? I'm not sure how tricky it is to integrate the test @bobbypage mentioned into the pipeline. Do you think that we can:

- solve points 1 and 3 first;
- figure out a way of running some Kubernetes conformance test in a subsequent PR?

A release pipeline is relatively straightforward, but running K8s tests is something I will have to dig into.

bobbypage commented 3 years ago

That sounds like a plan. A k8s node test would be great to have, but it is a separate topic from automating the release process.

google / cadvisor

Automated releases #2834