kubernetes / cloud-provider

cloud-provider defines the shared interfaces which Kubernetes cloud providers implement. These interfaces allow various controllers to integrate with any cloud provider in a pluggable fashion. Also serves as an issue tracker for SIG Cloud Provider.
Apache License 2.0

Standardize the Cloud Controller Manager Build/Release Process #36

Open andrewsykim opened 5 years ago

andrewsykim commented 5 years ago

Right now each provider is building and releasing the external cloud controller manager in its own way. It might be beneficial to standardize this going forward, or at least set some guidelines on what is expected from a cloud controller manager build/release.

Some questions to consider:

We've had this discussion multiple times at KubeCons and on SIG calls; it would be great to get some of those ideas vocalized here and then formalized in a doc going forward.

cc @cheftako @jagosan @hogepodge @frapposelli @yastij @dims @justaugustus

NeilW commented 5 years ago

First thing to sort out is how to update the modules, so that Go module updates work correctly.

The standard main.go has dependencies on 'k8s.io/kubernetes' and 'k8s.io/component-base'.

'k8s.io/component-base' isn't semantically versioned properly, and fetching the main kubernetes module causes a load of version failures, because the 'replace' entries in its go.mod that redirect to the staging directories don't apply when the module is consumed externally.

NeilW commented 5 years ago
```
$ go get k8s.io/kubernetes@v1.15.0
go: finding k8s.io/apiextensions-apiserver v0.0.0
go: finding k8s.io/apiserver v0.0.0
go: finding k8s.io/kube-proxy v0.0.0
go: finding k8s.io/cloud-provider v0.0.0
go: finding k8s.io/kube-scheduler v0.0.0
go: finding k8s.io/cluster-bootstrap v0.0.0
go: finding k8s.io/csi-translation-lib v0.0.0
go: finding k8s.io/client-go v0.0.0
go: finding k8s.io/kubelet v0.0.0
go: finding k8s.io/sample-apiserver v0.0.0
go: k8s.io/csi-translation-lib@v0.0.0: unknown revision v0.0.0
go: k8s.io/cluster-bootstrap@v0.0.0: unknown revision v0.0.0
go: k8s.io/cloud-provider@v0.0.0: unknown revision v0.0.0
go: k8s.io/kubelet@v0.0.0: unknown revision v0.0.0
go: k8s.io/kube-scheduler@v0.0.0: unknown revision v0.0.0
go: k8s.io/kube-proxy@v0.0.0: unknown revision v0.0.0
go: k8s.io/apiserver@v0.0.0: unknown revision v0.0.0
go: k8s.io/apiextensions-apiserver@v0.0.0: unknown revision v0.0.0
go: k8s.io/sample-apiserver@v0.0.0: unknown revision v0.0.0
go: finding k8s.io/apimachinery v0.0.0
go: finding k8s.io/kube-controller-manager v0.0.0
go: finding k8s.io/kube-aggregator v0.0.0
go: finding k8s.io/metrics v0.0.0
go: k8s.io/client-go@v0.0.0: unknown revision v0.0.0
go: finding k8s.io/code-generator v0.0.0
go: finding k8s.io/cri-api v0.0.0
go: finding k8s.io/legacy-cloud-providers v0.0.0
go: finding k8s.io/component-base v0.0.0
go: finding k8s.io/cli-runtime v0.0.0
go: finding k8s.io/api v0.0.0
go: k8s.io/kube-controller-manager@v0.0.0: unknown revision v0.0.0
go: k8s.io/kube-aggregator@v0.0.0: unknown revision v0.0.0
go: k8s.io/legacy-cloud-providers@v0.0.0: unknown revision v0.0.0
go: k8s.io/code-generator@v0.0.0: unknown revision v0.0.0
go: k8s.io/metrics@v0.0.0: unknown revision v0.0.0
go: k8s.io/apimachinery@v0.0.0: unknown revision v0.0.0
go: k8s.io/cri-api@v0.0.0: unknown revision v0.0.0
go: k8s.io/component-base@v0.0.0: unknown revision v0.0.0
go: k8s.io/cli-runtime@v0.0.0: unknown revision v0.0.0
go: k8s.io/api@v0.0.0: unknown revision v0.0.0
go: error loading module requirements
```

andrewsykim commented 5 years ago

Thanks @NeilW! I agree that removing imports of k8s.io/kubernetes will help here. There were some discussions in the past about moving k8s.io/kubernetes/cmd/cloud-controller-manager to either k8s.io/cloud-provider/cmd/cloud-controller-manager or k8s.io/cloud-controller-manager. The tricky part is that all cloud-specific controllers would then also need to move to an external repo, since you can't import k8s.io/kubernetes from a staging repo. Would love your thoughts on what would be ideal for your provider. cc @timoreimann for feedback from DigitalOcean

re: k8s.io/component-base not being semantically versioned, can you open an issue in kubernetes/kubernetes for that?

NeilW commented 5 years ago

I've spent a day struggling with 1.15 and I still haven't managed to get the dependencies sorted out for the cloud-provider. It looks like I'll have to hand-write 'replace' entries for every repo in the 'staging' area of the kubernetes repo. So we definitely have a problem.
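A sketch of that manual pinning, assuming the staging repos' `kubernetes-1.15.0` mirror tags (this is illustrative, not NeilW's exact script, and the module list is truncated):

```
# The v0.0.0 placeholders only resolve inside the main kubernetes repo,
# so pin each staging module to the commit its mirror tagged for 1.15.0.
VERSION=kubernetes-1.15.0
for mod in k8s.io/api k8s.io/apimachinery k8s.io/apiserver k8s.io/client-go \
           k8s.io/cloud-provider k8s.io/component-base; do # ...and the rest
  # go.mod needs a concrete pseudo-version, so resolve the tag first.
  pseudo=$(go mod download -json "${mod}@${VERSION}" |
           sed -n 's|.*"Version": "\(.*\)".*|\1|p')
  go mod edit "-replace=${mod}=${mod}@${pseudo}"
done
go get k8s.io/kubernetes@v1.15.0
```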

However, that does open up a possibility for making cloud providers more standard. If you built a dummy provider that responded to end-to-end tests and was published in a standard way, but didn't actually do anything, then you could 'replace' that provider's interface repo path with the path to a provider repo that implements the same interface.

That would let you simply replicate the standard repo as, say, 'brightbox-cloud-provider' and just change the 'replace' entry in the go.mod to point to, say, 'brightbox/brightbox-cloud-provider-interface'. Then you follow the same automated integration-testing and deployment/publishing process as the standard dummy provider.

And on the interface repo that people like me maintain, we can run unit tests and manage the dependencies with our own go.mod, completely decoupled from the cloud-provider 'shell' the interface will be compiled into.
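A rough sketch of the go.mod such a replicated 'shell' repo might carry — every module path below is hypothetical, for illustration only:

```
// go.mod of a hypothetical replicated shell repo.
module k8s.io/brightbox-cloud-provider

go 1.12

// The interface module the dummy provider compiles against...
require k8s.io/cloud-provider-interface v0.0.0

// ...swapped at build time for the provider's real implementation.
replace k8s.io/cloud-provider-interface => github.com/brightbox/brightbox-cloud-provider-interface v1.0.0
```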

NeilW commented 5 years ago

In terms of a publishing process, the one I use with HashiCorp to publish our Terraform provider is a good one. I go on a Slack channel and ask them to roll a new release, and after a few manual checks the maintainer of the central repo holding the providers hits the go button on the automated release system.

Now, HashiCorp has staff managing that central provider repo (https://github.com/terraform-providers), and that may not work for k8s given the nature of the project. But it's something to consider.

timoreimann commented 5 years ago

I haven't upgraded DigitalOcean's CCM to 1.15 yet, but I do remember that moving to the 1.14 deps was quite a hassle. For instance, it required adding a replace directive for apimachinery, which wasn't obvious to spot.

I noticed that the latest client-go release, v12.0.0 (corresponding to Kubernetes 1.15, it seems), now encodes these replace directives in its go.mod file. My guess is that if cloud-provider followed the same pattern of accurately pinning down dependencies for each release through Go modules, consuming cloud-provider would become easier.
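For context, the pinning pattern being referenced looks roughly like this in client-go's published go.mod (abridged; the pseudo-versions shown are illustrative of the shape, not authoritative):

```
// Abridged, illustrative excerpt of the client-go pinning pattern;
// the real file lists exact pseudo-versions for each sibling module.
module k8s.io/client-go

require (
	k8s.io/api v0.0.0-20190620084959-7cf5895f2711
	k8s.io/apimachinery v0.0.0-20190612205821-1799e75a0719
)

replace (
	k8s.io/api => k8s.io/api v0.0.0-20190620084959-7cf5895f2711
	k8s.io/apimachinery => k8s.io/apimachinery v0.0.0-20190612205821-1799e75a0719
)
```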

@NeilW's idea of providing a dummy provider is interesting, though I'm not sure I've fully grasped how it would be consumed. In general, I'd definitely appreciate a sample provider that describes the canonical way of setting up a custom cloud provider. Last time I went over some of the available implementations from the different clouds, they all had slight variations. That could easily be because their development cycles can't possibly be perfectly synchronized; or maybe there are legitimate reasons for the divergent setups?

It'd be great to have a "source of truth" that outlines one or more recommended setups (similar to client-go's sample directory).

timoreimann commented 5 years ago

@andrewsykim

> There were some discussions in the past to move k8s.io/kubernetes/cmd/cloud-controller-manager to either k8s.io/cloud-provider/cmd/cloud-controller-manager or k8s.io/cloud-controller-manager.

I'm all in favor of removing any dependencies on k8s.io/kubernetes that currently remain in cloud-provider, since those tend to pull in a fair number of transitive packages (which are presumably not all required?).

What's the benefit of moving the cloud provider command part into a new, separate repository? My gut feeling is that it would be easier to reuse the existing k8s.io/cloud-provider repository we have today. Is there any prior discussion available to gain more context around the various pros and cons?

NeilW commented 5 years ago

> @NeilW's idea of providing a dummy provider is interesting, though I'm not sure I fully grasped yet how that'd be consumed.

Less that we would consume cloud-provider and more that it would consume us.

  1. Copy cloud-provider to a new repo, digitalocean-cloud-provider, within a k8s organisation that holds and publishes the cloud providers.
  2. Alter the go.mod and add a replace that says k8s.io/cloud-provider-interface => github.com/digitalocean/digitalocean-cloud-provider-interface vX.Y.Z.
  3. Run the release process on that repo, which compiles, builds and tests the cloud-provider, then publishes the container somewhere central.

We then just build our provider interface libraries against the published Go interface.
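To make the division of labour concrete, here is a minimal sketch of what such an interface library boils down to with the 1.15-era k8s.io/cloud-provider API (the provider name and the stubbed-out feature set are illustrative):

```go
package provider

import (
	"io"

	cloudprovider "k8s.io/cloud-provider"
)

// providerName is illustrative; real providers register their own name.
const providerName = "example"

// cloud is a stub satisfying cloudprovider.Interface; every optional
// feature reports itself as unsupported.
type cloud struct{}

func init() {
	// Registration lets the cloud-controller-manager shell select this
	// implementation via --cloud-provider=example.
	cloudprovider.RegisterCloudProvider(providerName, func(config io.Reader) (cloudprovider.Interface, error) {
		return &cloud{}, nil
	})
}

func (c *cloud) Initialize(clientBuilder cloudprovider.ControllerClientBuilder, stop <-chan struct{}) {
}
func (c *cloud) LoadBalancer() (cloudprovider.LoadBalancer, bool) { return nil, false }
func (c *cloud) Instances() (cloudprovider.Instances, bool)       { return nil, false }
func (c *cloud) Zones() (cloudprovider.Zones, bool)               { return nil, false }
func (c *cloud) Clusters() (cloudprovider.Clusters, bool)         { return nil, false }
func (c *cloud) Routes() (cloudprovider.Routes, bool)             { return nil, false }
func (c *cloud) ProviderName() string                             { return providerName }
func (c *cloud) HasClusterID() bool                               { return false }
```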

NeilW commented 5 years ago

In terms of updating to 1.15

Hope that saves somebody a lot of time.

andrewsykim commented 5 years ago

/assign @yastij

andrewsykim commented 5 years ago

For v1.16: consensus on what the build/release process for CCM should look like.

yastij commented 5 years ago

A couple of things:

Also, I think we should start publishing binaries stripped of the in-tree cloud providers; this would help drive adoption. cc @kubernetes/release-engineering
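For what it's worth, one existing mechanism pointing in that direction is the `providerless` build tag in kubernetes/kubernetes, which compiles out the legacy in-tree providers — a sketch, noting that tag support depends on the Kubernetes version being built:

```
# Build a controller-manager binary without the legacy in-tree cloud
# providers by enabling the `providerless` build tag.
make WHAT=cmd/kube-controller-manager GOFLAGS="-tags=providerless"
```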

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

cheftako commented 4 years ago

/remove-lifecycle stale

cheftako commented 4 years ago

/lifecycle frozen

andrewsykim commented 4 years ago

/help

k8s-ci-robot commented 4 years ago

@andrewsykim: This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes/cloud-provider/issues/36):

> /help

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

andrewsykim commented 4 years ago

@cheftako to put together short proposal for v1.19