kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

discuss continuous delivery #871

Closed tomdavidson closed 6 years ago

tomdavidson commented 8 years ago

I want to deliver our k8s clusters with a pipeline. Each cluster will start by forking the starter repo. Pre-kops jobs will include a Terraform plan for our VPC config; post-kops jobs will install some k8s add-ons and Helm charts.

Can anyone share experiences with commit testing, acceptance testing, vendoring and upgrading in a pipeline context?
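For concreteness, the pre-kops / kops / post-kops sequence described above could be sketched as one job script. This is a hedged sketch, not a working pipeline: the `vpc/` Terraform directory, `cluster.yaml`, the `CLUSTER_NAME` variable, and the `addons/` and `charts/` paths are all hypothetical placeholders.

```sh
#!/bin/sh
set -e

# Pre-kops job: provision the VPC with Terraform (hypothetical vpc/ dir)
terraform -chdir=vpc init
terraform -chdir=vpc plan -out=vpc.tfplan
terraform -chdir=vpc apply vpc.tfplan

# kops job: reconcile the cluster against its versioned spec
# (hypothetical cluster.yaml kept in the repo)
kops replace -f cluster.yaml --force
kops update cluster --name "$CLUSTER_NAME" --yes

# Post-kops jobs: add-ons and Helm charts (hypothetical paths)
kubectl apply -f addons/
helm upgrade --install my-release charts/my-chart
```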

chrislovecnm commented 8 years ago

So this is an ongoing conversation in the K8s community.

  1. What are you using for CI?
  2. What are your reqs?

We have discussed a plugin with kops and Helm ... a post-install plugin that would allow for Helm installs.

tomdavidson commented 8 years ago

@chrislovecnm thanks for the reply. I don't think kops needs to do my Helm installs for me; that can be a post-kops job task. I'm looking at kops as owning "only" the k8s cluster and performing that scope perfectly.

what are you using for CI

I'm a fan of GitLab CI / Deploy, but I think we can keep this k8s- and kops-specific while staying CI/CD-tool generic. We also use CircleCI, Travis, CodeShip, and last week a team started using AWS CodeDeploy.

What are your reqs

As with a microservices architecture (MSA), a Kubernetes cluster has many moving parts: masters, nodes, schedulers, pods, ReplicaSets, services, labels, selectors, proxies, kubelets, cAdvisor monitoring containers, and so on. Just as in MSA, separation of concerns is a key design principle, but one that comes with a complexity cost - especially compared to ECS and Swarm. Essentially, I want to treat the k8s cluster like any other MSA app product, to reap the confidence and agility that come with continuous delivery.

I'm a newb to k8s, but it seems that kops goes leaps and bounds toward mitigating some of the complexity costs. I am interested in using kops in my pipelines but am open to other options - much less open to CloudFormation-based ones. The k8s cluster will be one component of a monorepo that also includes several Helm installs and a particular VPC config via Terraform. I intended the discussion to focus on the kops context but am open to feedback everywhere.

I typically use four pipelines:

  1. feature/branch
  2. Integration
  3. RC
  4. Release

Each pipeline has:

  1. Commit tests (unit, component, static code analysis, etc.).
  2. Test "build/deploy".
  3. Acceptance tests: API, end-to-end, load...
  4. Promote to the next pipeline.
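The four stages above map fairly directly onto a GitLab CI configuration. This is a hedged sketch under assumed conventions, not a working config: the `make` targets in each `script` are hypothetical placeholders for whatever actually runs the tests and deploys.

```yaml
# .gitlab-ci.yml sketch; job commands are placeholders
stages:
  - commit-tests
  - test-deploy
  - acceptance
  - promote

commit-tests:
  stage: commit-tests
  script:
    - make unit component lint    # unit, component, static analysis

test-deploy:
  stage: test-deploy
  script:
    - make deploy-test            # stand up the review environment

acceptance:
  stage: acceptance
  script:
    - make test-api test-e2e test-load

promote:
  stage: promote
  when: manual                    # gate into the next pipeline
  script:
    - make promote
```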

I would like to start a change in a git branch - change the kops config, maybe the instance type, CoreOS channel, etc.:

  1. How are updates handled?
  2. Deploy k8s to a review environment - what do acceptance tests of kops and k8s look like, end to end?
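On the "how are updates handled?" question, the usual kops flow (a sketch, assuming reasonably current kops flags) is: edit the spec, preview, apply, then roll nodes. Run without `--yes`, both `kops update cluster` and `kops rolling-update cluster` are dry runs that print what would change, which is exactly the kind of output a review job could surface.

```sh
# Hedged sketch of the kops change / preview / apply flow
kops edit ig nodes                  # e.g. change machineType or the OS image
kops update cluster                 # dry run: print pending cloud changes
kops update cluster --yes           # apply changes to cloud resources
kops rolling-update cluster         # dry run: list nodes needing replacement
kops rolling-update cluster --yes   # drain and replace nodes one at a time
```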

The change is committed to the master/integration branch, which ends up deployed in the long-lived/stage env where other staged apps are deployed.

  1. When will a rolling update not work and instead require a complete replacement, and thus redeployment of all apps running on k8s?

A successful integration pipeline results in a release candidate that is deployed to a canary env and finally to production:

  1. What k8s & kops specific strategies can be used for canary & green/blue deployments of the cluster itself? Can federation help out?
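One blue/green option for the cluster itself (a hedged sketch; the `blue.`/`green.` cluster names and the zone are hypothetical) is to stand up a second cluster with kops, redeploy and test against it, then cut traffic over at the DNS or load-balancer layer:

```sh
# Hypothetical green cluster alongside the existing blue one
kops create cluster --name green.k8s.example.com --zones us-east-1a --yes
kops validate cluster --name green.k8s.example.com

# ...redeploy apps to green, run acceptance tests against it,
# shift traffic at the DNS/LB layer, then retire blue:
kops delete cluster --name blue.k8s.example.com --yes
```

Federation could in principle front both clusters during the cutover, but that sits above what kops itself manages.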

chrislovecnm commented 7 years ago

/area addon-manager

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten /remove-lifecycle stale

fejta-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close