gravitational / planet

Installable Kubernetes delivered in containers
Apache License 2.0

Add publishing to Drone CI #813

Closed wadells closed 3 years ago

wadells commented 3 years ago

Summary

This changeset adds two further drone pipelines covering our two planet publishing scenarios.

Replacing https://jenkins.gravitational.io/job/Planet-dev-deploy/, the "trigger: event: custom" pipeline performs a dev deploy to s3 when manually invoked with a command like the following:

drone build create gravitational/planet --branch walt/drone-publish

There is no way to trigger a dev deploy via the Drone UI.

Replacing https://jenkins.gravitational.io/job/Planet-Deploy-Artifacts/, the "trigger: event: tag" pipeline will publish to both s3 and quay every time a tag is pushed.

Again, there is no way to trigger a full deploy via the Drone UI. It "just happens" when a tag is pushed.
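For readers unfamiliar with Drone, the two trigger types described above could be sketched roughly as follows. This is an illustrative fragment, not the contents of this PR's .drone.yml: the pipeline names, the `type`, the image, and the make targets are all assumptions.

```yaml
# Sketch only: names, image, and make targets are hypothetical.
---
kind: pipeline
type: docker            # assumption; the real pipelines may use another runner type
name: publish-dev

trigger:
  event:
    - custom            # builds started manually, e.g. via `drone build create`

steps:
  - name: publish-s3
    image: golang:1.15  # placeholder image
    commands:
      - make publish-dev      # hypothetical target: dev deploy to s3 only

---
kind: pipeline
type: docker
name: publish-release

trigger:
  event:
    - tag               # runs every time a tag is pushed

steps:
  - name: publish-s3-and-quay
    image: golang:1.15  # placeholder image
    commands:
      - make publish-release  # hypothetical target: publish to s3 and quay
```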

Testing Done

I pushed the 7.1.22-11709-test tag to test the publishing job. Find the build logs and artifacts below:

https://drone.gravitational.io/gravitational/planet/7
https://quay.io/repository/gravitational/planet?tab=tags
https://s3.console.aws.amazon.com/s3/buckets/clientbuilds.gravitational.io?region=us-east-1&prefix=planet/7.1.22-11709-test/&showversions=false

I used the following drone command to test the dev publishing:

drone build create gravitational/planet --branch walt/drone-publish 

Resulting in:

https://drone.gravitational.io/gravitational/planet/10
https://s3.console.aws.amazon.com/s3/buckets/clientbuilds.gravitational.io?region=us-east-1&prefix=planet/7.1.22-11709-4-g591e4fd1/&showversions=false

Notes

The AWS credentials are limited to s3 read/write access on the clientbuilds.gravitational.io/planet bucket/path. They are managed in code and were provisioned in https://github.com/gravitational/cloud-terraform/pull/109. The quay credentials are artisanally hand-provisioned, and can be found under gravitational+drone_planet_publish here:

https://quay.io/organization/gravitational?tab=robots

TODO

knisbet commented 3 years ago

@wadells Feel free to add any requirements about integrations with a registry to https://github.com/gravitational/teleport/issues/5225

wadells commented 3 years ago

@lenko-d

I noticed that there are 2 pipelines for 2 environments that are essentially the same source code. This introduces duplication.

The duplication is worse than this, as both of the publish pipelines are 70% duplicated from the pr pipeline.

Is it going to be possible to parametrize the source code and make it accept an ENV parameter? Maybe create a common pipeline that gets invoked by other pipelines for each ENV? Or some other Drone-specific way to reduce duplication.

Yes. I mocked up a couple (untested) implementations:

https://github.com/gravitational/planet/blob/walt/drone-publish-env/.drone.yml#L109
https://github.com/gravitational/planet/blob/walt/drone-publish-when/.drone.yml#L93-L95

Take a look and see if either appeals to you. There are a couple other options that I didn't mock up because they require storing temporary artifacts in stable storage (s3).

I didn't chase either of these because they likely involve storing development artifacts for a wide set of builds, and thus having some out-of-band cleanup process or a much more rapidly growing set of builds to be stored indefinitely. Honestly, promotions seem like the most correct way to implement it, but I didn't want to scope creep into reworking our artifact storage.

Lastly, we could use Starlark to deduplicate everything (by implementing build configs in a Python dialect). Although I'd personally find this satisfying to learn, I really don't like it from an approachability perspective. Using Starlark would make it much more difficult for folks to contribute to and maintain the CI configs.

I'm trying to stay focused on the path to turn jenkins off.
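As a rough illustration of the `when`-based deduplication idea mentioned above, a single pipeline can gate individual steps on the triggering event. This sketch is not copied from the linked branches; the pipeline name, step names, image, and make targets are assumptions.

```yaml
# Sketch only: one pipeline, with publish steps gated by event.
kind: pipeline
type: docker            # assumption
name: build-and-publish

trigger:
  event:
    - pull_request
    - custom
    - tag

steps:
  - name: build
    image: golang:1.15  # placeholder image
    commands:
      - make build      # hypothetical target, runs for every event above

  - name: publish-s3
    image: golang:1.15
    commands:
      - make publish-s3   # hypothetical target
    when:
      event:
        - custom        # dev publish
        - tag           # release publish

  - name: publish-quay
    image: golang:1.15
    commands:
      - make publish-quay # hypothetical target
    when:
      event:
        - tag           # release publish only
```

The tradeoff is the one discussed in this thread: less repetition, but more branching logic living in the CI config.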

wadells commented 3 years ago

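Philosophically: I don't like putting complexity or significant logic into our CI configuration. I'm of the opinion that CI configs should be as minimal as possible, and primarily wrap nuance implemented by the local build (Make). This has the following advantages:

* Reproducing CI/publishing failures locally is a relatively simple task: run the failing CI target from your workstation.

* Learning one tool (make, mage, et al) allows you to understand the important parts of the build for many projects.

* CI systems become relatively pluggable. E.g. the migration from jenkins -> drone isn't nuanced. If we put more logic into Jenkins/Drone, we make it harder to move to the next system.

With this in mind, I personally like the "dumb as rocks" repetition of logic for each pipeline. Yes, it is verbose and not at all DRY, but it is also accessible, explicit, and KISS.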

knisbet commented 3 years ago

Philosophically: I don't like putting complexity or significant logic into our CI configuration. I'm of the opinion that CI configs should be as minimal as possible, and primarily wrap nuance implemented by the local build (Make). This has the following advantages:

* Reproducing CI/publishing failures locally is a relatively simple task: run the failing CI target from your workstation.

* Learning one tool (make, mage, et al) allows you to understand the important parts of the build for many projects.

* CI systems become relatively pluggable.  E.g. the migration from jenkins -> drone isn't nuanced.  If we put more logic into Jenkins/Drone, we make it harder to move to the next system.

With this in mind, I personally like the "dumb as rocks" repetition of logic for each pipeline. Yes, it is verbose and not at all DRY, but it is also accessible, explicit, and KISS.

I agree, although living with it may be a different story. I'm currently a huge fan of this approach, given my sort-of-fizzled push for mage. I see CI/CD as a system for running actions based on git events, mapping some secrets, and storing the output, but otherwise I really like the idea of the build code itself living in Go. So in my ideal world, the drone pipeline is really just one thing, like go run mage.go target, with the required secrets and capturing/storing the output.

But that's my 2 cents, I certainly haven't lived with this type of setup in the long run.
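A minimal sketch of that "one thing" pipeline might look like the following. The secret names, image, and mage target are hypothetical; only `from_secret` and the overall step layout are standard Drone syntax.

```yaml
# Sketch only: a pipeline that does nothing but map secrets and run one target.
kind: pipeline
type: docker            # assumption
name: publish

trigger:
  event:
    - tag

steps:
  - name: publish
    image: golang:1.15  # placeholder image
    environment:
      AWS_ACCESS_KEY_ID:
        from_secret: aws_access_key_id       # hypothetical secret name
      AWS_SECRET_ACCESS_KEY:
        from_secret: aws_secret_access_key   # hypothetical secret name
    commands:
      - go run mage.go publish               # hypothetical mage target
```

All build and publish logic would then live in the magefile, so the same target can be run from a workstation with the same environment variables set.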

lenko-d commented 3 years ago

* using drone [promotions](https://docs.drone.io/promote/) to "upgrade" a build to a higher level -- more or less exactly intended for this dev/prod workflow.

Promotions look very good to me. I would probably use that if it is easy to implement.

wadells commented 3 years ago

Promotions look very good to me. I would probably use that if it is easy to implement.

The UX problem I was working through with promotions is:

We need an initial build to promote. The ways I've brainstormed for getting this are:

1) Opening a bogus PR to trigger the pr build config. This is undesirable as it is an abuse of PRs -- a mainstay of our development culture. Out of the question IMO.

2) Trigger a build manually with drone build create. This is the most targeted solution, but results in each publish requiring at least 2 invocations of the drone cli. Not as nice as the one job jenkins workflow we have currently. To make the experience nicer, this workflow could be wrapped in a make target, creating a CI ouroboros. Make calls drone calls make.

3) We build every commit pushed to the main repo. This is nice verification, but results in a ton more builds than necessary -- especially with our culture of using a centralized repo for all sharing. Planet has 81 branches, contrasted with 5 open PRs and 4 release branches. Systemically, we only care about building commits tracked for merge. The planet build is heavy enough that if we saw higher traffic this would be a notable waste of resources. It'd definitely be out of the question for something as expensive as the gravity build.

After this initial build is created, the promote workflow has users run a 2nd drone build promote to dev or prod. Again, here we either rebuild everything, or we need a non-drone cache of artifacts. Rebuilding is wasteful of resources, and caching artifacts is non-trivial scope creep.
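For context, a promotion-triggered pipeline in Drone is selected by the `promote` event plus a target environment. The fragment below is a sketch of what the prod side could look like; the pipeline name, image, and make target are assumptions, and `prod` is simply the environment name used in this discussion.

```yaml
# Sketch only: runs when an existing build is promoted to "prod".
kind: pipeline
type: docker            # assumption
name: publish-prod

trigger:
  event:
    - promote
  target:
    - prod              # matches the environment argument of `drone build promote`

steps:
  - name: publish
    image: golang:1.15  # placeholder image
    commands:
      # Without an artifact cache, this step has to rebuild before publishing.
      - make publish-release  # hypothetical target
```

A second, near-identical pipeline with `target: dev` would be needed for dev publishing, which is why promotion does little for the duplication concern.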

There is also something a bit weird with tagging. Production builds are expected to have a tag (though the automation doesn't require it) whereas dev builds may or may not. The tag may be created after the initial drone build create, but before the drone build promote. Using a promotion-based workflow makes the "publish a release" step more complex. It goes from

push a tag

to

push a tag
drone build create
wait for build to finish?
drone build promote

We could have the deploy step create and push a tag, but my gut doesn't like giving drone push access to our repo -- I'd prefer to keep that limited to humans.

The other thing is, in terms of your initial duplicated config concern: promotion is largely a wash. We'd still see duplicated pipelines, or branching using env/when.

In short, it is easy to implement a worse UX with a promotion workflow. It is possible but hard to implement

Honestly, if we could drop the dev publish entirely (and have folks run that using their creds from their workstation), the CI problem would go away. We'd only have one publish, and it would be triggered on tag push. This raises the bar to entry though, and makes contributing to planet less friendly than it already is. Probably not a good solution.

wadells commented 3 years ago

Considering https://github.com/gravitational/planet/pull/813#issuecomment-756221619 and https://github.com/gravitational/planet/pull/813#issuecomment-755754037, I've examined several alternatives, and--to me--none are obviously better than what is implemented in this PR. Each alternative has tradeoffs in timeline, maintenance or UX.

With that in mind, I'd like to merge the current implementation to start getting folks used to working with Drone. I propose we revisit this discussion in a couple weeks once I've had a chance to see what Gravity CI/release looks like and determine the appropriate patterns for that use case. Overall, I'd like Planet CI workflows to be consistent with Gravity -- but I don't entirely know what Gravity will look like until I port it. I'm still building up experience with Drone.

What say you @lenko-d?