Having a centralized place for those pipeline definitions sounds great, we avoid duplication and simplify CI management.
I guess that in the future, the test scenarios may become more complex, so this repo may hold other files as well (e.g. other job definition files, scripts, simple tools, etc).
Renamed the proposed name to rust-vmm-ci as I was thinking of using it for other common tests as well. For example, we can have test_coverage.py there so other projects can reuse it.
Additional proposal: we can have one pipeline for each platform and operating system we want to support. And let me tell you why:
With our current configuration of crates, platforms and operating systems we would end up having 3 pipelines:
Some repos might need different pipelines, perhaps to prepare something: https://github.com/rust-vmm/linux-loader/pull/2#discussion_r282446524
For this specific example it might make sense to just have a ci-prepare.sh in that repository and then have the common script run [ -e ci-prepare.sh ] && ./ci-prepare.sh
@rbradford you can have custom hooks executed before running a command or before checking out the repository code. I think the prepare.sh script should be called in the hook and not part of the pipeline step so that the steps can be independent of the repository being tested. What do you say?
If you want to learn more about hooks you can check the Buildkite hooks documentation.
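A minimal sketch of how this could look, assuming the per-repository hooks discussed further down in this thread (untested; ci-prepare.sh is the example name from above):

#!/bin/bash
# .buildkite/hooks/pre-command -- repository hook, runs before each command step.
set -euo pipefail

# Run an optional, repository-specific preparation script only if the repo provides one.
if [ -x ./ci-prepare.sh ]; then
    ./ci-prepare.sh
fi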
This is a summary of the discussion points from the email thread. @sboeuf @petrutlucian94 please correct me if there is anything I misunderstood.
There are currently two proposals for sharing the common pipeline:
* git submodule: the CI repository is added as a submodule in each crate repository, pinned to a specific version.
* script that pulls the latest CI. @sboeuf : "You could have a dedicated "ci" directory (for every crate) in which you could have the same script that pulls the content of the "ci" repo before to call into one pipeline or another (that's the part specific to each crate)."
@sboeuf I don't see a big advantage here with using this instead of regular git submodule. I like having a fixed version for the pipelines because this implies you are only updating to a new version of the CI intentionally. If you are in the middle of a release or in the middle of a bug fix you don't want the CI to break for something that is unrelated to what you are doing.
I don't necessarily want to use submodules either, it just felt like the most natural way of consuming the CI.
Webhooks for the CI
Also two proposals:
* one webhook per repository
* one webhook/pipeline
@petrutlucian94 the steps can run in parallel without having multiple webhooks. So the only thing that would be serial here is the download of the pipelines, which should be sub-second. For simplicity I would keep it as one webhook/repository until we have a good reason to complicate ourselves with multiple webhooks. From my understanding, having multiple webhooks also means having multiple Buildkite projects because one project supports only one webhook. By project I mean kvm-ioctl-ci, which is where we define the pipeline.
Any other reason for having multiple webhooks other than speed of CI?
Parallelizing the Jobs
@petrutlucian94 "each buildkite agent runs one step/job at a time, so I think we should run multiple agents per host (e.g. by passing “—spawn
”). One idea would be to just run one agent per host cpu core."
Right now the load is very low. Once we add more repositories and the load increases we can scale up the number of agents. I wouldn't necessarily have one agent per core because it's a waste of resources; the extra agents aren't actually going to do anything most of the time. Maybe we can use auto-scaling for this particular issue.
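For reference, a minimal sketch of what spawning several agents on one host looks like with the agent CLI (the agent count and the token value are placeholders):

# Start 4 agent processes on this host instead of one.
buildkite-agent start --spawn 4 --token "AGENT_TOKEN"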
By project I mean kvm-ioctl-ci, which is where we define the pipeline.
Probably we got the terms mixed up. So kvm-ioctl-ci itself is a buildkite pipeline, by splitting it into multiple pipelines I thought we'd end up with something like kvm-ioctl-ci.arm_linux, kvm-ioctl-ci.x86_linux, etc, each requiring its own webhook.
So the only thing that would be serial here is the download of the pipelines which should be sub-second.
From what I can tell, those build steps wait for agents to become available. Each agent runs one step at a time and there's a single agent per platform defined at the moment. For example, as we can see here https://buildkite.com/rust-vmm/kvm-ioctls-ci/builds/198, "check-warnings-arm" ran in 20s and waited 1m 29s. It's really not that much, but if we get projects with time-consuming tests (e.g. that may take more than 10-15 minutes) and multiple simultaneous builds (maybe for different crates), this becomes a bottleneck, while the CI host remains underutilized.
I did a sandbox Windows CI for vm-memory and the test duration reduced from 6m31s to 2m18s after simply running more instances of the same agent on the same host (I have just a 6 core vm atm): https://buildkite.com/lpetrut/vm-memory-ci/builds/54 https://buildkite.com/lpetrut/vm-memory-ci/builds/53.
I wouldn't necessarily have one agent per core because it's a waste of resources; the extra agents aren't actually going to do anything most of the time.
If there are no pending tasks, those agents will do nothing (other than running a heartbeat), consuming a negligible amount of resources. Of course, one agent per core would be too much when you have 32 logical cores though :). Anyway, that was just a suggestion.
@rbradford you can have custom hooks executed before running a command or before checking out the repository code. I think the prepare.sh script should be called in the hook and not part of the pipeline step so that the steps can be independent of the repository being tested. What do you say?
And that's still going to work if everything is centralised into a common repo? If so, sounds good!
@andreeaflorescu
I like having a fixed version for the pipelines because this implies you are only updating to a new version of the CI intentionally. If you are in the middle of a release or in the middle of a bug fix you don't want the CI to break for something that is unrelated to what you are doing.
Fair enough, let's use git submodules then!
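As an illustration of that pinning behaviour (the repository URL is the one created later in this thread; the branch name is an assumption):

# Add the shared CI as a submodule pinned to a specific commit.
git submodule add https://github.com/rust-vmm/rust-vmm-ci.git rust-vmm-ci
git commit -m "Add rust-vmm-ci submodule"

# Updating to a newer CI version is an explicit, reviewable change.
git -C rust-vmm-ci fetch origin
git -C rust-vmm-ci checkout origin/master
git add rust-vmm-ci
git commit -m "Update rust-vmm-ci"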
@rbradford the hooks are on the machines where the buildkite-agent is running. So yes. In the hook you have information about the repository being tested, so you can have commands that are executed only for some of them.
Probably we got the terms mixed up. So kvm-ioctl-ci itself is a buildkite pipeline, by splitting it into multiple pipelines I thought we'd end up with something like kvm-ioctl-ci.arm_linux, kvm-ioctl-ci.x86_linux, etc, each requiring its own webhook.
@petrutlucian94 I am not sure if this is how it works and I will have to check. My proposal would be to have one pipeline/project with one associated webhook. This is what kvm-ioctls-ci is today. Then, in kvm-ioctls we would have 2 steps:
* one step that uploads the x86 Linux pipeline definition
* one step that uploads the arm Linux pipeline definition
The two pipeline definitions would have the tests specific for each platform, but they will both be triggered by the same webhook. We can add multiple agents so the tests are run in parallel. Does that make sense?
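A hypothetical sketch of what those 2 steps could look like (the file names and labels are made up for illustration):

steps:
  - label: "upload x86 Linux pipeline"
    command: buildkite-agent pipeline upload .buildkite/pipeline.x86_linux.yml
  - label: "upload arm Linux pipeline"
    command: buildkite-agent pipeline upload .buildkite/pipeline.arm_linux.yml

Each uploaded pipeline would then pin its own steps to the matching machines through agents tags, so adding more agents would be enough to run the platforms in parallel.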
From what I can tell, those build steps wait for agents to become available. Each agent runs one step at a time and there's a single agent per platform defined at the moment. For example, as we can see here https://buildkite.com/rust-vmm/kvm-ioctls-ci/builds/198, "check-warnings-arm" ran in 20s and waited 1m 29s. It's really not that much, but if we get projects with time-consuming tests (e.g. that may take more than 10-15 minutes) and multiple simultaneous builds (maybe for different crates), this becomes a bottleneck, while the CI host remains underutilized.
Yap, I agree. Once we have multiple repositories and those repositories have multiple PRs, we can add more agents.
@rbradford the hooks are on the machines where the buildkite-agent is running.
AFAIU that's for the agent hooks. There are also repository hooks, scripts that live in the .buildkite/hooks/ directory; I think this is what we'd want in that case.
@sameo yap, we can have per repository hooks. I missed that one!
For future reference: https://buildkite.com/docs/agent/v3/hooks#repository-hooks
@andreeaflorescu Right, but how would we do that if each repo's .buildkite is a git submodule?
@andreeaflorescu Let's go and create the repo anyway. We'll find the best way to use it across repos once it's there and filled.
@sameo as discussed offline, I would have rust-vmm-ci as a git submodule in every repository. The rust-vmm-ci directory will have all the things that are shareable by all repositories, like a common-pipeline.yml and the coverage test. To run the common pipeline as part of the CI, you would need to add a step like:
buildkite-agent pipeline upload rust-vmm-ci/common-pipeline.yml
Then each repository will have a .buildkite folder which contains custom tests in custom-pipeline.yml and a .buildkite/hooks/ directory which contains the custom per-repository hooks. To run the custom pipeline, you need to add an additional step to the buildkite CI:
buildkite-agent pipeline upload .buildkite/custom-pipeline.yml
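Putting the two upload steps together, the small bootstrap pipeline configured for each crate (in the web UI or in a top-level pipeline file) might look like this untested sketch:

steps:
  - label: "upload common pipeline"
    command: buildkite-agent pipeline upload rust-vmm-ci/common-pipeline.yml
  - label: "upload custom pipeline"
    command: buildkite-agent pipeline upload .buildkite/custom-pipeline.yml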
If the crate doesn't need something fancy for the hooks I think we can use the defaults which will be hosted on the machine that's running the agent. I would expect the default hooks to be overwritten by the ones in .buildkite/hooks.
I haven't tried out any of these things though, I really hope this is how it works :))
Created the repo: https://github.com/rust-vmm/rust-vmm-ci
@andreeaflorescu Close?
PR: https://github.com/rust-vmm/rust-vmm-ci/pull/2
@bonzini let's close this after the PR is merged.
Repository Name: buildkite-ci
Short Description
The buildkite-ci repository contains the buildkite pipeline that is going to be used by all other crates to run the CI. The buildkite pipeline is a YAML file with steps. Each step corresponds to one command running as part of the CI.
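For example, a minimal pipeline for a Rust crate might look like the following sketch (the commands are only illustrative, not the agreed set of CI steps):

steps:
  - label: "build"
    command: cargo build --release
  - label: "unit tests"
    command: cargo test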
With buildkite you can either configure the steps in the web interface or using a .buildkite/pipeline.yaml file in your repository. The proposal is to have this repository as a git submodule for all crates. In the buildkite configuration for each crate we would then only need to specify the pipeline upload path, which would end up being something like buildkite-ci/pipeline.yaml. For details about configuring the pipeline path, please check this article. Should we name the submodule common-buildkite-ci since we are most likely going to have another pipeline per repository?
Why is this repository relevant to the rust-vmm project?
This is needed so we do not have to duplicate the pipeline for all crates. All crates will end up using the same pipeline for the CI and implicitly we will have the same quality expectations from all crates. If some repository needs additional, specialized tests, these can be added in a different, per-crate configuration file (pipeline).