Easy-to-deploy CI #1

Open lukego opened 8 years ago

Create easy-to-follow instructions for deploying a private Snabb NFV Continuous Integration (CI) instance. Support both GitHub and GitLab+Jenkins environments.

[ ] Instructions to deploy Snabb NFV functional test CI.
[ ] Instructions to deploy Snabb NFV performance regression test CI.
[ ] Instructions to deploy Snabb NFV OpenStack API extensions test CI.

Related:
https://github.com/SnabbCo/snabbswitch/issues/588 Dockerization of CI
https://github.com/SnabbCo/snabbswitch/pull/626 [draft PR] Streamlining SnabbBot

@lukego @domenkozar

Regarding the “GitLab+Jenkins” environment: Adding GitLab support to SnabbBot would be super easy (GitLab's API is very similar to the GitHub API it already uses).

However, I assume they want to run the Snabb NFV CI as a Jenkins job and have already integrated Jenkins into GitLab?

I would be very interested to know if there is any scenario where we do not have to interface with Jenkins and can use the GitLab API directly instead (e.g. provide instructions to set up a dedicated Snabb CI machine). After reading more Jenkins documentation I get the feeling that it's a giant Pandora's box that we do not want to open unless necessary.

I've used Jenkins in many projects. It should be easy to integrate if they don't use overly advanced plugins.

Bear with me here because maybe I am missing important details.

I think the most important thing is for the test suites themselves to be easy to run as a shell one-liner e.g. "docker run mytestcase" or "docker-compose up". There will need to be a way to specify the software version(s) to test e.g. from specific Git repos and commit IDs.

Then it should be pretty straightforward to integrate into any CI framework, right?
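A minimal sketch of what such a one-liner wrapper might look like, assuming the test image reads the repository and commit to test from environment variables. The image name "mytestcase" is the placeholder used above; the SNABB_REPO and SNABB_COMMIT variable names are hypothetical and not taken from any existing Snabb tooling.

```shell
#!/bin/sh
# Sketch only: SNABB_REPO and SNABB_COMMIT are hypothetical variable names,
# and "mytestcase" is the placeholder image name from the discussion.

# Print the docker command that would test one repository at one commit.
# Arguments: repo URL, commit ID (or ref), optional image name.
build_test_cmd() {
    repo=$1
    commit=$2
    image=${3:-mytestcase}
    printf 'docker run --rm -e SNABB_REPO=%s -e SNABB_COMMIT=%s %s\n' \
        "$repo" "$commit" "$image"
}
```

Printing the command rather than running it keeps the sketch usable as a dry run; a CI job could log the line for review and then pipe it to `sh` to execute it.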

Jenkins should actually be the simplest integration according to my mental model: Just click in the webgui to define test cases that run the appropriate shell commands. Then the Jenkins administrator can independently decide for themselves how to trigger builds (e.g. manually, daily, on commit to a repo, etc). End of story?

Inside the Docker containers we could run the tests however we like e.g. to run the OpenStack test suite inside a QEMU instance running NixOS. Sane or crazy talk?

Yes, for the Nix part of the tests it will be as easy as:

# install Nix
$ git clone https://github.com/user/openstack-tests.git
$ cd openstack-tests
$ nix-build -A tests

Then check the exit code. Jenkins has a plugin to trigger a build on a new git push, so that's as simple as it can be. I know a few Nix-using companies are running Jenkins with it.
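The exit-code check can be sketched as a small wrapper. This is an illustration only: `run_suite` is a hypothetical helper name, not part of Nix or Jenkins. It runs whatever command it is given, prints a one-line verdict, and returns the command's exit code unchanged.

```shell
#!/bin/sh
# Sketch only: run_suite is a hypothetical helper, not part of any tool.

# Run the given command, print a one-line verdict, and return the
# command's exit code so the calling CI job passes or fails with it.
run_suite() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "PASS"
    else
        echo "FAIL (exit $status)"
    fi
    return "$status"
}
```

In a Jenkins "Execute shell" build step this would be called as `run_suite nix-build -A tests`; Jenkins marks the build as failed when the step exits non-zero.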

@domenkozar Should we wrap this up in docker and/or QEMU?

I like the idea of minimizing the impact on the host machine where testing is being done: don't require too much software to already be there and don't add too much software either. Ideally I like the idea of Nix being under the hood rather than "one more thing" that the operator has to learn.

I have worked with Jenkins a little in the past and I have had much better experience with running tests in a sandbox (Vagrant) rather than directly on the host machine. This is also operationally much nicer when you don't have to worry about screwing up some state on the host and requiring manual intervention to make the tests work (e.g. reboot, restart some OpenStack daemon that has gotten stuck, see why some filesystem fails to unmount, etc).

(I operated an OpenStack CI based on Jenkins and devstack for a bit over a year. The main problem was the chance that one test run screws up the environment for the next one and starts a string of failures that have to be manually rerun later.)

> provide instructions to setup a dedicated Snabb CI machine

This could also be a reasonable approach if we think it makes life easier and if the dedicated machine can actually be a VM.

But wouldn't it create more work in practice? This machine will need to be cared for: assigned hardware, assigned a name, assigned an address where it can be reached, assigned ssh keys for login, etc. Then we will also need to deal with operational issues like whether its disk gets full.

To me it seems operationally simplest to run tests in one-shot environments (docker, qemu) that can run stand-alone on an existing Linux host with minimal impact. This has always worked really well for me with Vagrant on other projects. (Such a pity that Vagrant was not built on QEMU from the beginning...)
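The one-shot principle can be illustrated without Docker or QEMU. In the sketch below a scratch directory stands in for the container or VM, and `run_one_shot` is a hypothetical helper name. It isolates only files, not daemons or kernel state, so it shows the shape of the idea rather than a replacement for a real sandbox.

```shell
#!/bin/sh
# Sketch only: run_one_shot is a hypothetical helper. A scratch directory
# stands in for the container or VM discussed in the thread.

# Run the given command inside a fresh directory, then delete that
# directory so nothing is left behind for the next run.
run_one_shot() {
    scratch=$(mktemp -d) || return 1
    status=0
    # Run in a subshell so the caller's working directory is untouched.
    ( cd "$scratch" && "$@" ) || status=$?
    rm -rf "$scratch"
    return "$status"
}
```

Each call starts from an empty directory and removes it afterwards, whether or not the command succeeded, so a failed run cannot leave files behind that break the next one.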

> This could also be a reasonable approach if we think it makes life easier and if the dedicated machine can actually be a VM.

But our tests already use QEMU (I would assume we cannot run QEMU within QEMU?), and they also need dedicated hardware (NICs). The performance regression tests also should be run on a relatively idle node to yield representative results. I imagine they have their dedicated Jenkins farm that they do not want to equip with 10G NICs. But that's just my imagination, maybe we should get a better picture of how their setup looks.

How about if we assume the test environment is:

Linux distribution of some kind (e.g. Ubuntu, RHEL, NixOS).
Has CPU cores and 82599 NICs available.
Is administrated by somebody else.

Then our mission would be to provide instructions on how to install and run the tests from the Unix shell. This should be as simple as possible, e.g. one shell command. Then we should be able to drive this either manually, via SnabbBot, or via a Jenkins job.
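A preflight check for the assumed environment could be sketched as follows. `count_82599` is a hypothetical helper that reads `lspci`-style text on stdin and counts the lines mentioning an 82599. Matching on the model string is an assumption about how the NICs are described on the host.

```shell
#!/bin/sh
# Sketch only: count_82599 is a hypothetical helper. It matches on the
# "82599" model string in lspci-style text, which is an assumption about
# how the NICs are described on the host.

# Read lspci-style lines on stdin and print how many mention an 82599.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's own exit status at 0 in that case.
count_82599() {
    grep -c '82599' || true
}
```

On a real host this would be used as `lspci | count_82599`, alongside `nproc` to report the number of available CPU cores.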

Is that enough of a picture?

If on the other hand we would like to require that the test environment is a complete machine that we will control every detail of, e.g. install our own operating system on, then I think we would need to be prepared to run inside a VM and rely on PCI-passthrough to make 82599 NICs available and we would have to look into making QEMU-in-QEMU work with benchmark-quality performance (a can of worms).

I do imagine that we could run the OpenStack test suite with QEMU-in-QEMU actually. This feels safer because we are only testing functionality and not performance.

My notion is that the outer QEMU would run NixOS and the inner QEMU would run the individual VMs that OpenStack is starting.

I suppose there are alternatives e.g. to deploy OpenStack in a container instead of a VM. @domenkozar what do you think?

I think the key to success here is to write down clearly what the tests will involve and in what environment we are going to run them.

I'm packaging OpenStack for NixOS. NixOS can either be installed on a physical machine or we can run it inside a container or VM.

> But wouldn't it create more work in practice? This machine will need to be cared for: assigned hardware, assigned a name, assigned an address where it can be reached, assigned ssh keys for login, etc. Then we will also need to deal with operational issues like whether its disk gets full.

NixOS tests take care of all of that, and the QEMU machine is destroyed at the end. I think we could easily use Nix to run the tests and only try the "QEMU in QEMU" approach for the OpenStack tests.

We're using a similar approach for testing VirtualBox in NixOS (it's VirtualBox inside QEMU): https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/virtualbox.nix#L210
