patch0 commented 7 years ago

At the moment the install / dist-upgrade / upgrade tests get weirdly far in gitlab-ci and then fail. Here's a quick summary of how the tests used to work on maker2 (as I understand it); after that I'll go into detail on why the tests fail in gitlab-ci.

The Current Situation

- autotest sets up a VM using schroot magic I don't fully understand.
- The VM boots up with systemd and all that jazz, uses DHCP & SLAAC to configure its networking, and automatically runs all the scripts in the autotest folder, keeping all the output in a log file. Once done, it shuts down.
- autotest cracks open the VM's filesystem and reads the log file. Somehow it detects failures, and if there were any it exits with a nonzero exit code so that maker2 knows.
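The last step boils down to a log scan that maps onto a process exit code. Here's a minimal Python sketch of that idea. The `FAIL` marker and both function names are assumptions for illustration; I don't know how autotest actually recognises a failure.

```python
import re

# Assumed marker: any line starting with the word "FAIL".
# The real autotest log format may well be different.
FAILURE_PATTERN = re.compile(r"^FAIL\b", re.MULTILINE)


def count_failures(log_text: str) -> int:
    """Count the log lines that look like test failures."""
    return len(FAILURE_PATTERN.findall(log_text))


def exit_code_for(log_text: str) -> int:
    """Return 0 when no failures were logged, 1 otherwise."""
    return 1 if count_failures(log_text) > 0 else 0
```

A wrapper would read the log file, pass its contents to `exit_code_for`, and hand the result to `sys.exit` so the CI system sees the failure.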

Some Feelings About The Current Situation

Patrick said something about autotest using the console to talk to the tests, and something else much scarier about the VM sshing into the host to run something.

This doesn't work on gitlab-ci, and is also kinda hacky, for a few reasons.

- The scripts in the autotest folder aren't particularly focussed. In addition to actually running tests, they do these things and probably others:
  - add an admin user
  - install all the packages needed by symbiosis from a big list of packages
  - install symbiosis
- Opening up the filesystem of the VM so you can prod it is pretty gross.

On the plus side it works, and it would only take a bit of effort to port the whole schroot setup over to gitlab-ci (though it would have to run using a shell runner).

Why the tests fail in gitlab-ci

When gitlab-ci runs a container it starts bash in the context of the container. Effectively, bash is PID 1 for the container. There's no init-system to talk to to get stuff going. I believe the apt-get install step for some packages starts them using /etc/init.d (probably something about the package detecting a lack of systemd and putting a proper sysvinit script in) which would explain why a lot of the tests actually succeed. BUT SOME OF THEM FAIL, and we should really be doing a much more realistic test than running our symbiosis full-system tests in a docker container that isn't a full symbiosis system.
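One way to make this visible inside a CI job is to check what PID 1 is and whether systemd booted the system. This is a rough Python sketch, not something the current tests do. The directory check is the one documented for `sd_booted(3)`; the `root` and `proc_dir` parameters are hypothetical and only there so the functions can be tested against a fake tree.

```python
import os


def systemd_is_running(root: str = "/") -> bool:
    """Report whether systemd booted this system.

    Uses the check documented for sd_booted(3): the directory
    /run/systemd/system exists only when systemd is the running init.
    """
    return os.path.isdir(os.path.join(root, "run/systemd/system"))


def pid1_name(proc_dir: str = "/proc") -> str:
    """Return the command name of PID 1, for example "bash" or "systemd"."""
    with open(os.path.join(proc_dir, "1", "comm"), encoding="utf-8") as handle:
        return handle.read().strip()
```

A test job could bail out early with a clear message when `systemd_is_running()` is false, instead of getting weirdly far and then failing.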

With that in mind:

A More Realistic Test Proposal

We're still going to want to run symbiosis in a VM, I think. To do a realistic full-install / dist-upgrade test we need to have a realistic system, which the docker container environment isn't. We need a systemd to talk to so we can schedule restarts, that sort of thing.

We will need some test-specific configurations (particularly repo URLs) too. And we'll need to be able to orchestrate the testing and fail the build when the tests fail.

We could create an image prior to the testing which would have a user account with passwordless sudo and a .ssh/authorized_keys . The private key would be kept in the secret variables section of the project on gitlab, and so would be presented to the gitlab-ci script as an env var.
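Getting the key from the env var onto disk in a form ssh will accept could look roughly like this. It's a sketch only: the function name is made up, and whatever env var name we pick on gitlab would be passed in.

```python
import os


def write_private_key(env_var: str, key_path: str) -> str:
    """Write the SSH private key held in `env_var` to `key_path` with mode 0600."""
    key = os.environ[env_var]  # raises KeyError if the secret is not configured
    # A missing trailing newline is a common cause of key-parsing errors,
    # and CI variable handling can strip it, so put it back.
    if not key.endswith("\n"):
        key += "\n"
    # Create the file owner-only from the start; ssh refuses keys that
    # other users can read.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(key)
    os.chmod(key_path, 0o600)  # enforce the mode even if the file already existed
    return key_path
```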

In the gitlab-ci script we'd start the VM with qemu, as we do for bytemark/bytemark-packer-templates, then use ansible to copy over the tests, install the symbiosis packages, and run the tests. We could write our ansible playbook so that it captures the logs and copies them back to the runner and have the gitlab-ci script spit the logs out, then exit with ansible's exit code.
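To make the flow concrete, here is a rough Python sketch of the orchestration: boot the VM, run the playbook, and pass ansible's exit code back. Everything specific in it is an assumption: the qcow2 image format, the memory size, the forwarded port 2222, and all the file names. It isn't how bytemark-packer-templates does it.

```python
import subprocess
from typing import List


def qemu_command(image: str, ssh_port: int = 2222) -> List[str]:
    """Build a qemu invocation that forwards a host port to the guest's sshd."""
    return [
        "qemu-system-x86_64",
        "-m", "1024",
        "-display", "none",
        "-snapshot",  # discard disk writes so the base image stays pristine
        "-drive", f"file={image},format=qcow2",
        "-netdev", f"user,id=net0,hostfwd=tcp::{ssh_port}-:22",
        "-device", "virtio-net-pci,netdev=net0",
    ]


def ansible_command(inventory: str, playbook: str, key_path: str) -> List[str]:
    """Build the ansible-playbook invocation that installs symbiosis and runs the tests."""
    return ["ansible-playbook", "-i", inventory, "--private-key", key_path, playbook]


def run_tests(image, inventory, playbook, key_path,
              popen=subprocess.Popen, call=subprocess.call) -> int:
    """Boot the VM, run the playbook, and return ansible's exit code."""
    vm = popen(qemu_command(image))
    try:
        return call(ansible_command(inventory, playbook, key_path))
    finally:
        # Always stop the VM, even if ansible fails or raises.
        vm.terminate()
        vm.wait()
```

The gitlab-ci script would then do something like `sys.exit(run_tests("base.qcow2", "inventory.ini", "tests.yml", "id_test"))`. The sketch doesn't wait for sshd to come up; the playbook could start with ansible's `wait_for_connection` module to cover that.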

This would make our test output more readable and shorter, not be quite as weird as the current autotest setup on maker2, and probably not require also running a DHCP server.

The work we'd need to do:

- add an ansible layer to docker-images/layers
- rewrite the autotest/ scripts as ansible playbooks
- make a base VM image with the necessary networking & ssh setup

Thoughts @pcherry , @jcarter ?

Originally reported on Bytemark's Gitlab by @telyn on 2017-03-09T16:21:43.733Z

patch0 commented 7 years ago

(The schroot is already on both maker2 and gitlab-ci, usable as the shell runner.)

Originally posted by @patch0 on 2017-03-13T17:07:44.115Z

patch0 commented 7 years ago

In that case I would vote for using that until using it is a pain - I don't see any real benefit to reinventing this particular wheel?

Originally posted by @telyn on 2017-03-14T09:26:34.751Z

patch0 commented 7 years ago

I have managed to get sautotest running on maker2, but only for install. dist-upgrade and upgrade both fail for reasons I don't fully understand, nor particularly care to right now.

Originally posted by @telyn on 2017-04-05T15:02:20.152Z

pcherry commented 7 years ago

Hey guys -- I think you have the wrong person in this thread....no idea what you guys are discussing.

patch0 commented 7 years ago

Sorry @pcherry, there was a clash between your username on GitHub and my username on our internal Gitlab system when I moved this issue across! Sorry for the noise :)

BytemarkHosting / symbiosis

Testing in gitlab #53
