@mrmrcoleman @grahamc Please update the description if I've missed anything from our discussion.
Hey @gauravgahlot it would also be good to consider which events should trigger the CI. The end-to-end tests cover all (?) of the services, so do they run whenever any of the repos change?
@gauravgahlot Which example are you going to use for the alpine work? (install Alpine Linux on worker)
@gauravgahlot did you already select a target platform for CI/CD? I doubt GitHub Actions, CircleCI, or Travis allow the use of Vagrant.
Maybe @grahamc knows more because he tweeted about this.
There is also another thing to point out, and it is around workflow. Running e2e tests for every PR without a first triage is not a safe approach and can turn out to be a waste of time. Kubernetes has some automation for this: when a PR gets opened, a maintainer looks at it and adds a label (they comment, but it's the same in the end), and only PRs with that label get tested. Does that sound reasonable? And if we want to simplify the flow we can use Packet and not Vagrant for now; I think it will be easier to set up because we can run it from GitHub Actions, Travis, Drone, or whatever.
Ideally, I would like to run those tests against all the platforms we support: Vagrant AND Packet AND whatever will come.
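To make the label-gated flow concrete, here is a minimal sketch of a pre-flight CI step that skips the e2e suite unless the PR carries an `ok-to-test` label. The repository path, the `PR_NUMBER` and `GITHUB_TOKEN` variables, the label name, and the `./test/e2e/run.sh` entry point are all placeholders for illustration, not something that exists today:

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: only run e2e tests when the PR carries an
# "ok-to-test" label, mirroring the Kubernetes triage flow described above.
# PR_NUMBER and GITHUB_TOKEN are assumed to be injected by the CI runner.
set -euo pipefail

labels=$(curl -sf \
  -H "Authorization: token ${GITHUB_TOKEN}" \
  "https://api.github.com/repos/tinkerbell/tink/issues/${PR_NUMBER}/labels" \
  | jq -r '.[].name')

if ! grep -qx "ok-to-test" <<<"${labels}"; then
  echo "PR ${PR_NUMBER} is not labeled ok-to-test; skipping e2e tests."
  exit 0
fi

echo "ok-to-test label found; running the e2e suite."
./test/e2e/run.sh   # hypothetical entry point
```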
Our internal https://drone.packet.net exists solely to run VMs (for osie tests) :D, we can use it.
But it's an older version of Drone. We can fire up a separate setup off of the latest drone.io version and that way also keep external stuff off of our internal build hosts.
I can't recall if Semaphore or CircleCI let us bring our own build agents and thus allow us to run VMs in them.
@mmlb do you think it is useful to set up e2e tests in a way that they can support multiple platforms? Right now we have Vagrant.
Tomorrow we may have something else, and the workflow I am picturing is the one we have for Kubernetes:
```
/retest vagrant
/retest packet
```
Ideally, this does not even need a runner that supports VMs, because we will reach the Packet API and use a device (or more, based on the tests themselves) for every build.
Does it make sense? Or is it too much? 😄
My 2 cents.
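As a rough illustration of the device-per-build idea, here is a hedged sketch of such a CI step built around the `packet` CLI. The `create` flags mirror the invocation shown later in this thread; the `delete` subcommand, the `--json` output handling, the `PACKET_PROJECT_ID`/`CI_BUILD_ID` variables, and the test entry point are assumptions, not verified CLI behaviour:

```bash
#!/usr/bin/env bash
# Hypothetical CI step: create a throwaway Packet device, run the e2e suite
# against it, and always delete it afterwards.
set -euo pipefail

DEVICE_ID=$(packet device create \
  --operating-system ubuntu_20_04 \
  --plan c1.small.x86 \
  --project-id "${PACKET_PROJECT_ID}" \
  --hostname "tink-e2e-${CI_BUILD_ID}" \
  --facility ams1 --json | jq -r '.id')   # --json and .id are assumptions

# Clean the device up even if the tests fail.
trap 'packet device delete --id "${DEVICE_ID}" --force' EXIT

./test/e2e/run.sh --target "${DEVICE_ID}"   # hypothetical test entry point
```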
For Vagrant we'd want to test it in the way the user would use it. This means we'll need a build runner that supports virtualization.
I think testing the actual workflows might be tricky, but happy to think through it with you.
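One quick sanity check before committing to a runner is whether the host actually supports virtualization; a small sketch using only standard Linux interfaces (nothing Tinkerbell-specific, and the module name assumes VirtualBox is the provider):

```bash
# Does the CPU expose VT-x/AMD-V, and is the VirtualBox kernel module loaded?
# Without hardware virtualization the nested Vagrant boxes will be unusably slow.
if grep -Eq '(vmx|svm)' /proc/cpuinfo; then
  echo "hardware virtualization flags present"
else
  echo "no vmx/svm flags found: this runner is a poor fit for the Vagrant setup"
fi

lsmod | grep -q vboxdrv \
  && echo "VirtualBox kernel module loaded" \
  || echo "vboxdrv not loaded: install and configure VirtualBox first"
```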
Today I got some time and I tried the Getting Started with Vagrant tutorial inside a Packet machine. I got it running apart from the worker, which requires a GUI that we do not have:
```
There was an error while executing `VBoxManage`, a CLI used by Vagrant
for controlling VirtualBox. The command and stderr is shown below.

Command: ["startvm", "19c0ccd6-9a38-4ee0-9330-29e413387eee", "--type", "gui"]

Stderr: VBoxManage: error: The virtual machine 'vagrant_worker_1594815205261_61888' has terminated unexpectedly during startup because of signal 6
VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component MachineWrap, interface IMachine
```
```
packet device create --operating-system ubuntu_20_04 --plan c1.small.x86 --project-id f49cf84 --hostname tink-e2e --facility ams1
```
```bash
#!/bin/bash
apt-get install -y linux-generic linux-image-generic linux-headers-generic virtualbox

# By default it looks for libvirt for some reason
export VAGRANT_DEFAULT_PROVIDER=virtualbox
```
At this point, you can follow the Getting Started until the end.
I tried to change the Vagrantfile in order to set up the worker without a GUI. I left this command running for a bit (~3m) and then I killed it:
```
$ vagrant up worker
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
worker: Warning: Authentication failure. Retrying...
```
I accessed the provisioner again to check the workflow status:
```
$ vagrant ssh provisioner
vagrant@provisioner:~$ docker exec -i deploy_tink-cli_1 tink workflow events c61e5327-5ed3-4a48-a314-533ebec222d9
+--------------------------------------+-------------+-------------+----------------+-------------------+--------------------+
| WORKER ID                            | TASK NAME   | ACTION NAME | EXECUTION TIME | MESSAGE           | ACTION STATUS      |
+--------------------------------------+-------------+-------------+----------------+-------------------+--------------------+
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | hello world | hello_world | 0              | Started execution | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | hello world | hello_world | 0              | Action Failed     | ACTION_FAILED      |
+--------------------------------------+-------------+-------------+----------------+-------------------+--------------------+
```
It failed, but I am sure it is just a timing issue.
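If we end up scripting this check for CI, a hedged sketch of a polling loop that turns the events table into a pass/fail signal could look like the following; the container name and workflow ID are copied from the run above, while the timeout and the expected-action count are simplifying assumptions:

```bash
#!/usr/bin/env bash
# Hypothetical e2e assertion: poll `tink workflow events` until an action fails
# or the expected number of actions has succeeded, with a ~10 minute timeout.
WORKFLOW_ID=c61e5327-5ed3-4a48-a314-533ebec222d9
EXPECTED_ACTIONS=1   # hello_world has a single action; adjust per template

for _ in $(seq 1 60); do
  events=$(docker exec -i deploy_tink-cli_1 tink workflow events "${WORKFLOW_ID}")
  if grep -q ACTION_FAILED <<<"${events}"; then
    echo "workflow failed"
    exit 1
  fi
  if [ "$(grep -c ACTION_SUCCESS <<<"${events}")" -ge "${EXPECTED_ACTIONS}" ]; then
    echo "all ${EXPECTED_ACTIONS} action(s) succeeded"
    exit 0
  fi
  sleep 10
done

echo "timed out waiting for workflow ${WORKFLOW_ID}"
exit 1
```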
This was just a way to keep me busy for a bit. I think we should write e2e tests that can run anywhere and set up a test grid like this one https://testgrid.k8s.io/ where we can potentially run the same e2e tests across different providers like:
I tried executing an Ubuntu provisioning workflow on my local Vagrant setup without GUI. It was successful.
```
vagrant@provisioner:/vagrant/deploy$ docker-compose exec tink-cli tink workflow events 096f328e-6286-4aff-ada6-93e8eec38f96
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| WORKER ID                            | TASK NAME       | ACTION NAME     | EXECUTION TIME | MESSAGE                         | ACTION STATUS      |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | disk-wipe       | 0              | Started execution               | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | disk-wipe       | 16             | Finished Execution Successfully | ACTION_SUCCESS     |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | disk-partition  | 0              | Started execution               | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | disk-partition  | 4              | Finished Execution Successfully | ACTION_SUCCESS     |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | install-root-fs | 0              | Started execution               | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | install-root-fs | 22             | Finished Execution Successfully | ACTION_SUCCESS     |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | install-grub    | 0              | Started execution               | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | install-grub    | 13             | Finished Execution Successfully | ACTION_SUCCESS     |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | cloud-init      | 0              | Started execution               | ACTION_IN_PROGRESS |
| 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 | os-installation | cloud-init      | 0              | Finished Execution Successfully | ACTION_SUCCESS     |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
```
Then to access the worker I had to do:
```
vagrant halt worker
# set GUI to true in Vagrantfile
vagrant up worker
```
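That Vagrantfile edit can also be scripted; a small hedged sketch, assuming the worker's VirtualBox settings expose a `vb.gui = false` line (the exact variable name may differ in the actual Vagrantfile):

```bash
# Flip the worker's GUI setting in place instead of editing the file by hand,
# then restart the VM so the new setting takes effect.
sed -i 's/vb\.gui = false/vb.gui = true/' Vagrantfile
vagrant reload worker
```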
The idea is to have smoke or E2E tests that:
Initial tests may cover the following workflows: