glondu / belenios

Verifiable online voting system. This is a mirror of https://gitlab.inria.fr/belenios/belenios
https://www.belenios.org
GNU Affero General Public License v3.0

Add continuous integration using Gitlab-CI #2

Closed swergas closed 6 years ago

swergas commented 6 years ago

You can see the details about how this script ran here: https://gitlab.com/swergas/swergas-belenios-ci/-/jobs

Full documentation in French is here: https://hackmd.io/MY0_U_omQMGNK52Zws9c7g

glondu commented 6 years ago

Thank you!

I followed your instructions (in a fresh Debian 9 virtual machine) to run the CI job locally with the tip of your gitlab-ci branch and it timed out after 30 minutes. Looking further, this is due to lack of entropy in the runner. Prefixing make all and make check with BELENIOS_DEBUG=1 fixes that. Indeed, by default, belenios-tool uses secure random (/dev/random), which may exhaust the entropy pool when it is run many times (which is the case with make check). The BELENIOS_DEBUG environment variable at build time triggers a different code path that uses /dev/urandom instead. This way, running the CI job locally only takes 1 minute in my environment.
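For reference, a minimal sketch of that workaround. Only the BELENIOS_DEBUG=1 prefix on make all and make check comes from the description above; the helper names entropy_avail and make_debug are hypothetical, and the /proc path is Linux-specific.

```shell
#!/bin/sh
# Sketch only: entropy_avail and make_debug are illustrative helper names,
# not part of Belenios.

# Print the kernel's entropy estimate, or "unknown" where the
# Linux-specific /proc file is not readable.
entropy_avail() {
    if [ -r /proc/sys/kernel/random/entropy_avail ]; then
        cat /proc/sys/kernel/random/entropy_avail
    else
        echo unknown
    fi
}

# Run a make target with BELENIOS_DEBUG=1 set for that command only.
make_debug() {
    BELENIOS_DEBUG=1 make "$@"
}

echo "entropy estimate: $(entropy_avail)"

# Intended use in the CI job (not run here):
#   make_debug all
#   make_debug check
```

Setting the variable per command, rather than exporting it globally, keeps the debug code path limited to the CI build and test steps.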

I agree it's a good idea to prepare a docker image with Belenios dependencies pre-installed. However, I don't think using the hash of opam-bootstrap.sh alone (for the tag name) is right. Indeed, some packages in the OCaml stack (OPAM packages) may evolve, and I want to keep version constraints to a minimum. At the moment, only Eliom is constrained. I think the docker image should be regenerated regularly to pick up new versions of unconstrained OCaml libraries. I suggest using $DATE-$HASH instead of just $HASH for the tag name. Or we could just use an integer as a version number that we would increment at each docker image generation.
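A possible way to build such a $DATE-$HASH tag, as a sketch under assumptions: the helper name image_tag is hypothetical, the YYYYMMDD date format is an arbitrary choice, and SHA-256 (via sha256sum) is assumed as the hash function.

```shell
#!/bin/sh
# Sketch only: image_tag is a hypothetical helper; the date format and the
# choice of SHA-256 are assumptions, not something fixed by the project.

# Print "<date>-<hash>" for the file given as first argument.
image_tag() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    digest=$(sha256sum "$1" | cut -d' ' -f1)
    printf '%s-%s\n' "$(date +%Y%m%d)" "$digest"
}

# Intended use when generating the image (not run here):
#   docker build -t "swergas/beleniosbase:$(image_tag opam-bootstrap.sh)" .
```

With the date first, tags sort chronologically, and two images built from the same opam-bootstrap.sh on different days remain distinguishable.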

Out of curiosity, I tried with ef0aca2 + my patch adding BELENIOS_DEBUG, and the installation of ocsigenserver fails. Can you confirm that the whole CI job worked with this commit at some point for you? Maybe the failure is new and unrelated to your work.

There is an Inria Gitlab instance, where you should be able to create an account. I'm planning to put Belenios's repository there and set up CI there so that we depend only on Inria infrastructure.

swergas commented 6 years ago

I just tried running gitlab-runner in a new Debian 9.5 virtual machine. Indeed I also had the timeout issue:

ERROR: Job failed: execution took longer than 30m0s seconds
Fatal: Execution took longer than 30m0s seconds

But the timeout occurred before the docker image had even finished downloading (over my wifi connection, downloading this image takes more than 30 minutes). So I decided to try again after downloading the docker image first (docker pull swergas/beleniosbase:efa5df3049f736dd34eb8289da730dd709eb99939f6511fa93ae0080a61ce4fb). Then I ran gitlab-runner again and everything worked well: the job succeeded without errors or timeouts. So I cannot reproduce any entropy issue.
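The pre-download step can be sketched as follows, so that the image transfer does not count against the job's 30-minute limit. The helper name ensure_image is hypothetical; it only relies on docker image inspect and docker pull.

```shell
#!/bin/sh
# Sketch only: ensure_image is a hypothetical helper name.

# Pull the image only if it is not already in the local Docker cache.
ensure_image() {
    if docker image inspect "$1" >/dev/null 2>&1; then
        echo "already present: $1"
    else
        docker pull "$1"
    fi
}

# Intended use before starting gitlab-runner (not run here):
#   ensure_image swergas/beleniosbase:efa5df3049f736dd34eb8289da730dd709eb99939f6511fa93ae0080a61ce4fb
```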

About the naming of the docker image, I will think about it and come back to you soon :)

swergas commented 6 years ago

I tried executing gitlab-runner in this VM again, and I was able to reproduce the timeout issue you encountered. I read some articles about the differences between random and urandom, and I now understand better why using random extensively can slow things down a lot. Do you think it is acceptable to use urandom instead of random for Belenios CI tests? I haven't yet found a way to increase this 30-minute timeout (when executing locally).

swergas commented 6 years ago

I have also been able to reproduce the issue you mentioned about installation of ocsigen in commit https://github.com/glondu/belenios/commit/ef0aca229b996361cebda8d30372faef2df772f7 in my VM. This is strange, I did not have this problem previously (I was installing from a docker image on my machine).

swergas commented 6 years ago

OK, maybe I see: the image ocaml/opam2:debian-9 was updated 2 days ago and now uses OCaml version 4.07.0, instead of the 4.06.1 it had when I ran it. I'm investigating.

swergas commented 6 years ago

OK I tried image ocaml/opam2:debian-9-ocaml-4.06 (which has ocaml version 4.06.1 and opam version 2.0.0), and everything worked correctly. I edited the .gitlab-ci.yml file to show this.
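To catch this kind of silent base-image update earlier, a CI script could check the compiler version before building. This is a sketch with a hypothetical helper name, require_ocaml; it relies only on ocamlc -version, which prints the bare version number.

```shell
#!/bin/sh
# Sketch only: require_ocaml is a hypothetical helper name.

# Succeed only if the installed compiler version starts with the
# expected prefix given as first argument (e.g. 4.06).
require_ocaml() {
    actual=$(ocamlc -version) || return 1
    case "$actual" in
        "$1"*) echo "OCaml $actual matches $1" ;;
        *) echo "expected OCaml $1, found $actual" >&2; return 1 ;;
    esac
}

# Intended use at the top of the CI script (not run here):
#   require_ocaml 4.06
```

Failing fast with an explicit message makes a version change in the image easier to spot than a later failure while installing a dependency.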

glondu commented 6 years ago

> Do you think it is acceptable to use urandom instead of random for Belenios CI tests?

Yes.

I've put Belenios on Inria's Gitlab instance and set up CI. I had to use the BELENIOS_DEBUG=1 trick to make the pipeline pass.

Using a fixed image with (a snapshot of) all dependencies of Belenios preinstalled is a good idea to test changes in Belenios itself. However, it is also a good idea to spot breakages due to changes in dependencies early. For this, I use the opam-bootstrap.sh script. Would it be possible/reasonable for you to set up a second pipeline that would start with a simple debian:9 image (as you did in your first commit)? This pipeline would just test opam-bootstrap.sh; thorough tests (UI...) would still run in the first pipeline.

swergas commented 6 years ago

Yes OK, here it is: https://gitlab.com/swergas/belenios-inria/pipelines (branch: https://gitlab.com/swergas/belenios-inria/commits/gitlab-ci-both-images)

swergas commented 6 years ago

Remaining work:

swergas commented 6 years ago

Work continues here: https://gitlab.inria.fr/belenios/belenios In particular this PR for CI documentation: https://gitlab.inria.fr/belenios/belenios/merge_requests/1

Remaining work: