marvel-nccr / quantum-mobile

A Virtual Machine for computational materials science
https://quantum-mobile.readthedocs.io
Other
91 stars 32 forks source link

Run more thorough tests of various parts of the machine #129

Open borellim opened 4 years ago

borellim commented 4 years ago

To start, here is a testing procedure for Siesta: https://github.com/marvel-nccr/ansible-role-aiida/issues/43

ltalirz commented 4 years ago

Just to mention that there is a bit of a "mix" of approaches here at the moment.

Tests often take a lot of time (and sometimes produce extra files), which is why they are not run as part of a regular execution of a role.

Some roles check for whether a run_tests variable has been set to True (this was the original approach, see also https://github.com/marvel-nccr/quantum-mobile/blob/34d60e0d2a9e56ab1475549214d1cacd4dbc7544/playbook.yml#L28). Since the introduction of CI tests, the tests have moved to the molecule/tests/verify.yml task list (e.g. https://github.com/marvel-nccr/ansible-role-aiida/blob/master/molecule/default/verify.yml), which is run as a final step during CI.

The second approach means that it isn't as straightforward to run the tests on a fully built Quantum Mobile (perhaps there is a way, I haven't checked) but I think it's the cleaner and more scalable approach.

chrisjsewell commented 4 years ago

note the full build is now tested on pushes to master or tag/release creation: https://github.com/marvel-nccr/quantum-mobile/actions/runs/308356597 😄

This kind of closes this issue, although we should probably decide (and document) a consistent approach to using molecule/tests/verify.yml and run_tests: true. I feel run_tests: true is better, because we can run it during deployment and not just via molecule. @ltalirz why do you feel verify,yml is more scalable?

ltalirz commented 4 years ago

I feel run_tests: true is better, because we can run it during deployment and not just via molecule. @ltalirz why do you feel verify,yml is more scalable?

My remark was not so much about the technical solution but rather that it is not scalable to use the (very time consuming) Quantum Mobile build to detect issues with individual roles (which is what we did in the very beginning before I started testing roles individually). We should try as much as possible to run tests as part of the CI of the individual roles, and only run tests during the full Quantum Mobile build if they cannot easily be tested inside an individual role.

Ideally, this would make it obsolete to e.g. run fleur tests during a build of the quantum mobile (e.g. I would suggest to actually turn run_tests off by default and switch it on only when we do releases).

The verify.yml approach is used by many others (e.g. the geerlingguy roles), while the run_tests approach was added by either me or Giovanni by hand before we had any CI testing.

chrisjsewell commented 4 years ago

I guess my main concern is if you can truly say/assume that: because these tests pass when installing the role into an empty docker container (i.e. the role's molecule testing), this means they will also pass when run as part of a full image build, where the VM will have likely gone through many other modifications before reaching a particular role.

For a production build of the VM, I'd rather have to wait a longer time (within reason) to include these test tasks, if it gives greater assurance that the final VM will have no issues.

ltalirz commented 4 years ago

For a production build of the VM, I'd rather have to wait a longer time (within reason) to include these test tasks, if it gives greater assurance that the final VM will have no issues.

I think we agree on this - for releases, one can afford to run even long-running tests (and, of course, interaction between roles can cause problems). At the same time, I believe the low-hanging fruit at the moment are to add more tests at the role level to make it more likely that things during the VM build work as expected.

You can use either the run_tests approach or the verify approach for this, in the end I think it what matters most is to add the tests in some form. One issue with the run_tests approach is that tests often leave stuff behind, so you then also need to implement the corresponding cleanup which can be difficult (imagine e.g. running an AiiDA calculation). And you will probably want to have some logic that prevents (long-running) tests from running twice when the results are already there (e.g. during the idempotency check). These issues do not exist for the verify approach, since the verify stage runs only once and the result is thrown away.

In the end, I think there are going to be tests that are prone to role interference, where using run_tests makes sense, while there are other tests can just as well run only on the role directly.