tests: easy and reproducible CI builds locally

inveniosoftware / invenio

Invenio digital library framework

https://invenio.readthedocs.io

MIT License

626 stars 292 forks source link

tests: easy and reproducible CI builds locally #4037

Closed diegodelemos closed 4 years ago

diegodelemos commented 4 years ago

Example: CI system (let’s say Travis) passes in all builds except for MySQL

Currently, developers have a hard time reproducing this environment as they have to run MySQL locally (usually through Docker) and as this is done ad-hoc many problems can appear such as running incorrect versions or wrong configurations. We shall make it easier by using the following approaches:

Enhance error messages: Detect what external dependencies the tests need and if they are not met inform with an accurate error message which tells developers what to do (e.g. tests require Elasticsearch, tests check if Elasticsearch is running and the version is correct if not, fail and recommend to do docker run docker.elastic.co/elasticsearch/elasticsearch-oss:7.2.0)
Enhance run-tests.sh: developers just specify what do they need to run tests and the script sets up the environment (by means of Docker containers) (note that if we take this option, the same script should be used in Travis so the setup is the same in CI environment and locally).

ppanero commented 4 years ago

Use cases

✅ As a developer I want to reproduce a specific combination of the Travis matrix in my local setup.

The developer would export the variables declared in the default .env with the desired values before running tests. Alternatively, if there were many variables to overwrite, the developer would could add a new .env that will override the default and update ./run-tests.sh with docker-compose --env-file=./.env up -d(This environmental file should contain all variables, not just overwritten ones).

✅ As a developer I want to debug my code, with the services of a specific combination of the Travis matrix in my local setup.

The developer should set a breakpoint (e.g. with ipdb). Then either modify the run-tests.sh script to add the -s parameter to pytest; or boot up/down the services using docker compose and then call the specific test (e.g. pytest -s tests/test_module::test_function

✅ As a developer I want to run multiple times a single tests module/function. Services should not be boot up and down on every run, but be kept up.

Same as the previous use case.

🕐 As an Invenio maintainer I want to control the service's version from a centralized point. e.g. ES_7_LATEST might change from 7.2.0 to 7.3.0.

Several approaches:

Play with dotenv files variables. The blocking point was the lack of variable expansion. A dotenv-expand npm package exists (It was not chosen due to the mix of technologies and over-engineering that means the use of an external module for this task).
[CURRENT] Replace the dotenv file by a bash script.

🕐 As an Invenio maintainer I want to see the effect of changing (upgrading/downgrading) a service version on the whole Invenio organization modules.

Task not started. Will use a script used by the pip group (see gist from @frankois).

Missing tasks:

We are currently using psql as name for PosgreSQL service related variables. However, setup.py's use postgresql. We need to find a naming concensus.
Integrate services from external repo, e.g. by enriching run-tests.sh to pull the files if they do not exists, or using Pytest hooks.
Tackle the last use case.

diegodelemos commented 4 years ago

✅ As a developer I want to reproduce a specific combination of the Travis matrix in my local setup.

The developer would export the variables declared in the default .env with the desired values before running tests. Alternatively, if there were many variables to overwrite, the developer would could add a new .env that will override the default and update ./run-tests.sh with docker-compose --env-file=./.env up -d(This environmental file should contain all variables, not just overwritten ones).

Can we just test the existence of the .env file in the current directory, and if it exists, add this parameter? ($(test -e .env && echo "--env-file=./.env") or even simpler would work). In that case the only action taken by developers should be to create a .env file. (requires good docs)

✅ As a developer I want to debug my code, with the services of a specific combination of the Travis matrix in my local setup.

The developer should set a breakpoint (e.g. with ipdb). Then either modify the run-tests.sh script to add the -s parameter to pytest; or boot up/down the services using docker compose and then call the specific test (e.g. pytest -s tests/test_module::test_function

The second option would provide a better UX in my opinion. The UX would be even better if pytest-invenio checks that the needed services are up and running, and if it is not the case: either (1) provide a meaningful message with a one liner to fix the problem (the docker-compose run) or (2) it will spawn the services itself (this is related to your comment "Integrate services from external repo ... or using Pytest hooks".

[CURRENT] Replace the dotenv file by a bash script.

What if not only replace the dotenv with a bash script but also we wrap the whole thing (.env and docker-compose) with a higher level wrapper. More visually:

$ # your first option: .env gets picked up automatically
$ ls
docker-compose.yml .env
$ docker-compose up
$
$ # option two: .env -> env.sh
$ ls
docker-compose.yml env.sh
$ source env.sh
$ docker-compose up
$
$ # my proposal: .env + docker-compose -> docker-services.[sh|py]
$ ls
docker-compose.yml .env docker-services.[sh|py]
$ docker-services.[sh|py] up ES_VERSION=7.2.1

Note: if the idea makes sense, and the wrapper would be Python, it would make a potential integration with pytest-invenio easier.

ppanero commented 4 years ago

Can we just test the existence of the .env file in the current directory, and if it exists, add this parameter? ($(test -e .env && echo "--env-file=./.env") or even simpler would work). In that case the only action taken by developers should be to create a .env file. (requires good docs)

Sorry I was not really clear, this might be actually changed according to the outcome of .env into env.sh

The second option would provide a better UX in my opinion. The UX would be even better if pytest-invenio checks that the needed services are up and running, and if it is not the case: either (1) provide a meaningful message with a one liner to fix the problem (the docker-compose run) or (2) it will spawn the services itself (this is related to your comment "Integrate services from external repo ... or using Pytest hooks".

The hooks are being investigated

What if not only replace the dotenv with a bash script but also we wrap the whole thing (.env and docker-compose) with a higher level wrapper. More visually:

This we were until now doing it but instead of inside a specific docker-services.sh, just inside the run-tests.sh. Do you think we should have the two of them? Maybe the second calling the first, but the first doing the work in case the developer wants to debug/test specific things (use cases 2 and 3)

diegodelemos commented 4 years ago

This we were until now doing it but instead of inside a specific docker-services.sh, just inside the run-tests.sh. Do you think we should have the two of them? Maybe the second calling the first, but the first doing the work in case the developer wants to debug/test specific things (use cases 2 and 3)

Exactly, to be completely crystal clear:

Full test suit run:

$ run-tests.sh  # which callls `docker-services.sh <with-corresponding-params>

Single test run performed 1 or n times:

$ docker-services.[sh|py] <with-corresponding-params>  # manual step not needed if done by pytest
$ pytest -s module.py::func

ppanero commented 4 years ago

We have checked several 3rd party libraries: pytest-services, pytest-xprocess and the closes docker-services. The last one was almost suitable, however discarded because:

Link: https://github.com/dmonroy/docker-services

The library enables pytest to boot up and down services defined in pytest.ini or .services.yaml file. However, after testing it several inconvenients were found:

1- We cannot use the pytest.ini file since it is not flexible enough. We have to use the .services.yaml approach, however the syntax of this file is pure yaml. This means that it is not 1-to-1 matching with docker-compose files (e.g. lists for env variables are not prefixed with -). 2- The mentioned services file would not be centrally managed, but come with the module when cookiecutted. Then, even if versions can be centrally managed using env variables, service configuration cannot. 3- Since it is pytest who boots up and down the services, in the case of a developer testing constantly a module/function (e.g. pytest -s tests/test_example.py::test_function) all services would go up and down in each run. Note that when testing there was no noticeable difference in time once the images where cached. 4- Some services might require liveness checkes (e.g. Elasticsearch), since they are slow on startup. Since this module is hooked up to pytest, it requires investigation on how to do so + PR and release of the library (This is not necesarily bad, is actually good but we have no experience working with the maintainer and it is a single man project).

ppanero commented 4 years ago

@diegodelemos @KonstantinaStoikou

Status:

Fixed the variable overwrite from the env. Now if the user exports a variable, the cli will use it instead of the one define in env.py. It only supports the exported env vars, not a custom env.py. I think this we can leave as it is. I don't see a case with a loot of vars exported that might require a new env.py.

I tested locally, and it works on the change of ES7 and ES6. Should work with postgres vs mysql with minor modifications (specify correctly in run-tests.sh)

Next tasks:

We have a mismatch with invenio on naming. We are calling psql when invenio calles postrgresql. Maybe we need to converge to the second, since it would be a pain to change all invenio. (Just to be consistent)
If the user export ES_VERSION=6.7.0 it will work, but if export ES_VERSION=ES_7_LATEST it won't. We need to refactor a bit the env.py to parse the value received from the env and treat it accordingly...
Find a proper way of knowing when services are alive (avoid sleep's)
~~Clean up the module:~~
- ~~Add .travis.yml file~~
- ~~Pass isort, black, etc~~
Find a new name and release :rocket:
~~Test in several travis (without releasing we can only test devel)~~
~~Document how to migrate~~

KonstantinaStoikou commented 4 years ago

We checked the pytest-docker plugin (https://github.com/avast/pytest-docker) to be able to check when a service is running (not just if the container has been created) before running the tests. It actually checks if the service is running through a URL to the service which listens on a port. The use of this plugin needs to be investigated further at some point.

ppanero commented 4 years ago

@diegodelemos Documentation on how to migrate and current solution: https://codimd.web.cern.ch/k6qA6nMuQauZ16JAzePg4w?view