openml / server-api

Python-based server
https://openml.github.io/server-api/
BSD 3-Clause "New" or "Revised" License
1 stars 1 forks source link

Look into using `testcontainers` to orchestrate `docker` for tests #103

Open PGijsbers opened 1 year ago

PGijsbers commented 1 year ago

We currently rely on manual commands to initialize the docker containers used for testing (docker compose up), but we may be able to leverage the testcontainers module to orchestrate this for us.

PGijsbers commented 11 months ago

Did a small benchmark. It takes ~1.2sec for the MySQL test database container to start and be ready for connections. I ran

import pytest

@pytest.mark.parametrize(
    "_",
    list(range(1000))
)
def test_to_measure(_, my_database) -> None:
    assert True

with the fixture defined as

import pytest
from testcontainers.core.container import DockerContainer
from testcontainers.core.waiting_utils import wait_for_logs

@pytest.fixture()
def my_database() -> DockerContainer:
    with DockerContainer("openml/test-database") as db_container:
        wait_for_logs(db_container, "mysqld: ready for connections.", interval=0.2)
        yield db_container

Without fixture it took 0.33 sec, and with 1253 sec. So that translates to roughly ~1.2 sec overhead. This is a workable solution for those tests that really need an independent container to connect to (such as those that modify the database through the PHP api). There might also be cases where this can be employed on a module level or similar, to reduce the overhead. However, the overhead is prohibitive to run this on an individual test level in quick test cycle, so it has to be used sparingly for tests which need to be executed frequently. A potential use is for more extensive testing in CI.

PGijsbers commented 11 months ago

We would also have to modify our workflow or images slightly. Currently, tests are executed from the python-api docker container, but to use testcontainers it would mean we have some docker-in-docker setup or sharing the docker socket from the host. This might increases the setup complexity for (new) developers. Alternatively, we could expect to run tests from a local python-api installation (as opposed to containerized), but then users will have to install the mysql-client dependencies.