conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

[question] Best practices for handling test data in a CI pipeline #17133

Open AndreasAckermannTSystems opened 4 days ago

AndreasAckermannTSystems commented 4 days ago

What is your question?

Hi everyone,

I'm working on establishing Conan packages and a CI pipeline for a product line of applications built from a shared set of modules, each contained in its own repository. These modules (e.g. modA and modB) all access a common database via classes contained in a module called orm.

The orm repository also contains database dumps for a test database, an import script, and a database configuration file used by our applications' unit tests. The tests of modA and modB expect an initialized test database, and they expect this configuration file at a well-known location relative to their test binaries.

The CI pipeline runs in ephemeral Docker containers for each repository, and as such, the test database needs to be recreated by importing the dumps on each run.

My current intended approach is the following:

Are there best practices / better ways to handle test data with Conan in a CI pipeline setting?


memsharded commented 4 days ago

Hi @AndreasAckermannTSystems

Thanks for your question

As a general guideline for CI at scale, the ongoing work in https://github.com/conan-io/docs/pull/3799 might be useful. Hopefully it can be published soon, but in the meantime you might be able to generate the docs locally. It does not directly address your question, but it might help with the general issues of defining a CI pipeline.

Regarding your question: you could indeed put more artifacts inside the orm package but, as you pointed out, the size of the dump and the other files might be relevant, especially if you use the orm library artifacts very often without those test artifacts.

If, on balance, this turns out to be a real problem, there are some alternatives to consider, like storing the test artifacts in a separate package that can be used as test_requires, or maybe using the "package metadata files" feature. But I'd probably start by putting things in the orm package and learn from there (unless you tell me the DB test dump would be GBs in size).

if a CI=1 environment variable is detected

In general, it is better for Conan to model things more explicitly, for example with the Conan conf mechanism. The idea is that things can be easily reproduced locally, and developers can run the tests on their machines just by conan install ... -c user.myorg:build_tests=True or something like that. It also means you can easily run fast jobs in CI that skip those heavy tests but still run other tests. Note there are also some built-in confs like tools.build:skip_test that could already be used in recipes.
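To make the conf-based approach a bit more concrete, here is a minimal sketch of the decision logic a recipe could use. It assumes the object passed in behaves like `self.conf` in a Conan 2 recipe, i.e. it offers `get(name, default=..., check_type=...)`. The conf name `user.myorg:build_tests` is only the illustrative one from the comment above, and `should_build_tests` and `DictConf` are hypothetical helpers, not part of Conan.

```python
def should_build_tests(conf) -> bool:
    """Decide whether the heavy database tests should be built and run.

    `conf` is expected to behave like `self.conf` inside a Conan 2 recipe,
    i.e. to expose get(name, default=..., check_type=...).
    """
    # Built-in conf: lets anyone switch tests off, whatever else is set.
    if conf.get("tools.build:skip_test", default=False, check_type=bool):
        return False
    # User-defined conf (illustrative name): tests are opt-in.
    return bool(conf.get("user.myorg:build_tests", default=False, check_type=bool))


class DictConf:
    """Tiny dict-backed stand-in for a conf object, only for trying this out."""

    def __init__(self, values=None):
        self._values = dict(values or {})

    def get(self, name, default=None, check_type=None):
        value = self._values.get(name, default)
        # Reject values of the wrong type instead of silently accepting them.
        if check_type is not None and value is not None and not isinstance(value, check_type):
            raise TypeError(f"{name} must be of type {check_type.__name__}")
        return value
```

Inside a recipe, `build()` could call `should_build_tests(self.conf)` and only import the dumps and run the tests when it returns true. With this particular choice a developer opts in locally via `-c user.myorg:build_tests=True`, while fast CI jobs simply leave the conf unset.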

So a bit more information about the test artifact sizes and the patterns/frequency of usage would help in deciding one way or the other.

memsharded commented 3 days ago

Another important aspect to take into account would be the build time of orm. If it is fast enough, then it wouldn't be a concern to just rebuild things to create a separate orm_data package, apart from the orm one containing the actual libraries. That orm_data package could then be used, for example, as test_requires.
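As a rough illustration of the split-package idea, a consumer recipe such as modA's might look like the sketch below. The package names follow the thread, but the versions, the `ModAConan` class name and the settings are assumptions made up for the example. The `try`/`except` around the import is only there so the sketch can be loaded without Conan installed; a real recipe would import `ConanFile` directly.

```python
try:
    from conan import ConanFile
except ImportError:
    # Fallback so the sketch can be loaded without Conan installed;
    # a real recipe simply imports ConanFile.
    ConanFile = object


class ModAConan(ConanFile):
    name = "moda"        # hypothetical package name
    version = "1.0"      # hypothetical version
    settings = "os", "compiler", "build_type", "arch"

    def requirements(self):
        # Regular dependency on the package with the actual orm libraries.
        self.requires("orm/1.0")  # hypothetical version

    def build_requirements(self):
        # Test-only dependency that would carry the dumps, the import
        # script and the database configuration file.
        self.test_requires("orm_data/1.0")  # hypothetical version
```

Because orm_data is declared as a test requirement, it should not be propagated to packages that depend on modA, so downstream consumers would not need to download the dumps.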