CJ-Wright opened 6 years ago
It would be good to separate the actual operation (the pulling of data and updating of files) from the pushing back to repos.
@CJ-Wright Indeed, we should really rethink this! Are there any new developments? I can think about this if possible (I could add it to my GSoC program as a form of debug/test code).
There haven't been any recent developments on this front, to the best of my knowledge.
I want to revive this issue and have come up with the following concept:
For a proper integration test strategy, we must mimic the relevant GitHub accounts and repositories with which the bot interacts. I propose the following accounts ("test accounts") and repositories ("test repositories"):
- `conda-forge-bot-staging` (organization) mimics the conda-forge organization and will contain a selection of test feedstocks (see below for how we create them).
- `regro-cf-autotick-bot-staging` (user) mimics the regro-cf-autotick-bot account and is a test environment in which the bot will create forks of the conda-forge-bot-staging repositories.
- `regro-staging` (organization, named after the regro account) contains a special version of cf-graph-countyfair which the bot uses during testing. See below for how we prepare the graph for testing.

I am aware this requires us to manage three additional GitHub entities. However, since production also uses three accounts this way, we should stick to this architecture and mirror it as closely as possible, preventing future headaches.
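To make the mirroring explicit, the test harness could pin the staging counterpart of each production entity in a small mapping. A minimal sketch, assuming a hypothetical constant (this is illustrative, not existing bot code):

```python
# Hypothetical mapping of production GitHub entities to the staging entities
# proposed above; the constant name is illustrative only.
STAGING_ENTITIES = {
    "conda-forge": "conda-forge-bot-staging",                          # feedstock org
    "regro-cf-autotick-bot": "regro-cf-autotick-bot-staging",          # bot user holding the forks
    "regro/cf-graph-countyfair": "regro-staging/cf-graph-countyfair",  # dependency graph repo
}
```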
To define test cases, we use the following directory structure:
definitions/
├── pydantic/
│ ├── resources/
│ │ ├── recipe_before.yml
│ │ └── ... (entirely custom)
│ ├── version_update.py
│ ├── aarch_migration.py
│ ├── some_other_test_case.py
│ └── ...
├── llvmdev/
│ ├── resources/
│ └── test_case.py
└── ...
As shown, there are different test cases for different feedstocks. Each test case is represented by a Python module (file). Each test case module must define a `prepare()` and a `check_after()` method.
The `prepare` method uses our yet-to-be-implemented integration test library (see below) to set up the test case: it defines the state of the feedstock repo, of the bot account's fork (including whether it exists at all), of the cf-graph data relevant to the feedstock, and of any HTTP mocks that are needed (see below). Setting up the repositories includes preparing PRs that might already be open.
The `check_after` method is called after the bot run and can run several assertions against the resulting state (e.g., files present in the forked repository, a specific git history, cf-graph data). Helper functions provided by our integration test helper library make writing those assertions easy.
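As a rough sketch of such a module, assuming a hypothetical `integration_test_lib` with the helper names used below (none of this exists yet):

```python
# definitions/pydantic/version_update.py -- illustrative sketch only; every
# helper imported below is a placeholder for the yet-to-be-written library.
from pathlib import Path

from integration_test_lib import (  # hypothetical helper library
    setup_from_resources,
    mock_latest_version,
    assert_pr_opened,
    assert_fork_contains_file,
)

RESOURCES = Path(__file__).parent / "resources"


def prepare():
    """Set up the feedstock repo, the bot's fork, cf-graph data, and HTTP mocks."""
    # Copy the pre-defined feedstock contents into the staging feedstock repo.
    setup_from_resources("pydantic", RESOURCES)
    # Make the mocked upstream endpoint report a newer version so the bot
    # opens a version-update PR.
    mock_latest_version("pydantic", "2.0.0")


def check_after():
    """Run assertions against the state after the bot run."""
    assert_pr_opened(feedstock="pydantic", title_contains="2.0.0")
    assert_fork_contains_file("pydantic", "recipe/meta.yaml")
```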
We run the integration tests via a single GitHub workflow in this repo. It consists of the following steps:
1. The test environment is set up (using `jq` where needed). If needed, old branches generated by previous test runs are deleted. As pointed out above, the test data is generated by each test case's `prepare` method. HTTP mocks are also set up.
2. The bot's jobs are run against the staging accounts.
3. Each test case's `check_after` method is run to validate the state after the bot run.

To emphasize, we test multiple feedstocks together in each test scenario. This speeds up test execution (because the bot works in a batch-job fashion) and might uncover bugs that only occur when multiple feedstocks or their cf-graph metadata interact with each other.
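In pseudocode, one scenario run inside the workflow would then amount to something like this (every name below is a placeholder, not an existing bot entry point):

```python
# Sketch of a single scenario run; `run_bot` and `delete_stale_branches` are
# injected callables, not real entry points of the bot.
def run_scenario(test_cases, run_bot, delete_stale_branches):
    # 1. Clean up leftovers from previous runs, then let each test case set up
    #    its repos, cf-graph data, and HTTP mocks.
    delete_stale_branches()
    for case in test_cases:
        case.prepare()

    # 2. Run the bot's jobs against the staging accounts, as in production.
    run_bot()

    # 3. Validate the resulting state for every feedstock in the scenario.
    for case in test_cases:
        case.check_after()
```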
The version update tests, especially, will require us to mock HTTP responses to return the latest versions we want them to return. ~~To accomplish this, we use VCR.py cassettes that have been modified accordingly. If possible, we might use the pytest-recording pytest plugin on top of that.~~ We cannot use VCR.py because we want to reuse the bot's workflows, and VCR.py is not a true web proxy, only instrumentation inside the Python process. So I propose something like MockServer.
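For illustration, registering a canned response with a locally running MockServer could look roughly like this; the expectation endpoint, payload format, and the upstream URL the bot queries are assumptions:

```python
import requests

# Register an expectation with MockServer so the bot's "latest version" lookup
# gets a canned answer. Port 1080 is MockServer's default; the endpoint path
# and payload format are assumptions based on MockServer's REST API.
MOCKSERVER_URL = "http://localhost:1080"

expectation = {
    "httpRequest": {
        "method": "GET",
        "path": "/pypi/pydantic/json",  # hypothetical upstream endpoint
    },
    "httpResponse": {
        "statusCode": 200,
        "headers": {"Content-Type": ["application/json"]},
        "body": '{"info": {"version": "2.0.0"}}',
    },
}

requests.put(f"{MOCKSERVER_URL}/mockserver/expectation", json=expectation)
```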
The test scenarios are generated by dynamically parametrizing a pytest test case. This test runs once per scenario, importing the Python test case modules for each feedstock in the scenario and then executing them.
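A minimal sketch of that parametrization, assuming the directory layout from "Integration Test Definition" above (the scenario definitions and loading details are illustrative):

```python
# Sketch: dynamically load and run the test case modules of one scenario.
import importlib.util
from pathlib import Path

import pytest

DEFINITIONS = Path(__file__).parent / "definitions"

# Scenarios are lists of (feedstock, test case) pairs; these are made up.
SCENARIOS = [
    [("pydantic", "version_update"), ("llvmdev", "test_case")],
]


def load_test_case(feedstock, case_name):
    path = DEFINITIONS / feedstock / f"{case_name}.py"
    spec = importlib.util.spec_from_file_location(f"{feedstock}.{case_name}", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


@pytest.mark.parametrize("scenario", SCENARIOS)
def test_scenario(scenario):
    cases = [load_test_case(feedstock, case) for feedstock, case in scenario]
    for case in cases:
        case.prepare()
    # ...run the bot against the staging accounts here...
    for case in cases:
        case.check_after()
```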
The integration test helper library provides helper functions for use in `prepare()` and `check_after()`. For example, we might provide a function `setup_from_resources` that copies a pre-defined feedstock from a resources folder (see "Integration Test Definition" above) into the test feedstock repository.
For `check_after`, we could provide a helper that checks that a GitHub PR with the correct title has been opened on the test feedstock repository, or another that checks that the bot's fork has the expected contents.
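As a sketch of what two of those helpers could look like (the function signatures, the handling of the staging org name, and the use of PyGithub are all assumptions):

```python
import os
import shutil
import tempfile
from pathlib import Path

from github import Github  # PyGithub; the choice of client library is an assumption

STAGING_ORG = "conda-forge-bot-staging"


def setup_from_resources(feedstock: str, resources_dir: Path) -> None:
    """Copy a pre-defined feedstock layout from a test case's resources folder
    into the staging feedstock repository."""
    checkout = Path(tempfile.mkdtemp()) / f"{feedstock}-feedstock"
    # ...clone {STAGING_ORG}/{feedstock}-feedstock into `checkout` here...
    shutil.copytree(resources_dir, checkout, dirs_exist_ok=True)
    # ...commit and push the result back to the staging org here...


def assert_pr_opened(feedstock: str, title_contains: str) -> None:
    """Assert that an open PR whose title contains the given text exists on the
    staging feedstock repository."""
    gh = Github(os.environ["GITHUB_TOKEN"])
    repo = gh.get_repo(f"{STAGING_ORG}/{feedstock}-feedstock")
    titles = [pr.title for pr in repo.get_pulls(state="open")]
    assert any(title_contains in t for t in titles), (
        f"no open PR on {feedstock} matching {title_contains!r}; open PR titles: {titles}"
    )
```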
The integration test library must offer an option to run `conda-smithy` rerenders. The results of these operations can be cached using GitHub Actions caches, keyed on the `conda-forge.yml`, the `recipe/` contents, and the conda-smithy version.
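For instance, the cache key could be derived from exactly those inputs; a minimal sketch of the hashing (not an existing implementation):

```python
# Sketch: derive a GitHub Actions cache key for a conda-smithy rerender from
# the inputs that determine its result.
import hashlib
from pathlib import Path


def rerender_cache_key(feedstock_dir: Path, smithy_version: str) -> str:
    hasher = hashlib.sha256()
    hasher.update(smithy_version.encode())
    hasher.update((feedstock_dir / "conda-forge.yml").read_bytes())
    for path in sorted((feedstock_dir / "recipe").rglob("*")):
        if path.is_file():
            hasher.update(str(path.relative_to(feedstock_dir)).encode())
            hasher.update(path.read_bytes())
    return f"rerender-{hasher.hexdigest()}"
```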
Practice will show which exact helper functions are necessary.
Let me know what you think!
@beckermr
Cc @0xbe7a
I need to read more, but the idea of a custom integration test library sounds painful. It is entirely possible to do this within pytest and we should do so.
We might have misunderstood each other here. The goal of the "integration test helper library" is not to replace pytest or parts of it. It's simply a collection of practical helper functions to set up the test environment, including configuring the external git repos. It also offers custom functions for assertions we need in our use case, e.g., validating the contents of a pull request. Since this is very domain-specific, it cannot be done by an already-available public library.
We may need to run true integration (with a dev graph to boot) so we don't blow up the graph by accident and can run CIs with close to 100% coverage.