pytest-dev / pytest

downstream testing to assert effect on major plugins #7342

Open ssbarnea opened 4 years ago

ssbarnea commented 4 years ago

To avoid pytest core evolving in ways that break every single plugin, and to avoid reaching a point where maintaining a set of working plugins becomes a burden, we need to design a CI/CD pipeline that asserts the effects of changes downstream (on its consumers).

We only need to pick a few important plugins to test with, but once we set this up we will be able to make changes more easily and with less stress.

How to implement it is still an open question, as GitHub does not have a documented way to perform this.

AFAIK, Zuul CI is one of the systems that allow cross-repository testing, where a CR/PR is tested not only with jobs from a single project but also with jobs from sibling projects. The result is a guarantee that merging a change in the foo project does not break the bar project.

I plan to investigate how GitHub workflows can be used to implement something similar; maybe someone has already written something that allows that.

If we do this, we will also be able to build a list of actively maintained plugins: those that we know for sure work well with the latest release of pytest and with pytest's master branch. The current http://plugincompat.herokuapp.com/ is not a very reliable source.

RonnyPfannschmidt commented 4 years ago

Since we now have GitHub Actions, all we actually need is a new workflow with a matrix across the plugins we want to support, checking out a number of key refs/releases with the GitHub checkout action.
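
For reference, a minimal sketch of what such a matrix workflow could look like; the plugin list, refs, Python version, and test path below are illustrative assumptions, not agreed choices:

```yaml
name: plugin-compat

on: workflow_dispatch

jobs:
  test-plugin:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        plugin: [pytest-cov, pytest-xdist, pytest-html]   # illustrative list
        pytest-ref: [master, 5.4.3]                        # key refs/releases
    steps:
      # Check out the plugin repo instead of pytest itself
      - uses: actions/checkout@v2
        with:
          repository: pytest-dev/${{ matrix.plugin }}
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - name: Install the plugin, then pytest at the ref under test
        run: |
          pip install .
          pip install git+https://github.com/pytest-dev/pytest.git@${{ matrix.pytest-ref }}
      - name: Run the plugin's test suite
        run: pytest tests/   # assumes tests live in tests/; varies per plugin
```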

ssbarnea commented 4 years ago

How would we check a plugin other than by running its own tests? Lots of them would need special preparation work in order to run, and some may even use different CI/CD jobs.

Anyway, we could try an experiment with one or two generic plugins: pytest-cov or pytest-html. If needed, we could ask plugin authors/maintainers to keep a special tox environment aimed at this kind of testing, so we would always have the same testing entry point (see the sketch below). The real deal is when we end up testing PR-to-master! If we do this, it means that when either side makes a release it will not break the other, regardless of the order in which these are done.
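
To illustrate the shared entry point idea: if each participating plugin kept a tox environment with an agreed name (say `pytest-master`, a hypothetical convention) that installs pytest from the master branch, the compatibility workflow would never need plugin-specific knowledge:

```yaml
# Hypothetical convention: every participating plugin defines a tox env
# named "pytest-master" that pulls in pytest from master, so this step
# works unchanged for every plugin in the matrix.
- name: Run plugin tests against pytest master
  run: |
    pip install tox
    tox -e pytest-master
```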

bluetech commented 4 years ago

I think this is a great idea; it will improve the stability of the pytest ecosystem a lot around big releases, and it will give us an early hint of the impact of potential changes.

I suspect doing this on every pytest CI run will not be viable, especially if we wish to scale up the number of plugins we include (to e.g. 50-100 plugins). Some alternatives (sketched as workflow triggers below the list) are:

  1. Triggered manually
  2. Run on a schedule, e.g. once a day
  3. Before releases
  4. Triggered by some other event?
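
For concreteness, the four options map onto GitHub Actions triggers roughly like this (the cron time, tag pattern, and event type are placeholders):

```yaml
on:
  workflow_dispatch:        # 1. triggered manually from the Actions tab
  schedule:
    - cron: "0 4 * * *"     # 2. run on a schedule, here once a day
  push:
    tags: ["*.*.*"]         # 3. around releases, e.g. when a release tag is pushed
  repository_dispatch:      # 4. triggered by some other event via the REST API
    types: [plugin-compat]
```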

Then we should consider how to go about it. Some questions are:

  1. Can we "distribute" it to the plugins themselves, i.e. have them run against pytest master?
     1.1. Requires effort across many plugins instead of one central place -- harder.
     1.2. Probably in a cron, so it runs regularly even if the plugin itself hasn't changed (a sketch follows below).
     1.3. How can it be reported reliably to the pytest repo / pytest devs?
     1.4. Can it work for plugins outside the pytest-dev org?
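
As a sketch of this distributed variant, each plugin could carry a small scheduled workflow like the one below (the file name, schedule, and test path are assumptions); questions 1.3 and 1.4 about reporting and non-pytest-dev plugins remain open:

```yaml
# Lives in each plugin's own repo, e.g. .github/workflows/pytest-master.yml
name: test-against-pytest-master

on:
  schedule:
    - cron: "0 6 * * *"   # daily, so it runs even while the plugin is idle
  workflow_dispatch:      # and on demand

jobs:
  pytest-master:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - run: pip install . git+https://github.com/pytest-dev/pytest.git@master
      - run: pytest tests/   # assumes the plugin keeps its tests in tests/
```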

Assuming we want to do it centrally:

  1. In the pytest repo or a separate repo?
  2. A separate CI job for each plugin, or one job which reports the results for all of them?
  3. How is each plugin tested? Some bash script which clones the plugin repo, patches it to run against pytest master, and runs its tests? (A sketch follows below.)
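
One possible answer to question 3, as a single generic shell step; the plugin URL and test path are placeholders, and "patching" here just means force-reinstalling pytest from master over whatever version the plugin pinned:

```yaml
- name: Clone a plugin and run its tests against pytest master
  run: |
    git clone https://github.com/pytest-dev/pytest-cov plugin   # placeholder plugin
    cd plugin
    pip install .
    # Override whatever pytest version the plugin's install pulled in:
    pip install --force-reinstall git+https://github.com/pytest-dev/pytest.git@master
    pytest tests/   # test location varies per plugin
```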

Hopefully the CI experts among us can share their experience on the best way to set up such an operation.

ssbarnea commented 4 years ago

I know how to do this using Zuul CI, but I am not in a position to propose switching our CI/CD, or even enabling it as a 3rd-party CI (I know and use 3 big instances, opendev, rdo and ansible, but I cannot make the decision to approve their use for pytest, at least not now).

I think something similar can be achieved using https://github.com/marketplace/actions/repository-dispatch -- but I need to experiment myself with two repositories, a "core" one and a plugin one. The expectation is that the plugin will always have a job that runs against the master version of core, and that a PR on core would also trigger that build on the plugin (the dependent repository), assuring that the core change does not break the plugin (even before it is merged).
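
Roughly, the sender side in the core repo's PR workflow could use the linked action, and the plugin repo would listen for the event; repo names, the event type, and the token secret below are placeholders:

```yaml
# In the core repo's PR workflow: notify the plugin repo about this change.
- uses: peter-evans/repository-dispatch@v1
  with:
    token: ${{ secrets.DOWNSTREAM_TOKEN }}   # PAT with access to the plugin repo
    repository: example-org/example-plugin
    event-type: core-change

# In the plugin repo, the receiving workflow triggers on that event:
# on:
#   repository_dispatch:
#     types: [core-change]
```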

In short, until we have a POC of this workflow it makes no sense to change anything in pytest, as we do not want to pollute a mainstream project with experiments.

Scalability is another aspect to consider, but I am still inclined to look only at solutions that can run on a PR. Testing already-merged code is helpful, but it comes too late if the breakage has already been introduced. The real value is in preventing a breaking change from being merged at all.

There is also the DIY approach: create a downstream test launcher which runs plugin tests from a specific list of plugins. We could have a "tox-impact" kind of environment that runs pytest with a set of curated plugins. Even in this case, plugins would need to expose their test suite, ideally with the tests being part of the package (a sketch follows below).
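
A rough cut of the "tox-impact" idea as a PR job; the plugin list is illustrative, and `--pyargs` only works for plugins that ship their tests inside the installed package, which is exactly the requirement mentioned above:

```yaml
- name: Run curated plugin suites against the pytest code under review
  run: |
    pip install pytest-cov pytest-html   # curated plugin list (illustrative)
    pip install --force-reinstall .      # then force pytest from this checkout/PR
    # --pyargs resolves the arguments as importable packages, so this only
    # works for plugins that package their tests:
    pytest --pyargs pytest_cov pytest_html
```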

I doubt that plugin authors would complain too much if we required them to do this; in the end, it is in their interest to ensure that they play well with pytest.

nicoddemus commented 4 years ago

I have also thought about this issue in the past, but never really had the time to make it take off.

I like the idea given here of having a separate repository which runs pytest master against a curated list of plugins. It probably makes sense to run one job per plugin, in a Python version of their choosing.

I do see other issues, though: some plugins might need to install system-wide dependencies or have other requirements which must be handled at the CI level (for example, starting up a service).

> There is also the DIY approach, which would be to create a downstream test launcher which would run plugin tests from a specific list of plugins.

This might be a good starting point. We might ask some plugin maintainers to "join" the effort by including their CI job in the overall GitHub Actions workflow of this separate repository; they can then contribute any changes to their CI setup as needed.

I think ideally we want to run plugin master against pytest master: I can see scenarios where we introduce a change which breaks a plugin, but where it makes more sense to fix the plugin (it was relying on an unsupported API, for example). This gives the plugin author a chance to fix their side, and we see the results immediately.

Another issue is "flaky" plugin tests: we need this to be rock solid if we want to integrate this "plugin compatibility run" into pytest's workflow. Of course, we can see how to deal with that when/if we come across it.


Another idea is to have multiple workflow files in .github/workflows in pytest's own repository, independent of pytest's own runs, one file for each plugin. Plugin authors could contribute their CI runs directly. Those jobs could run after the main workflow finishes, if such a thing is supported (a quick Google search suggests it is not, however).