Collect and unify pytest results across different runs

akihironitta commented 2 years ago

🚀 Feature

Create a workflow that collects and merges pytest results from CI runs across different operating systems, accelerators and software versions.

With this feature implemented, we will be able to see a merged list of all test cases and whether each one succeeded, failed or was skipped in every CI configuration.

Motivation

We run tests across different OSes, accelerators and software versions, but currently each CI run reports only its own results, which makes it almost impossible to monitor which tests are run or skipped across all such configurations.

Because of this lack of observability, we hit an issue a while ago where none of the horovod tests had run for a long period of time in the PL repo.

Pitch

To be explored. (I guess we could somehow utilise https://github.com/pytest-dev/pytest-reportlog)
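As a starting point for discussion, here is a minimal sketch of how report logs could be merged, assuming each CI job runs `pytest --report-log=<file>` and uploads the resulting JSON-lines file. The function names and the configuration names (e.g. "ubuntu-cpu") are hypothetical, and xfail is not treated specially here.

```python
import json
from collections import defaultdict


def final_outcome(phase_outcomes):
    """Collapse the setup/call/teardown outcomes of one test into one status."""
    if "failed" in phase_outcomes:
        return "failed"
    if "skipped" in phase_outcomes:
        return "skipped"
    return "passed"


def merge_reportlogs(logs):
    """Merge report logs from several CI configurations.

    `logs` maps a configuration name (hypothetical, e.g. "ubuntu-cpu") to an
    iterable of JSON lines, e.g. an open report-log file.
    Returns {nodeid: {config: outcome}}.
    """
    merged = defaultdict(dict)
    for config, lines in logs.items():
        phases = defaultdict(list)
        for line in lines:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            # Only per-test reports matter; session/collection entries are ignored.
            if entry.get("$report_type") != "TestReport":
                continue
            phases[entry["nodeid"]].append(entry["outcome"])
        for nodeid, outcomes in phases.items():
            merged[nodeid][config] = final_outcome(outcomes)
    return dict(merged)
```

A final CI job could download every uploaded log, call `merge_reportlogs({"ubuntu-cpu": open("cpu.jsonl"), ...})` and render the result as a table. The file names here are placeholders.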

Alternatives

To be explored.

Additional context

Codecov (https://app.codecov.io/gh/Lightning-AI/lightning/) automatically merges coverage results uploaded from different CI runs. AFAIK, the coverage results don't hold any pytest results, so we need to find another way to collect each test case's status from different CI settings.

Open for any suggestions 💜

akihironitta commented 2 years ago

This will enable us to easily check whether tests skipped in a certain CI run (e.g. GPU CI) are covered by other CI configurations. Context: https://github.com/Lightning-AI/lightning/pull/13651
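To illustrate the check, here is a rough sketch that flags tests no configuration actually executed. It assumes the per-configuration outcomes have already been collected into a `{nodeid: {config: outcome}}` mapping; the function name, the test IDs and the configuration names are made up for the example.

```python
def never_run(merged, ran=("passed", "failed")):
    """Return the IDs of tests that were not executed in any configuration.

    `merged` maps a test node ID to {config_name: outcome}. A test counts as
    executed if at least one configuration reports an outcome listed in `ran`.
    """
    return sorted(
        nodeid
        for nodeid, per_config in merged.items()
        if not any(outcome in ran for outcome in per_config.values())
    )


# Hypothetical data: one test is skipped on GPU CI but runs on CPU CI,
# the other is skipped everywhere and should be flagged.
example = {
    "tests/test_a.py::test_cpu_only": {"cpu": "passed", "gpu": "skipped"},
    "tests/test_b.py::test_optional_dep": {"cpu": "skipped", "gpu": "skipped"},
}
silently_skipped = never_run(example)
```

Failing the summary job when this list is non-empty would be one way to surface tests that are silently skipped everywhere.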

akihironitta commented 1 year ago

Here's another case where sklearn-related tests have been skipped silently: https://github.com/Lightning-AI/lightning/pull/15311

I'll check and modify https://github.com/akihironitta/playground/pull/8 again to see if it's reliable enough to use in our CI.
