Thanks for the nice writeup, and for trying out task lists inside an issue.
Some thoughts:
It may also be wise to ask Ivar for opinions after the two of us are done discussing, but before implementing something.
Thanks for the input. Super valid questions, I will try to answer them below:
Of course. If you are satisfied with the answers, feel free to tag him on this issue to have his input.
@keileg, @OmarDuran and @IvarStefansson:
I would like to have your opinion on what constitutes a slow test, i.e., what the criterion should be for marking a test as slow based on the amount of time it takes to run. Please leave your thumbs up on one of the options below.
Also, if you have a concrete suggestion for the bullet point number 4 of https://github.com/pmgbergen/porepy/issues/856#issuecomment-1505369611, please let me know.
Thanks in advance.
I have two comments:

1. I think run time forms only part of the basis for choosing which tests to skip. Their importance to the test suite is also relevant. The importance criterion includes which part of the code the test covers, the suite's coverage of that part of the code, and possibly more.
2. I suggest a nuance to Eirik's check box: it should not apply to "trivial" PRs (only style, documentation, very basic maintenance, etc.).
I agree. Importance should definitely be considered too. Meaning, if a test qualifies as slow but is crucial for the test suite, I assume we will keep it. However, I see those as outliers rather than the norm.
Now, there might also be non-crucial fast tests that we might want to skip, as you say. My suggestion would therefore be to create a different marker for these types of tests, e.g., `non_crucial`.

That being said, we still need a criterion to classify a test as slow. And if the `non_crucial` marker is introduced, also a criterion to classify tests as such.
I would hesitate to introduce too many labels for now. If we can get away with a single flag to indicate whether a test is run always or only on "full runs", that would probably be easier. Introducing more categories later is of course possible if the need arises.
I agree that introducing too many categories is a no-go. However, there are a few reasons why you would want to categorize skipped tests:
To your point, I think there should be strict rules for categorizing tests.
Solved via #863.
Background
Currently, tests are run on every push and at merging time through the Pytest workflow (https://github.com/pmgbergen/porepy/actions/workflows/run-pytest.yml). Naturally, slow tests (e.g., integration tests for three-dimensional models and convergence tests) are not part of this workflow because they take too much time. To be precise, they are skipped, since they are explicitly marked with a decorator of the type `@pytest.mark.skip` or `@pytest.mark.skipif`.
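For instance, a skipped test currently looks roughly like this (test names, conditions, and reasons are made up for illustration; the actual decorators in the code base may differ):

```python
import os

import pytest


@pytest.mark.skip(reason="Too slow for the regular Pytest workflow")
def test_convergence_of_3d_model():  # hypothetical test name
    ...


@pytest.mark.skipif(
    not os.environ.get("RUN_SLOW_TESTS"),  # hypothetical condition
    reason="Only run when explicitly requested",
)
def test_integration_of_3d_model():  # hypothetical test name
    ...
```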
The need for a workflow that runs all tests
At the moment, there is no workflow that checks whether the skipped tests actually pass. This means that the skipped tests are currently not in use, which is obviously suboptimal from a testing framework standpoint.
Proposed changes
Create a new workflow named Pytest-slow that runs all tests on a weekly basis (specifically, on Sundays at 00:00 GMT+2) and can also be triggered manually (see https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow and https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch). The latter option is important, since new slow tests must pass before merging a PR. It will therefore be the responsibility of the assignee and the reviewers to verify that Pytest-slow completes without errors.
Marking tests
We need a system to effectively mark tests as slow (in the future, there might be a need for other types of classifications). There is no unique way to achieve this (in fact, there are at least three options, see https://jwodder.github.io/kbits/posts/pytest-mark-off/). However, it seems that the best option is to create a custom mark (e.g., `slow`) and use the `pytest_collection_modifyitems` hook to add an additional `pytest.mark.skip`. This is done in a `conftest.py` file.
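To make the mechanics concrete, here is a minimal sketch of what such a `conftest.py` could look like, roughly following the recipe linked above (the flag and marker names match the ones proposed here, but the actual implementation may differ):

```python
# conftest.py (sketch, based on the recipe linked above)
import pytest


def pytest_addoption(parser):
    # Add the --run-slow command line flag to pytest.
    parser.addoption(
        "--run-slow",
        action="store_true",
        default=False,
        help="Also run tests marked as slow.",
    )


def pytest_configure(config):
    # Register the custom marker so pytest does not warn about it.
    config.addinivalue_line("markers", "slow: mark test as slow to run")


def pytest_collection_modifyitems(config, items):
    # Unless --run-slow is given, attach an extra skip marker to slow tests.
    if config.getoption("--run-slow"):
        return
    skip_slow = pytest.mark.skip(reason="need --run-slow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```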
How would this work in practice?

Mark tests as slow with the decorator `@pytest.mark.slow`. When one runs `pytest`, all tests with the assigned decorator will be skipped. If one wants to run all tests, it is simply a matter of adding the `--run-slow` flag, i.e., `pytest --run-slow`.
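For illustration, a slow test would then look roughly as follows (file and test names are hypothetical):

```python
# test_example_models.py (hypothetical file, for illustration only)
import pytest


@pytest.mark.slow
def test_convergence_order_of_3d_model():
    # Expensive setup and assertions go here. Skipped on a plain `pytest`
    # run; executed with `pytest --run-slow`, e.g., in the Pytest-slow workflow.
    ...
```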
Implementational steps

- [ ] Create a `conftest.py` file following the recipe from https://jwodder.github.io/kbits/posts/pytest-mark-off/
- [ ] `@pytest.mark.skipped`