ElektraInitiative / libelektra

Elektra serves as a universal and secure framework to access configuration settings in a global, hierarchical key database.
https://www.libelektra.org
BSD 3-Clause "New" or "Revised" License
208 stars 123 forks source link

run_all --rerun-failed #4523

Closed kodebach closed 11 months ago

kodebach commented 2 years ago

Our run_all script for make run_all (scripts/dev/run_all.in) includes an additional ctest $ARGS --rerun-failed at the end.

This was introduced as a workaround for #2439. However, this also introduces some issues. For example, the end of the output of make run_all will look something like this:

0% tests passed, 4 tests failed out of 4

Label Time Summary:
bindings    =   1.63 sec*proc (6 tests)
kdbtests    =  31.84 sec*proc (22 tests)
memleak     =  31.11 sec*proc (21 tests)

Total Test time (real) =  30.37 sec

The following tests FAILED:
          2 - testtool_backend (Failed)
         20 - testlib_notification (Failed)
         34 - testjna_gradle (Failed)
         38 - test_kdb.py (Failed)

The "4 tests faile out of 4" can be quite confusing, if you don't know about the --rerun-failed. To find the actual ratio of failed tests, you need to scroll up all the away beyond the output of the second run, which might be longer than your terminal's scroll buffer.

Another problem is that this can actually mask the real cause of a problem, if you only look at the output of the second run. That's because sometimes fail, because I previous (also failed test) didn't clean up properly. If you then do a rerun you might not see the true errors, because even the test that originally didn't clean up properly, may now fail differently.

Finally, the most annoying problem IMO is that the rerun means it takes quite a bit longer for make run_all to finish if more than one or two tests fail.

I propose we do one of 2 things:

  1. Remove ctest $ARGS --rerun-failed entirely. In #2439 the speculation was that the actual cause of the flaky tests was server overload. This may no be better and/or we might have a way to limit the load.
  2. Add an AUTO_RERUN env-var flag to run_all that is set in the CI. Only when the env-var is set, will we do ctest $ARGS --rerun-failed.
markus2330 commented 2 years ago

First: I don't think that FLOSS participants can do anything useful here, we need someone working on this as bachelor thesis.

Second: I don't see how this AUTO_RERUN should help? Is it about that you want to be able to trigger jobs without the --rerun-failed manually? Please always say what you expect/want first. There is a good reason why we have the issue template.

kodebach commented 2 years ago

There is a good reason why we have the issue template.

It is clearly aimed at bug reports. If you want more structure in other issues, we should create more templates. GitHub then lets you choose the template. See https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository

Please always say what you expect/want first.

I thought, it was clear that I don't want the second output (which I called confusing) to happend. And I expect the tests to just run once.

I don't think that FLOSS participants can do anything useful here, we need someone working on this as bachelor thesis.

This was not necessarily intended for FLOSS. Also not everything must be handled as part of a thesis or some course. This is simply something I noticed and wanted to bring up (that's what issues on FLOSS projects are for after all). During the last days on #4187, this was so annoying that I temporarily modified the run_all script to remove the second run.

However, I think the issue is simple enough for FLOSS. You just need to modify the run_all script to look for AUTO_RERUN and the Jenkinsfile and other CI config files to set AUTO_RERUN.

I don't see how this AUTO_RERUN should help? Is it about that you want to be able to trigger jobs without the --rerun-failed manually?

The purpose of AUTO_RERUN is that when you run make run_all locally, all tests are just executed once. You don't waste time waiting for the second run, and you see the actual ratio of tests that failed, without scrolling up past all the failed tests.

In the CI nothing would change. AUTO_RERUN would be set and we would still rerun the failed tests to mask flaky tests.

github-actions[bot] commented 1 year ago

I mark this stale as it did not have any activity for one year. I'll close it in two weeks if no further activity occurs. If you want it to be alive again, ping by writing a message here or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:

github-actions[bot] commented 11 months ago

I closed this now because it has been inactive for more than one year. If I closed it by mistake, please do not hesitate to reopen it or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart: