TriBITSPub / TriBITS

TriBITS: Tribal Build, Integrate, and Test System,
http://tribits.org

cdash_analyze_and_report.py: Better deal with filtered-out nonpassing tests #305

Open bartlettroscoe opened 4 years ago

bartlettroscoe commented 4 years ago

In issue #301, I added the option --require-test-history-match-nonpassing-tests=off to allow extra nonpassing tests to be filtered out of the outer list of nonpassing tests but have the test history show a failing test for the current testing day. This was needed to filter out mass random system failures like the ones we are seeing on 'vortex' showing Error: Remote JSM server is not responding on host vortex (see https://github.com/trilinos/Trilinos/issues/6861). We can filter these out of the outer query of nonpassing tests now, but we can't currently filter them out of the inner queryTests.php query that grabs the test history. That creates confusing entries in the 'twim' table like:

Tests with issue trackers Missing: twim=10

| Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
|---|---|---|---|---|---|---|---|---|
| vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-2019.06.24_static_dbg | MueLu_Maxwell3D-Tpetra_MPI_4 | Failed | Completed (Failed) | 0 | 12 | 7 | #6882 |

That is confusing even for me to look at, and it will also confuse automated tools such as those being written in https://github.com/trilinos/Trilinos/issues/3887.

One idea to address this problem is to add the option --extra-exclude-test-filters="field1=<field1>&compare1=<compare1>&value1=<value1>&..." that would be tacked onto the queryTests.php query fields for the global set of nonpassing tests as well as the test history queryTests.php query. However, the big disadvantage of this is that the test would now be listed as straight-up missing in the 'twim' file. That would be even more confusing than the current situation, where at least the status of these "missing" tests is shown as "Failed". I really don't want an automated tool to think that such tracked tests have been disabled or something, because they have not been. They have actually been run.
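To make the "tacked on" idea concrete, here is a minimal sketch of how the extra filter fields might be merged into an existing set of queryTests.php filter fields. It is not code from cdash_analyze_and_report.py: the function names are hypothetical, the compare codes in the usage lines are placeholders, and the sketch assumes that CDash expects the fieldN/compareN/valueN triples to be numbered consecutively alongside a matching filtercount field.

```python
from urllib.parse import parse_qsl, urlencode


def parse_filter_triples(filter_str):
    """Parse 'field1=..&compare1=..&value1=..&...' into (field, compare, value) triples.

    Hypothetical helper; assumes triples are numbered 1..N without gaps.
    """
    params = dict(parse_qsl(filter_str, keep_blank_values=True))
    triples = []
    idx = 1
    while f"field{idx}" in params:
        triples.append(
            (params[f"field{idx}"], params[f"compare{idx}"], params[f"value{idx}"])
        )
        idx += 1
    return triples


def build_filter_query(base_triples, extra_triples):
    """Renumber the base and extra triples together and return a URL query fragment."""
    all_triples = list(base_triples) + list(extra_triples)
    # Assumption: the filter count must match the number of triples that follow.
    pairs = [("filtercount", str(len(all_triples)))]
    for i, (field, compare, value) in enumerate(all_triples, start=1):
        pairs += [(f"field{i}", field), (f"compare{i}", compare), (f"value{i}", value)]
    return urlencode(pairs)


# Usage: the compare codes "65" and "64" are placeholders, not verified CDash codes.
base = [("buildname", "65", "Trilinos-atdm-")]
extra = parse_filter_triples("field1=details&compare1=64&value1=JSM")
query = build_filter_query(base, extra)
```

Renumbering matters because the user-supplied string starts at field1, which would otherwise collide with the filters the script already generates.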

Another idea is to change the "Status" of these special tests from "Failed" to "Missing/Failed" so the entry looks like:

Tests with issue trackers Missing: twim=10

| Site | Build Name | Test Name | Status | Details | Consecutive Missing Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
|---|---|---|---|---|---|---|---|---|
| vortex | Trilinos-atdm-ats2-cuda-10.1.243-gnu-7.3.1-spmpi-2019.06.24_static_dbg | MueLu_Maxwell3D-Tpetra_MPI_4 | Missing / Failed | Completed (Failed) | 0 | 12 | 7 | #6882 |

That would really catch your eye, and it would also allow automated tools such as those being written in https://github.com/trilinos/Trilinos/issues/3887 to handle these types of tests in a special way if needed.
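The status rule proposed here can be sketched in a few lines. This is only an illustration of the decision, not the script's actual implementation; the function name and its two arguments are hypothetical.

```python
def twim_status(in_outer_nonpassing, history_status_today):
    """Return the status string to display for a tracked test.

    in_outer_nonpassing: True if the test showed up in the outer (filtered)
        query of nonpassing tests.
    history_status_today: status from the test-history query for the current
        testing day, or None if the history has no entry for that day.
    """
    if in_outer_nonpassing:
        # Normal case: the test is listed as nonpassing, so show it as-is.
        return history_status_today
    if history_status_today == "Failed":
        # Filtered out of the outer query, but the history shows it ran and failed.
        return "Missing / Failed"
    # No nonpassing result anywhere for today: treat as plainly missing.
    return "Missing"
```

A downstream tool could then treat "Missing / Failed" as "ran, but was excluded by a filter" rather than "disabled".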

bartlettroscoe commented 4 years ago

Note that one thing that might help with this issue is when the expected_fail_regex field is added to the CSV file (so that multiple criteria can match the same test). We might consider adding special logic where, if a tracked built failing test is missing in the outer query of nonpassing tests but is shown to be failing once the test history is extracted, and the test output matches an expected_fail_regex, then it can be categorized with that issue tracker instead and sorted to the bottom of the email under the table "Tests with issue trackers allowed to fail Failed: twiatff = ???". In that case, nothing gets listed for that test under its original issue tracker. I think that would be the long-term solution to this problem (once I can get around to implementing the expected_fail_regex and allow_to_fail fields in the CSV file).
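As a rough sketch of that long-term idea: the expected_fail_regex and allow_to_fail fields are not implemented yet, so the row layout, the issue_tracker key, and the function name below are all assumptions made for illustration.

```python
import re


def categorize_missing_failed(test_output, tracker_rows):
    """Pick the table for a test that is missing from the outer query but failing in history.

    tracker_rows: list of dicts with hypothetical keys 'issue_tracker',
        'expected_fail_regex', and 'allow_to_fail'.
    Returns (table_name, issue_tracker); issue_tracker is None when no regex matches.
    """
    for row in tracker_rows:
        regex = row.get("expected_fail_regex")
        if regex and row.get("allow_to_fail") and re.search(regex, test_output):
            # Output matches a known allowed failure: re-home the test there.
            return ("twiatff", row["issue_tracker"])
    # No match: leave it in the 'twim' table under its original tracker.
    return ("twim", None)


# Usage with the system failure quoted earlier in this issue.
rows = [
    {"issue_tracker": "#6882", "expected_fail_regex": "", "allow_to_fail": False},
    {
        "issue_tracker": "#6861",
        "expected_fail_regex": r"Remote JSM server is not responding",
        "allow_to_fail": True,
    },
]
result = categorize_missing_failed(
    "Error: Remote JSM server is not responding on host vortex", rows
)
```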

Short term, I think we need to avoid some confusion and just change the status of these tests from "Failed" to "Missing / Failed" and then we can decide how to deal with these entries in the bot being implemented in https://github.com/trilinos/Trilinos/issues/3887.

bartlettroscoe commented 4 years ago

I pushed the commit 7a078b9 that changes the status of these special tracked tests from "Failed" to "Missing / Failed".

Putting this in review for now.

bartlettroscoe commented 4 years ago

The implementation in commit 7a078b9 being used in the TrilinosATDMStatus scripts has been working fairly well (at least for me). The one downside is that the table showing the number of failing tests in the last X days shows a false number of failures because known system failures are not being filtered out.

Therefore, I think the best situation would be to add the option --extra-exclude-test-filters="field1=<field1>&compare1=<compare1>&value1=<value1>&..." and use it in the global query of nonpassing tests but also use it for the test history link and queries. But it would be good to also get the test history without these extra filter fields, to show that the test was "Missing / Failed" like the current implementation does. And it would be nice to have access to both of these test history queries. The problem is how to do that. Perhaps we could attach the queryTests.php link without the extra query fields to the '0' in the "Consecutive Missing Days" column? Given that a test is both "missing" and "failed" with '0' consecutive missing days, clicking on that '0' might be a good way to provide access to this information.
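One way to picture the two history queries and the link on the '0' cell is sketched below. The base URL, function names, and compare codes are hypothetical, and the project and filtercount query fields are assumptions about the queryTests.php URL layout rather than something confirmed in this thread.

```python
from html import escape
from urllib.parse import urlencode


def history_url(cdash_base, project, filter_triples):
    """Build a queryTests.php URL from (field, compare, value) triples."""
    pairs = [("project", project), ("filtercount", str(len(filter_triples)))]
    for i, (field, compare, value) in enumerate(filter_triples, start=1):
        pairs += [(f"field{i}", field), (f"compare{i}", compare), (f"value{i}", value)]
    return f"{cdash_base}/queryTests.php?{urlencode(pairs)}"


def missing_days_cell(days, unfiltered_url):
    """Render the 'Consecutive Missing Days' value as a link to the unfiltered history."""
    return f'<a href="{escape(unfiltered_url, quote=True)}">{days}</a>'


# Placeholder filters; the compare codes are not verified CDash values.
base = [("testname", "61", "MueLu_Maxwell3D-Tpetra_MPI_4")]
extra = [("details", "64", "JSM")]

# Filtered history feeds the pass/fail counts; unfiltered history backs the '0' link.
filtered = history_url("https://cdash.example.org", "Trilinos", base + extra)
unfiltered = history_url("https://cdash.example.org", "Trilinos", base)
cell = missing_days_cell(0, unfiltered)
```

With this split, the "Non-pass Last 30 Days" count could come from the filtered query while the unfiltered query stays one click away.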

I really need to split off a new TriBITS Issue to implement --extra-exclude-test-filters in the way described in the paragraph above and close this issue!