teemtee / tmt

Test Management Tool
MIT License

Brainstorm-grade: storing test waivers as test metadata #506

Open t184256 opened 3 years ago

t184256 commented 3 years ago

I'm looking into several approaches for how trivial test errors could be automatically waived by examining the test output:

1) test-execution-env-agnostic, where a wrapper is executed instead of the test; it executes the test, captures the output, processes it against some waiving rules, and signals via return code whether the output is expected or not (i.e. whether human attention is needed);
2) env-integrated, where the test execution environment itself is modified to consult the waiving rules and classify the result into a richer, finer-grained set of statuses (+expected_fail, +unexpected_pass, +...);
3) post-processing, where an external tool is invoked to post-process the stored output, apply the waiving rules and tweak the results accordingly.
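The first, env-agnostic approach could be sketched roughly like this (a hypothetical wrapper, not part of tmt; the regex-based rule format and the FAIL-line detection are my assumptions, just to make the idea concrete):

```python
#!/usr/bin/env python3
"""Hypothetical sketch of approach 1: a wrapper runs the real test,
captures its output, checks it against waiving rules, and signals via
exit code whether human attention is needed. Not part of tmt."""
import re
import subprocess
import sys

# Assumed rule format: regexes that match expected (waived) failure lines.
WAIVING_RULES = [
    re.compile(r"FAIL .* should contain 'IBT'"),
    re.compile(r"FAIL .* should contain 'SHSTK'"),
]

def classify(output, returncode):
    """Return 0 when the test passed or failed in a fully expected way."""
    if returncode == 0:
        return 0
    failures = [line for line in output.splitlines() if 'FAIL' in line]
    unexpected = [line for line in failures
                  if not any(rule.search(line) for rule in WAIVING_RULES)]
    # Exit 0 only when there was at least one failure and all were waived
    return 0 if failures and not unexpected else 1

def run_wrapped(test_cmd):
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    return classify(result.stdout, result.returncode)

if __name__ == '__main__':
    sys.exit(run_wrapped(sys.argv[1:]))
```

A failed run whose every FAIL line is covered by some rule exits 0; anything else (unexpected failure lines, or a failure with no recognizable FAIL lines at all) exits 1 and gets flagged for review.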

All three approaches, while being wildly different implementation-wise, have the following in common: 1) the waiving rules have to be stored somewhere, unless hardcoded into a custom wrapper, and 2) the waiving rules would vary significantly between SuT versions and would very likely be subject to relevancy adjustments.

That makes me think it might be beneficial to treat waiving rules as test metadata, store them in fmf, and reuse the adjust mechanism to pick the applicable ones across different executions.

I'm currently running a feasibility study of sorts using the env-integrated approach, storing waivers alongside the tests, but not as test metadata yet. The aim at this stage is to explore how simple the rule semantics can be while still yielding a useful tool. If the whole ordeal turns out fruitful, I'll look into standardizing the format and, possibly, generalizing it into an env-agnostic approach.

All in all, I'll be raising this discussion in a more vocal manner if and when we decide that we want to buy into automated results waiving. Meanwhile, this ticket is just a forewarning that there might be demand to store test waiving rules in test metadata, and, if that turns out to be the case, those rules would be heavily 'adjust'ed. That, and a discussion prompt.

cc @tomato42, @ep69

thrix commented 3 years ago

@t184256 could we also note here the use cases for the waivers, which should also explain what is meant by trivial test errors, please?

milosmalik commented 3 years ago

Timeline of our use case: 1) bug XYZ is reported -> 2) an automated test which covers bug XYZ is created -> 3) bug XYZ is fixed -> 4) bug XYZ is tested and verified.

Very often, the following situation happens: the automated test gets executed before step 3. I would like to see some automatic waiver which analyses the environment and the test results and says that the failure is expected, because bug XYZ is not yet fixed (a build with the fix does not exist yet) or the environment contains an older build. Such an automatic waiver would greatly improve the experience in various CI environments, where this situation happens on a daily basis. I guess the metadata necessary for the decision making are: the bug XYZ number, the version+release of the installed build, and the version+release of the build with the fix.
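That decision could be sketched minimally like this (hypothetical helper names; real RPM version comparison would use librpm's NEVR comparison rules and handle epochs and alphanumeric segments, which this naive version skips):

```python
# Hypothetical sketch: waive a failure when the installed build predates
# the build carrying the fix for bug XYZ. Naive comparison: assumes
# purely numeric, dot-separated version/release segments.

def parse_vr(vr):
    """Split 'version-release' into comparable integer tuples."""
    version, _, release = vr.partition('-')
    return (tuple(int(part) for part in version.split('.')),
            tuple(int(part) for part in release.split('.')))

def failure_expected(installed_vr, fixed_in_vr):
    """True when the installed build is older than the fixing build,
    so the bug-XYZ test failing needs no human attention."""
    return parse_vr(installed_vr) < parse_vr(fixed_in_vr)
```

For example, `failure_expected('1.2.3-4', '1.3.0-1')` would classify the failure as expected, while the same test failing on `1.3.0-1` or newer would still demand attention.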

t184256 commented 3 years ago

@milosmalik, you've captured it well. There are scenarios where it's worth executing the test even though it fails, and then ensuring it fails in a consistent, well-known manner: the bug not being fixed yet, an issue caused by a component outside your control, verifying that old trusty bugs don't go anywhere on old versions... =) Modifying the test itself to account for all of this quickly leads to spaghetti code, concealing the original flow. Analyzing the logs separates the analysis from the execution.

Though I'd prefer to limit the decision to component version, distro version, and other axes already present in fmf.Context. Accessing bugtrackers is too fragile for my tastes.

An example I have at hand: a library tries to enable Intel CET support, but one of its dependencies doesn't. The test fails, outputting the following lines among others:

:: [ 16:21:42 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.kztIj4Cp' should contain 'IBT'
...
:: [ 16:21:42 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.LF1Di5Fy' should contain 'SHSTK'
...
:: [ 16:21:42 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.kztIj4Cp' should contain 'IBT'
...
:: [ 16:21:42 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.LF1Di5Fy' should contain 'SHSTK'

but it's not the library's fault and there's nothing we can or should do. The library is fine. The test is fine. IMO, this is a good candidate for solidifying the failure and explaining why it's acceptable, so that we don't spend our precious review time on it, with rules similar to these:

adjust:
    expect:
      - pattern: |
          :: [ 00:00:00 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.XXXXXXXX' should contain 'IBT'
        times: 2
      - pattern: |
          :: [ 00:00:00 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.XXXXXXXX' should contain 'SHSTK'
        times: 2
    when: distro ~< fedora-35
    because: Fedora <=35 doesn't have CET enabled yet

This way, the test execution system won't flag this test for our attention - we know about this situation - until something changes. When something changes (extra failures crop up, these ones disappear, or we start running it on Fedora 35...) - it's re-review time.
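An evaluator for such 'expect' rules could look roughly like this (my sketch, not an existing tmt feature; I use regexes here and assume the timestamp and tempfile placeholders from the rules above would first be normalized into patterns):

```python
import re

# Hypothetical evaluator for 'expect' rules: each rule carries a pattern
# and the exact number of times it must match in the captured output.
def matches_expectations(output, rules):
    """True when every pattern matches exactly the required number of
    lines and no unwaived FAIL lines remain."""
    lines = output.splitlines()
    waived = set()
    for rule in rules:
        pattern = re.compile(rule['pattern'])
        hits = {i for i, line in enumerate(lines) if pattern.search(line)}
        if len(hits) != rule['times']:
            return False  # the failure count changed: re-review time
        waived |= hits
    # Any FAIL line not covered by a rule still needs attention
    return not any('FAIL' in line for i, line in enumerate(lines)
                   if i not in waived)

# Regex-ified form of the two rules from the example above
rules = [
    {'pattern': r"FAIL.*should contain 'IBT'", 'times': 2},
    {'pattern': r"FAIL.*should contain 'SHSTK'", 'times': 2},
]
```

The exact-count check is what triggers re-review both when the expected failures disappear and when extra ones crop up.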

thrix commented 3 years ago

Yeah, got it, thanks for sharing your use cases. It is basically expected-failure-like functionality. I like that it would be placed directly with the test case.

tomato42 commented 3 years ago
adjust:
    expect:
      - pattern: |
          :: [ 00:00:00 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.XXXXXXXX' should contain 'IBT'
        times: 2
      - pattern: |
          :: [ 00:00:00 ] :: [   FAIL   ] :: File '/var/tmp/rlRun_LOG.XXXXXXXX' should contain 'SHSTK'
        times: 2
    when: distro ~< fedora-35
    because: Fedora <=35 doesn't have CET enabled yet

I think that shell globbing (? for a single character and * for multiple characters, with escapes via \? and \*) would be more user-friendly than X and 0 (and I think regexes would be overkill and problematic from an escaping PoV).
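For illustration, Python's fnmatch module already covers the ? and * part of that proposal, though not the proposed \? and \* escapes (and it treats literal [ ] as character classes), so a full implementation would need a small custom translator. A sketch of the idea:

```python
import fnmatch
import re

# Sketch: count log lines matching a shell-glob pattern. Python's fnmatch
# supports ? (single char) and * (any run of chars), but not backslash
# escapes, and [ ] are interpreted as character classes.
def count_matches(output, glob_pattern):
    regex = re.compile(fnmatch.translate(glob_pattern))
    return sum(1 for line in output.splitlines() if regex.match(line))
```

With the log above, a pattern like `*FAIL*rlRun_LOG.????????' should contain 'IBT'` uses ? to stand in for the eight random characters of the temp file name and * to skip the timestamp prefix.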