fediverse-devnet / feditest-tests-fediverse

The tests for the fediverse testsuite
MIT License
5 stars 4 forks source link

Test failures vs errors (WebFinger) #59

Closed steve-bate closed 2 weeks ago

steve-bate commented 1 month ago

Many test frameworks distinguish between test failure (code-under-test ran but didn't satisfy "pass" criteria) and errors, where the test framework itself (including node drivers, etc.) has a bug that triggers an exception.

Currently, feditest raises AssertionError for failed assertions. This is clearly a failure. However, the current WebFinger client seems to do internal validation of content type and HTTP status code (any maybe other checks?) and raises exceptions that are not an AssertionError.

I've been adding code to filter AssertionError (test failure) stack traces when not running with verbose logging. However, there's no way to know if these other exceptions represent test failures or errors. We've discussed that we want to log test framework-related errors even without verbose logging.

There are several ways we could address this issue. As examples... We could not throw the exceptions from the node driver and let the test decide if the response is valid or not. We could throw AssertionError from the framework/node for the failure conditions. We could catch the current exceptions in the test itself and then raise an AssertionError.

jernst commented 1 month ago

I suggest we do https://github.com/fediverse-devnet/feditest/issues/83 and then revisit this.

jernst commented 1 month ago

Now we have this distinction:

Does this cover it?

steve-bate commented 1 month ago
  • Python's AssertionError is for programming errors in the test scripts. Both python's assert and Hamcrest's assert_that raise this AssertionError so Hamcrest's assert_that is now a bit pointless.

Python tests in libraries like unittest and pytest raise AssertionError for test failures rather than programming errors. The pytest library also uses assert (with code rewriting to get better failure message). I assume that's why the Hamcrest library raises that exception from assert_that.

This is a somewhat different usage/convention for assert than for languages like C or C++.

I'd think that any raised AssertionError exception should be considered a test failure and anything else would be an programming error (yes, the exception name isn't ideal, but it is what it is, and probably was using Python's "Error" suffix rather than being name specific to unit testing uses).

That said, since this is a custom test framework, you can use whatever alternate convention you want. However, it seems like it would force a whole custom test assertion implementation if one isn't using AssertionError. It may also be a bit counterintuitive for experienced Python programmers.

Another option is to derive a FeditestTestFailure from AssertionError for cases where you'll throw the exception explicitly (or define your own test assertion framework). Something like (notional, only)...

enum FailureSeverity(StrEnum):
  HARD = "hard"
  SOFT = "soft"

class FeditestTestFailure(AssertionError):
  def __init__(self, *args, severity: FailureSeverity =FailureSeverity.HARD, degraded: bool = False, **kwargs):
     super().__init__(*args, **kwargs)
     self._severity = severity
     self._degraded = degraded

  @property
  def severity(self) -> FailureSeverity:
    return self._severity

  # . . . etc.
jernst commented 1 month ago

Seems either approach can produce the same result? If so, does it matter?

steve-bate commented 1 month ago

Seems either approach can produce the same result? If so, does it matter?

It could matter in a limited way since it's unconventional in a Python context. It may make it harder for some devs to understand the implementation. Also, unless we are careful to always catch and rethrow a different exception, most (practically all?) programming errors will not be AssertionError subtypes.

jernst commented 1 month ago

Other exceptions are already caught and reported similar to AssertionErrors. Screenshot 2024-05-25 at 08 59 16

I rate the remainder as a "nice to have" :-) Let's revisit this when we have nothing else to do or somebody indeed gets majorly confused.

jernst commented 2 weeks ago

I'm going to claim that https://github.com/fediverse-devnet/feditest/issues/166 subsumes this. If not, please file new issue.