DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Tests should not log tracebacks for expected exceptions #1568

Open jessebrennan opened 4 years ago

natanlao commented 4 years ago

@jessebrennan Would it be sufficient to pass the -b flag to unittest in the Makefile, so that stdout and stderr are buffered for all tests and only printed in the event of a failure (i.e., passing tests are altogether silent)? The only consequence I can anticipate is that this approach would also suppress warnings, but given that unexpected warnings should result in a test failure (if I am reading azul_test_case.py correctly), I don't see that as a problem.
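For reference, the -b flag has a programmatic counterpart, unittest's buffer option; a minimal sketch of what that would look like in a test entry point:

```python
import unittest

if __name__ == '__main__':
    # buffer=True is the programmatic equivalent of the -b flag:
    # per-test stdout and stderr are captured and echoed only when
    # a test fails or errors, so passing tests stay silent.
    unittest.main(buffer=True)
```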

jessebrennan commented 4 years ago

This is a good idea, but I wouldn't feel qualified to decide, since this issue is based on @hannes-ucsc's request.

natanlao commented 4 years ago

Conversation during standup: the proposed solution is okay, but has the caveat that it would become harder to debug tests that time out and are killed by the test runner (and not unittest). Looking for another solution, but if I can't find one, -b might be fine.

natanlao commented 4 years ago

@hannes-ucsc After digging into this, I've found two main options:

  1. Use unittest's -b flag to buffer all output from tests, and only print test output in the event of a failure. There are two immediate caveats with this approach:
    • unittest accomplishes this buffering by swapping out stdout and stderr, so some configuration of the tests' logging will be necessary to ensure that log messages are routed through the replacement streams. There's precedent for this.
    • As you mentioned, this approach could obscure output from tests that time out.
  2. Identify tests that log expected exceptions and individually suppress that logging, potentially with a context manager (see the sketch after this list).
    • Depending on how this approach is executed, there's a chance that useful information could be suppressed.
    • This approach would require manual changes in new or revised test cases.
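For illustration, a minimal sketch of the context manager in option 2 (the helper and the logger name are hypothetical, not existing Azul code):

```python
import logging
from contextlib import contextmanager

@contextmanager
def suppressed_tracebacks(logger_name):
    # Temporarily raise the named logger's level so that ERROR
    # records (which carry the tracebacks of expected exceptions)
    # are rejected while the block runs.
    logger = logging.getLogger(logger_name)
    original_level = logger.level
    logger.setLevel(logging.CRITICAL)
    try:
        yield
    finally:
        logger.setLevel(original_level)
```

A test would then wrap the offending call, e.g. putting with suppressed_tracebacks('azul'): around an assertRaises block.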

I believe the first option is the better of the two: it appears less complex to implement and enforce.

With the first option, output buffering could be made configurable, so that it could be disabled in a drop! commit on a PR that is having problems. In that case, debugging information would only be lost for tests that fail stochastically due to a timeout. (I can think of a couple of ways to mitigate that last issue: conditionally exempting some tests from output buffering, manually setting a timeout on certain tests, etc.)
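A minimal sketch of that toggle, assuming a hypothetical AZUL_UNBUFFERED_TESTS environment variable:

```python
import os
import unittest

# Buffering is on by default; setting AZUL_UNBUFFERED_TESTS (e.g. in
# a drop! commit, or locally while chasing a hanging test) shows
# test output live again.
buffer = 'AZUL_UNBUFFERED_TESTS' not in os.environ

if __name__ == '__main__':
    unittest.main(buffer=buffer)
```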

hannes-ucsc commented 4 years ago

I don't want to buffer test output. It may be easier to implement, but that doesn't outweigh the hassle of having to switch it on and off with drop commits when tests hang, or when I want to follow along on GitLab or GitHub (which I do all the time to get an idea of the timing). When child processes write to stdout or stderr, their output is not buffered and appears out of context in the event of a failure. Last but not least, -b doesn't work with PyCharm, which many of us use extensively to run and debug tests. I'd prefer not to see tracebacks for expected exceptions when running tests in PyCharm.

I don't quite follow the complexity argument. Complexity isn't proportional to effort: it may be more tedious to track down every expected exception, but that doesn't make that solution inherently complex or complicated. Did you attempt to implement the context manager option so we can actually compare the complexity of these approaches? If not, I'd like you to go ahead and do that.

natanlao commented 3 years ago

Look at how self.assertLogs works.
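For context, a minimal sketch of how assertLogs applies here (the test and the logger name are assumptions): it temporarily replaces the logger's handlers with a capturing one, so the traceback of an expected exception never reaches the console, and the test can assert that it was logged.

```python
import logging
import unittest

class ExpectedExceptionTest(unittest.TestCase):

    def test_expected_exception(self):
        logger = logging.getLogger('azul')  # hypothetical logger name
        with self.assertLogs(logger, level='ERROR') as cm:
            try:
                raise ValueError('expected failure')
            except ValueError:
                # Logged at ERROR with exc_info; captured, not printed.
                logger.exception('Expected exception occurred')
        # The captured record, traceback included, is available for
        # assertions instead of cluttering the test output.
        self.assertIn('Expected exception occurred', cm.output[0])

if __name__ == '__main__':
    unittest.main()
```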