Cleaner failure reports

zormit commented 3 months ago

This PR resolves #61.

disables traceback when not set to verbose
makes path to file relative (on compile_failure)

zormit commented 3 months ago

I have tried the suggested way of overwriting TestTextResult like this (just a POC):

index 227be0a..4eb8d24 100644
--- a/test_framework/runner.py
+++ b/test_framework/runner.py
@@ -24,6 +24,16 @@ from test_framework.tacky.common import CHAPTER as TACKY_OPT_CHAPTER
 from test_framework.tacky.suite import Optimizations

+class MyTextTestResult(unittest.TextTestResult):
+    def addFailure(self, test, err):
+        if self.showAll:
+            self._write_status(test, "FAIL")
+        elif self.dots:
+            self.stream.write("F")
+            self.stream.flush()
+
+
 def get_optimization_flags(
     latest_chapter: int,
     optimization_opt: Optional[test_framework.tacky.suite.Optimizations],
@@ -522,7 +532,11 @@ def main() -> int:
     unittest.installHandler()

     # run it
-    runner = unittest.TextTestRunner(verbosity=args.verbose, failfast=args.failfast)
+    runner = unittest.TextTestRunner(
+        verbosity=args.verbose,
+        failfast=args.failfast,
+        resultclass=MyTextTestResult,
+    )
     result = runner.run(test_suite)
     if result.wasSuccessful():
         return 0

The problem is this, though:

Cannot access attribute "_write_status" for class "MyTextTestResult*" Attribute "_write_status" is unknown

and I didn't want to open up the whole thing... If you see an easier way of making this happen, let me know. The chosen method of disabling all stacktrace might be a bit of overkill. Maybe it also could work like this:

class MyTextTestResult(unittest.TextTestResult):
    def addFailure(self, test, err):
        old_tblim = sys.tracebacklimit
        sys.tracebacklimit = 0
        super(MyTextTestResult, self).addFailure(test, err)
        sys.tracebacklimit = old_tblim

but in my test that crashed loudly with

AttributeError: module 'sys' has no attribute 'tracebacklimit'

and left me confused.

nlsandler commented 3 months ago

Thanks for opening this PR! I'll take a look at the TextTestResult solution and see if there's an easy way to get that working. There are actually several other spots where we report the name of the source file on failure, including build_error_message, compile_success, common.build_msg and probably a few others. Would you be willing to update those too? I think you should be able to find any others by just grepping for self.assert.

(Some error messages don't include a filename at all and probably should, but that can wait for another PR.)

nlsandler commented 3 months ago

Your POC TextTestResult code seems to work for me - what Python version are you using?

zormit commented 3 months ago

Would you be willing to update those too?

yes, looking at this now.

Your POC TextTestResult code seems to work for me - what Python version are you using?

3.11.4 is the version it failed. I just also installed and tried 3.12.4, which failed similarly. Maybe your code looks subtly different than mine?

zormit commented 3 months ago

I added relative paths to:

build_error_message
- Here I wasn't really sure how it should look like. I now added the relative source file, and left the absolute exe_name. I currently don't have the ability to try this out and see how I like it -- maybe it's better to turn exe_name into a relative one instead.
compile_success
common.build_msg
validate_no_output
library_test_helper
copy_prop.retval_test

I'm now not anymore entirely sure whether using relative paths (to TEST_DIR) is a good idea... The upside of an absolute path is that it's very portable. Another "portable" way would be to print the path relative to where it was called, then I can directly copy it to a tool as well without thinking too much where to find it.

Also, as written in the first sub-bullet, I haven't really tried all my changes. I'm not far enough with the content to test all the scenarios. Could still work, but please really double check my code 😅

nlsandler commented 3 months ago

Thanks for updating this!

I was able to reproduce your issue with MyTextTestResult in Python 3.9.13 (the default Python version on my Mac). It looks like _write_status was added in a later Python version. I'm not sure why you're seeing this issue in more recent versions (it sounds like you might somehow being using the standard library from the system Python...or something?) but the bottom line is we need to either avoid _write_status or provide a fallback for Python versions that don't support it. This seems to work in all versions, though I'd need to test it further to be sure:
```
class MyTextTestResult(unittest.TextTestResult):

def addFailure(self, test: Any, err: Any) -> None:
    super(MyTextTestResult, self).addFailure(test, (err[0], err[1], None))
```
The err argument is a tuple (type, value, traceback), so I've just replaced the traceback with None. Want to update the PR to give that a try, or should we deal with tracebacks in a separate PR?

I see what you mean about full paths being more portable. What do you think of using the relative path as a test name/description, and displaying the absolute path in the error message, like this:


======================================================================
FAIL: chapter_1/invalid_lex/return_2.c
----------------------------------------------------------------------
AssertionError: Didn't catch error in /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/invalid_lex/return_2.c

====================================================================== FAIL: chapter_1/valid/multi_digit.c

AssertionError: Incorrect behavior in /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/multi_digit built from /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/multi_digit.c

Bad return code: expected 100 and got 0

====================================================================== FAIL: chapter_1/valid/return_2.c

AssertionError: Incorrect behavior in /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/return_2 built from /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/return_2.c

Bad return code: expected 2 and got 0

Ran 28 tests in 8.494s

FAILED (failures=3)


  We can do this by 1) setting each test's docstring to be the file's relative path:
```python
   index f6812e6..dc401fc 100644
--- a/test_framework/basic.py
+++ b/test_framework/basic.py
@@ -596,6 +596,8 @@ def make_invalid_test(program: Path) -> Callable[[TestChapter], None]:
     def test_invalid(self: TestChapter) -> None:
         self.compile_failure(program)

+    test_invalid.__doc__ = str(program.relative_to(TEST_DIR))
+
     return test_invalid

@@ -725,6 +727,7 @@ def make_valid_tests(
             else:
                 # for stages besides "run", just test that compilation succeeds
                 test_method = make_test_valid(program)
+            test_method.__doc__ = str(program.relative_to(TEST_DIR))
             tests.append((test_name, test_method))
     return tests

and 2) overriding MyTextTestResult.getDescription to just return the short description, which is the docstring (the default implementation returns test name + short description):

class MyTextTestResult(unittest.TextTestResult):

    def addFailure(self, test: Any, err: Any) -> None:
        super(MyTextTestResult, self).addFailure(test, (None, err[1], None))

    def getDescription(self, test: unittest.TestCase) -> str:
        return test.shortDescription() or "MISSING"

In build_error_message, I think you're right to include the names of both the exe and source file. If we're giving the full path for source files in other tests (e.g. using my idea from the previous bullet point) I'd use them here too. We can just use a relative path (or even just the name without a path) for the executable - it gets deleted during teardown anyway so there's no point in including a full path to it.

Side note - it might be nice to have an option to keep the executable for failed tests, like we have an option to keep the assembly file - in which case printing the full path to that executable would make sense. But let's save that for a follow-up PR.

nlsandler commented 2 months ago

Thanks for updating this! I left some nitpicky comments (and we'll need to set docstrings for the Part III tests) but on the whole this is looking good! And not squashing commits yet is useful, thank you.

It would be helpful to have a way to re-enable stack traces. We could either add a new flag, or turn them on when the verbosity level is 3 or higher. (Increasing the verbosity past 2 doesn't impact the default test runner behavior, so people could still use -vv to get the normal verbose output.) Including that in this PR would be great, but I'm also fine with leaving it for a follow-up PR if you don't want to deal with it.

nlsandler / writing-a-c-compiler-tests

Cleaner failure reports #62

====================================================================== FAIL: chapter_1/valid/multi_digit.c

====================================================================== FAIL: chapter_1/valid/return_2.c