nlsandler / writing-a-c-compiler-tests

Test cases for Writing a C Compiler
https://nostarch.com/writing-c-compiler
Other
172 stars 25 forks source link

Cleaner failure reports #62

Open zormit opened 3 months ago

zormit commented 3 months ago

This PR resolves #61.

zormit commented 3 months ago

I have tried the suggested way of overwriting TestTextResult like this (just a POC):

index 227be0a..4eb8d24 100644
--- a/test_framework/runner.py
+++ b/test_framework/runner.py
@@ -24,6 +24,16 @@ from test_framework.tacky.common import CHAPTER as TACKY_OPT_CHAPTER
 from test_framework.tacky.suite import Optimizations

+class MyTextTestResult(unittest.TextTestResult):
+    def addFailure(self, test, err):
+        if self.showAll:
+            self._write_status(test, "FAIL")
+        elif self.dots:
+            self.stream.write("F")
+            self.stream.flush()
+
+
 def get_optimization_flags(
     latest_chapter: int,
     optimization_opt: Optional[test_framework.tacky.suite.Optimizations],
@@ -522,7 +532,11 @@ def main() -> int:
     unittest.installHandler()

     # run it
-    runner = unittest.TextTestRunner(verbosity=args.verbose, failfast=args.failfast)
+    runner = unittest.TextTestRunner(
+        verbosity=args.verbose,
+        failfast=args.failfast,
+        resultclass=MyTextTestResult,
+    )
     result = runner.run(test_suite)
     if result.wasSuccessful():
         return 0

The problem is this, though:

Cannot access attribute "_write_status" for class "MyTextTestResult*" Attribute "_write_status" is unknown

and I didn't want to open up the whole thing... If you see an easier way of making this happen, let me know. The chosen method of disabling all stacktrace might be a bit of overkill. Maybe it also could work like this:

class MyTextTestResult(unittest.TextTestResult):
    def addFailure(self, test, err):
        old_tblim = sys.tracebacklimit
        sys.tracebacklimit = 0
        super(MyTextTestResult, self).addFailure(test, err)
        sys.tracebacklimit = old_tblim

but in my test that crashed loudly with

AttributeError: module 'sys' has no attribute 'tracebacklimit'

and left me confused.

nlsandler commented 3 months ago

Thanks for opening this PR! I'll take a look at the TextTestResult solution and see if there's an easy way to get that working. There are actually several other spots where we report the name of the source file on failure, including build_error_message, compile_success, common.build_msg and probably a few others. Would you be willing to update those too? I think you should be able to find any others by just grepping for self.assert.

(Some error messages don't include a filename at all and probably should, but that can wait for another PR.)

nlsandler commented 3 months ago

Your POC TextTestResult code seems to work for me - what Python version are you using?

zormit commented 3 months ago

Would you be willing to update those too?

yes, looking at this now.

Your POC TextTestResult code seems to work for me - what Python version are you using?

3.11.4 is the version it failed. I just also installed and tried 3.12.4, which failed similarly. Maybe your code looks subtly different than mine?

zormit commented 3 months ago

I added relative paths to:

I'm now not anymore entirely sure whether using relative paths (to TEST_DIR) is a good idea... The upside of an absolute path is that it's very portable. Another "portable" way would be to print the path relative to where it was called, then I can directly copy it to a tool as well without thinking too much where to find it.

Also, as written in the first sub-bullet, I haven't really tried all my changes. I'm not far enough with the content to test all the scenarios. Could still work, but please really double check my code 😅

nlsandler commented 3 months ago

Thanks for updating this!

====================================================================== FAIL: chapter_1/valid/multi_digit.c

AssertionError: Incorrect behavior in /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/multi_digit built from /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/multi_digit.c

====================================================================== FAIL: chapter_1/valid/return_2.c

AssertionError: Incorrect behavior in /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/return_2 built from /absolute/path/to/writing-a-c-compiler-tests/tests/chapter_1/valid/return_2.c


Ran 28 tests in 8.494s

FAILED (failures=3)


  We can do this by 1) setting each test's docstring to be the file's relative path:
```python
   index f6812e6..dc401fc 100644
--- a/test_framework/basic.py
+++ b/test_framework/basic.py
@@ -596,6 +596,8 @@ def make_invalid_test(program: Path) -> Callable[[TestChapter], None]:
     def test_invalid(self: TestChapter) -> None:
         self.compile_failure(program)

+    test_invalid.__doc__ = str(program.relative_to(TEST_DIR))
+
     return test_invalid

@@ -725,6 +727,7 @@ def make_valid_tests(
             else:
                 # for stages besides "run", just test that compilation succeeds
                 test_method = make_test_valid(program)
+            test_method.__doc__ = str(program.relative_to(TEST_DIR))
             tests.append((test_name, test_method))
     return tests

and 2) overriding MyTextTestResult.getDescription to just return the short description, which is the docstring (the default implementation returns test name + short description):

class MyTextTestResult(unittest.TextTestResult):

    def addFailure(self, test: Any, err: Any) -> None:
        super(MyTextTestResult, self).addFailure(test, (None, err[1], None))

    def getDescription(self, test: unittest.TestCase) -> str:
        return test.shortDescription() or "MISSING"
nlsandler commented 2 months ago

Thanks for updating this! I left some nitpicky comments (and we'll need to set docstrings for the Part III tests) but on the whole this is looking good! And not squashing commits yet is useful, thank you.

It would be helpful to have a way to re-enable stack traces. We could either add a new flag, or turn them on when the verbosity level is 3 or higher. (Increasing the verbosity past 2 doesn't impact the default test runner behavior, so people could still use -vv to get the normal verbose output.) Including that in this PR would be great, but I'm also fine with leaving it for a follow-up PR if you don't want to deal with it.