racket / rackunit

Other
18 stars 34 forks source link

Treat exceptions thrown by arguments to check as test failures #107

Closed vkz closed 4 years ago

vkz commented 5 years ago

Problem

Rackunit family of check macros, as they are implemented atm, exhibit an unfortunate behavior where they lose all context and report no location of failed checks when the code being tested raises exceptions. Picture a typical rkt source file with test-modules freely interspersed with code. In a check like this:

(check eq? (f) 42)

if f ever throws, the exception will be propagated to the programmer without much detail and worse without any indication as to which check (perhaps of hundreds) raised an error. Location is neither captured nor reported. Triaging such failures quickly turns into a bisecting expedition that may be a lot of work when your have many tests in the same file.

Here's the mechanics as I understand it. Rackunit checks are defined with define-check macro, which takes a bunch of arguments and a check body that constitutes the computation that decides whether the check succeeds or fails. The macro constructs a function that performs said computation with the body wrapped in additional context e.g. location and name of the check, so that installed handlers can communicate the failure to the programmer. That usually works well, however, since checks are really glorified functions they eval their arguments eagerly. As it happens in your typical check most of the user code runs before it ever reaches the body. During the evaluation of the arguments to the check function no context is collected, nor does the branch that handles such errors attempt to report anything beyond the error message (see display-test-failure/error in format.rkt).

This issue has been discussed at some length on the mailing list: https://groups.google.com/d/msg/racket-users/aCQwqCTY42U/NRZ_AU_YBwAJ

Proposed solution

Delay argument evaluation in checks by wrapping each in a thunk, then force the thunks in the body of a check so that any expression passed to the check runs within a context that can be reported to the programmer.

I believe this won't break your typical use case of rackunit and no user code would have to change. This may not be true of any library that builds on top of rackunit.

vkz commented 5 years ago

Example code:

(require rackunit)

(define (f)
  (raise-result-error 'f 42 'none-given))

(check eq? (f) 42)
(check eq? 41 (+ 41 1))

Error report:

--------------------
; ERROR
name:       check
location:   play.rkt:12:0
params:     '(check eq? (f) 42)

raise-result-error: contract violation
  expected: string?
  given: 42
  argument position: 2nd
  other arguments...:
   'f
   'none-given
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:14:0
name:       check
location:   play.rkt:14:0
params:     '(check eq? 41 (+ 41 1))
--------------------

note the location for the ERROR case.

bennn commented 5 years ago

This changes the "checks are functions" idea that the docs suggest.

Currently, the only difference between:

(check-equal? A B)

and

(let ((f check-equal?))
  (f A B))

is that the second one gives a different source location. With this PR, the second one is also going to lose the context.

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

p.s. #97

vkz commented 5 years ago

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

;; check in identifier position
(let ((f check-equal?))
  (f 41 42)
  (f 1 2))

=>

--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     41
expected:   42
--------------------
--------------------
; FAILURE
; /Users/russki/Code/fcgi-rkt/play.rkt:18:9
name:       check-equal?
location:   play.rkt:18:9
actual:     1
expected:   2
--------------------

with or without PR. Location there is pointing at the (f check-equal?) bit. I get the idea that you'd rather have it point at the use site, which is in the body of let, but it didn't even before this PR. In fact, unless I misunderstand the changes I made, the case with checks in macro id position (that's what allows you to "pass them as function values" should behave exactly as they used, that's because (a) I never touched that part of code and (b) macro id will have been expanded by the time the result is used, so the args remain unwrapped and the whole thing behaves exactly as it used to.

I can think of only one corner case where check macro as id may "misbehave" due to this PR. If you define a custom check that expects thunks and then does something with those thunks inside the body. IIUC, the following line from the PR would inadvertently force them - probably not something the user intended:

(set! formal (if (thunk? formal) (formal) formal))

I don't know how likely this corner case, probably not unlikely. Don't know. We can probably work around that by wrapping in a callable struct whose type we can identify, then check for that struct type rather than thunk. Easy fix, IMO.

I'm not insisting on the proposed solution, but OMG the current behavior is infuriating, to the point that it discourages me to write or rely on tests. This brought to you by me trying to figure out which test in the source file triggers an exception by essentially commenting things out, moving code and tests and code around every single time that happens. Racket isn't like Clojure, Elisp or CL in that I can't just willy-nilly eval tests one by one until I discover which'd failed. Even if I could why do something machine can do better?

Did I misunderstand your example above?

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

I don't think you can go more fine grained than (check eq? (f) value) and it would still exhibit the same problem. How would smaller tests help?

rfindler commented 5 years ago

I agree about the backward compatibility concerns here. But it would be nice to have some that do capture exceptions. Can we find new names?

Robby

On Sun, Apr 7, 2019 at 11:22 PM Ben Greenman notifications@github.com wrote:

This changes the "checks are functions" idea that the docs suggest.

Currently, the only difference between:

(check-equal? A B)

and

(let ((f check-equal?)) (f A B))

is that the second one gives a different source location. With this PR, the second one is also going to lose the context.

I'd rather we update the docs to talk about this gotcha with exceptions, and encourage people to write more & smaller tests.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/racket/rackunit/pull/107#issuecomment-480678298, or mute the thread https://github.com/notifications/unsubscribe-auth/AAYWsDFS3Ewe5J1zeFvbgyiI98CLUlakks5vesP0gaJpZM4cg9cd .

mfelleisen commented 5 years ago

Our experience with checking for beginners (test-engine/) shows that we want to deal with exceptions properly here. The idea of check-*s as functions might have looked appealing at some point, but I fail to see any in it.

I think we should create rackunit2, have it import all of rackunit, and re-export macro-based checkers with the exact same names. Then we should deprecate rackunit eventually and point people to rackunit2.

rmculpepper commented 5 years ago

In rackunit, the check forms are meant to act like functions. The test forms are the things that wrap evaluation, catch errors and continue, etc. If you write

(test-equal? "anon" 1 (/ 0))

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

One obstacle to using the test forms is the extra verbosity of giving every test a name. Here's a proposal: let's allow the identifier _ in the test name position, in which case the test macro synthesizes a name like "anonymous test at file:line".

vkz commented 5 years ago

it catches the error, prints a message, and continues. So I recommend leaving checks alone and using the test forms (like test-equal?) instead.

ATM the only indication you get from the (test-equal? "anon" 1 (/ 0)) example you suggested is the anon name - no location given. Definitely an improvement over check. With anonymous tests you'd absolutely want to synthesize that location else you're back to the problem we have with checks. But then why not do it always, anonymous or not? Of course "tests are functions" make it problematic.

TBH I get why people would hesitate about what I suggested. It solves a real problem, but then it is a hack and may break people upstream. Dunno, you guys make the call. I'll make do with my patch for now lest I go crazy.

bennn commented 5 years ago

Hey Ben.

I understand your concern, but tbh not seeing what you're seeing. Have you tried running your example?

f is not going to catch exceptions as well as check-equal?

AlexKnauth commented 5 years ago

Maybe there's a less extreme version of this pull request that still achieves your goals of catching errors thrown by arguments, but doesn't need to change make-check-func and change the behavior of checks applied to thunks.

Within the output of the define-check macro, the check-impl and make-check-func should stay the same, but the define-syntax name can still change to put the argument evaluation into a context that creates a test failure, before calling the check-impl function.

In other words it shouldn't be check-impl / make-check-func's job to catch argument-evaluation failures since those should be functions that deal with already-evaluated values. But it can be the define-syntax name's job.

jackfirth commented 5 years ago

I second @mfelleisen's proposal to abandon the notion that check forms are just functions. It places far too many restrictions on the check API, and these restrictions make it difficult to provide important quality-of-life features like this pull request and #73. I think rackunit2 is a good idea.

AlexKnauth commented 4 years ago

The issue this PR was written to solve has been fixed by https://github.com/racket/rackunit/pull/109, so I am closing this PR