Reported args for failure case shows functions instead of values when using for-all

dparis commented 9 years ago

When a test fails using the test.chuck for-all macro, the failure report data lists functions as the failing arguments. Here's a simple repro case and the resulting failure report:

(test-check/quick-check 20
 (test-chuck-prop/for-all [x test-check-gen/pos-int] (< 0 x)))

{:result false
 :seed 1423884093065
 :failing-size 0
 :num-tests 1
 :fail [#<utils_test$eval18115$fn__18118$fn__18119 hunter.utils_test$eval18115$fn__18118$fn__18119@7ff7aa49>]
 :shrunk {:total-nodes-visited 0
          :depth 0 
          :result false
          :smallest (#<utils_test$eval18115$fn__18118$fn__18119 hunter.utils_test$eval18115$fn__18118$fn__18119@7ff7aa49>)}}

I looked over the macro expansion, but I wasn't familiar enough with the test.check internals to notice anything obviously wrong. Hopefully it's a quick fix!

gfredericks commented 9 years ago

oh this is interesting; obviously the current behavior is unhelpful but it's not obvious what the correct behavior is here -- it seems weird to me to just use a tuple like c.t.c.p/for-all does, since multiple clauses are no longer peers -- i.e., for

(test-chuck-prop/for-all [count gen/pos-int, xs (gen/vector gen/pos-int count)] ...)

it would strike me as strange to see :fail [3 [1 2 3]]; and it would get even weirder if you start using :parallel.

But anyhow an alternative idea that I'm also not sure about is to figure out exactly which symbols are bound in all the expressions and generate a map based on that. So in the above example you'd end up with:

:fail [{count 3, xs [1 2 3]}]

If you can't think of any reason that's terrible or a clearly better idea, I'd be okay pushing that out just on the merits of it being strictly more useful than what it does currently.

gfredericks commented 9 years ago

This commit has code that does what I'm proposing.

dparis commented 9 years ago

Yeah, I think that's a sensible approach, or at least something like it would be. Only concern I might have, if full compatibility with test.check is desirable, would be tools which expect the tuple output of :fail and :shrunk. Outputting a map in place of the bare values would seem to collide ambiguously with cases where the first generated value bound is also a map. Maybe if the output looked like something like this:

[{::for-bindings true  ; namespaced keyword shows this is a bindings map rather than the first bound value
  count 3
  xs [1 2 3]
  :let {let-binding-a "qwerty"
        let-binding-b "asdf"}}]

That would allow tools to check (if they cared) to see if the first value was really a generated map, or if it was a test.chuck for-bindings map. Also, I think it would make sense to then put let-bound values into their own map.

gfredericks commented 9 years ago

This is an interesting point; but even if we have tools processing things like this, will it make much of a difference if the tool knows for sure what the value represents? The programmer should be looking at it either way, and will know how to interpret, right?

The idea of a :let key is kind of cool though and could be used in either case.

dparis commented 9 years ago

I was thinking about, for instance, if I wanted to make a lein task that my CI server could run which looks at the test output and generates some kind of report artifact on any failure cases. Some tests might use the test.check for-all, some might use the enhanced version in test.chuck. It'd be nice to have a single function to process the value at :fail and :shrunk, and let's say my test used a map and a keyword as generated values. The normal test.check for-all might output:

:fail [{:k1 "v1", :k2 "v2"} :k3]

The test.chuck for-all would output:

:fail [{::for-bindings true, m {:k1 "v1", :k2 "v2"}, v :k3}]

Without the ::for-bindings key in the latter case, my reporting function wouldn't know to treat all of the key/val pairs in the map as the set of bindings, rather than just the generated value in a 1-tuple.

A minor point, perhaps, but it would smooth integration of the two for-all variants. Certainly, in any case, even just the output you've already got in the new commit is way more useful. :smile:

gfredericks commented 9 years ago

Oh so what about if I stuck a flag in the metadata so they're distinguishable if you want but aren't cluttering up the output when you don't?

(with-meta {'count 3, 'xs [1 2 3]} {::for-bindings true})

dparis commented 9 years ago

That'd be great!

gfredericks commented 9 years ago

Fixed in 0.1.14.

gfredericks / test.chuck

Reported args for failure case shows functions instead of values when using for-all #5