Closed: dorranh closed this issue 2 years ago.
Thanks for re-opening @aantron. I used a poor choice of phrasing in our unrelated PR :sweat_smile:
Yep, caught that :P The issue got closed right as I was getting to work on it :D
This behavior of Bisect is as expected. The unvisited points you are seeing are the out-edges of the application expressions, which would be visited on normal return from the functions, and, of course, these functions never return. You can suppress these points (somewhat imprecisely) by annotating the functions inside the applications with `[@coverage off]`:

```ocaml
(failwith_error_code [@coverage off]) error_ZeroInput
```
This doesn't happen with `failwith`, because Bisect_ppx automatically suppresses instrumentation of the out-edge of `failwith` applications, as a convenience, so that people don't have to insert `[@coverage off]` themselves.
If you use functions like this pervasively and don't want to insert `[@coverage off]` at every call site, it may be possible to add a command-line option that gives Bisect additional identifiers for which to suppress out-edge instrumentation.
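For readers following along, here is a minimal sketch of the pattern being discussed. The names `failwith_error_code` and `error_ZeroInput` are taken from the snippet above; the definitions themselves are my assumption, not the reporter's actual code:

```ocaml
(* Assumed helper: wraps failwith, so it never returns normally. *)
let failwith_error_code (code : string) : 'a =
  failwith ("error code: " ^ code)

let error_ZeroInput = "ZeroInput"

let check_input (x : int) : int =
  if x = 0 then
    (* Annotating the applied function suppresses (somewhat imprecisely, per the
       note above) the out-edge point that would otherwise show as never visited. *)
    (failwith_error_code [@coverage off]) error_ZeroInput
  else
    x
```

As noted above, plain `failwith` applications get this treatment automatically, so no annotation is needed there.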
@aantron thank you for the explanation, it's been really helpful in understanding the issue!
I spent more time digging into this and I noticed that `if-then-else` and `match-with` get colored quite differently:
[screenshot comparison: if-then-else coloring vs. match-with coloring]
Since `failwith` never returns, I understand why we don't mark it green when it's on the right-hand side of a branch (whether that is an `if-then-else` branch or a `match-with` branch). However, with `match-with` we can clearly see which branches/clauses are matched in the tests, which is just as good (if we know that the pattern is matched, we also know that the right-hand side has been executed):
[screenshot comparison: match-with, with pragmas, with failing tests vs. match-with, with pragmas, without failing tests]
With `if-then-else` it's unclear whether we ever reached the `else` clause, because `else` is never colored. In fact, in this minimal example, switching the `if-then-else` to a `match-with` gives the exact same coverage percentage (77.78%). However, in our own codebase, changing from `if-then-else` to `match-with` in a function changes the coverage from 100% to 96.97% (correctly, as these branches are indeed not covered by our tests):
[screenshot comparison: if-then-else (reported 100%) vs. match-with (reported 96.97%)]
This is a bug, right?
> I spent more time digging into this and I noticed that `if-then-else` and `match-with` get colored quite differently:
In the first pair of images, I don't see the difference, besides that for `match`, there is a pattern to place the mark on, so Bisect puts the marks on patterns, while in the `if`-`else`, it has to put the mark on the first character of subexpressions.
I'm not sure what the middle two images are comparing (probably due to not being familiar with the tests).
In the last pair of images, in the `if` case, there should be a mark on line 269 on the `Ligo.failwith`, but there isn't. I wonder if that's some sort of an interaction between `if` not having a pattern to put a mark on, and special treatment of `failwith` (though I would have expected the `Ligo.` qualifier to matter). That does indeed look like a bug. Is that what you are referring to?
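To make the mark-placement distinction concrete, here is a hypothetical reduction (not taken from the project); the comments paraphrase the explanation above:

```ocaml
(* With match, each case has a pattern, and the mark is placed on the pattern,
   so a visited case stays visible even when its right-hand side never returns. *)
let classify x =
  match x with
  | 0 -> failwith "zero input"
  | n -> n / 2

(* With if/else there is no pattern, so the mark goes on the first character
   of each branch subexpression instead. *)
let classify' x =
  if x = 0 then failwith "zero input"
  else x / 2
```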
> In the first pair of images, I don't see the difference, besides that for `match`, there is a pattern to place the mark on, so Bisect puts the marks on patterns, while in the `if`-`else`, it has to put the mark on the first character of subexpressions.
>
> I'm not sure what the middle two images are comparing (probably due to not being familiar with the tests).
Sorry, I should have been more clear; indeed that was the point I was trying to make. As far as I can tell, with `match-with`, marks are placed on both the patterns and the right-hand sides, which gives pretty accurate results even if we have `[@coverage off]` on the `failwith`s. That's more or less what I was trying to point out with the "Match-with, with pragmas, without failing tests" figure.
With `if-then-else` the markers are placed on the expression (there are no patterns, but I guess we could mark the `else` itself?), which means that we get no marking if the expression in the `else` is a `failwith`.
> In the last pair of images, in the `if` case, there should be a mark on line 269 on the `Ligo.failwith`, but there isn't. I wonder if that's some sort of an interaction between `if` not having a pattern to put a mark on, and special treatment of `failwith` (though I would have expected the `Ligo.` qualifier to matter). That does indeed look like a bug. Is that what you are referring to?
Exactly! My expectation is that the last two figures should have the exact same coverage percentage reported.
PS. Sorry that the images don't refer to a minimal example; I don't have one yet, but I will try to craft one.
@aantron I think I finally have a minimal example :tada: Apparently the issue does not have to do with whether we use `failwith` directly or not, but it looks like a weird interaction between `if-then-else` and the expressions in the branches having a type signature on them (I really didn't see that coming).

So, `example.ml` contains the following:
```ocaml
let half_positive_integer (x : int) =
  let foobar =
    if x > 0
    then
      x / 2
    else
      (* swap these two lines to toggle the type signature on the else branch *)
      (failwith "some error")            (* without type sig *)
      (* (failwith "some error" : int)      with type sig *)
  in
  foobar
```
and the test suite (`tests.ml`) covers only the `if-then` part:
```ocaml
open OUnit2
open Example

let suite =
  "Tests" >::: [
    "Y" >:: (fun _ ->
      assert_equal
        50
        (half_positive_integer 101)
    );
  ]

let () = run_test_tt_main suite
```
Depending on whether the expression in the else branch has a type signature on it or not, we get different results:
[screenshot comparison: without signature (correct: 66.67% reported) vs. with signature (wrong: 100.00% reported)]
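For anyone trying to reproduce this: the thread does not show the build setup, so the dune file and commands below are my assumption of a standard Bisect_ppx workflow for this layout, not something taken from the reporter's repository.

```
; dune (assumed layout: example.ml built as a library, tests.ml as its test)
(library
 (name example)
 (modules example)
 (instrumentation
  (backend bisect_ppx)))

(test
 (name tests)
 (modules tests)
 (libraries ounit2 example))

; then, roughly:
;   dune runtest --instrument-with bisect_ppx --force
;   bisect-ppx-report html
;   bisect-ppx-report summary
```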
Let me know if that helps!
> Let me know if that helps!
Yes! Sorry about the delay. This is very helpful!
I made an even smaller case:

```ocaml
let () =
  if true then
    (() : unit)
  else
    ()
```
...and found that this is due to Bisect's interaction with an apparent AST bug in the compiler, which I asked about in https://github.com/ocaml/ocaml/issues/10554. I suggest we see what that discussion yields. Bisect might need a workaround for the bug. Alternatively, if the bug is fixed upstream and you are satisfied with having this be slightly broken until the next OCaml release, we can just wait for that.
Let's keep this issue open to track this on the Bisect side, and in case I am wrong about this being a problem in OCaml.
Thanks for following up on this!
> ...and found that this is due to Bisect's interaction with an apparent AST bug in the compiler, which I asked about in ocaml/ocaml#10554. I suggest we see what that discussion yields. Bisect might need a workaround for the bug. Alternatively, if the bug is fixed upstream and you are satisfied with having this be slightly broken until the next OCaml release, we can just wait for that.
>
> Let's keep this issue open to track this on the Bisect side, and in case I am wrong about this being a problem in OCaml.
Yeah, I agree. For the project we're working on we'd definitely like to hit 100% test coverage eventually, but in the meantime there's other stuff to work on. Let's wait and see :slightly_smiling_face:
It looks like the upstream OCaml fix has been merged and will be included in the next OCaml releases (5.00 and 4.14, as I understand it). Closing this issue. Thanks for reporting!
Hi @aantron, we've been using `bisect_ppx` 2.6.1 for an OCaml project and wanted to get your feedback on a behavior we've observed.

In our codebase, we define a helper function that internally calls `Stdlib.failwith`, which we use to throw exceptions. When calling this function within the scope of a function definition, the coverage report generated by `bisect_ppx` correctly shows the call counts for this function and reports the line as covered. However, when calling the same function from within a let binding, `bisect_ppx` reports that the function is never called.

We've prepared a minimal example here, most of which I've pasted below. Is there a way to get `bisect_ppx` to report coverage in this situation? Replacing `failwith_error_code` with direct calls to `failwith` in the below example marks these clauses green.

Thanks in advance for your help!
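The example code mentioned here ("most of which I've pasted below") did not survive in this transcript. As a stand-in, here is a rough reconstruction of the two call contexts being contrasted; only the name `failwith_error_code` comes from the discussion, and whether these exact shapes reproduce the reported difference is an assumption:

```ocaml
(* Assumed helper: wraps Stdlib.failwith and never returns. *)
let failwith_error_code (code : string) : 'a =
  Stdlib.failwith ("error code: " ^ code)

(* Call made directly in the body of a function definition:
   reported as covered in the behavior described above. *)
let check_direct (x : int) : int =
  if x < 0 then failwith_error_code "NegativeInput" else x

(* The same call made inside a let binding:
   reported as never called in the behavior described above. *)
let check_via_let (x : int) : int =
  let result = if x < 0 then failwith_error_code "NegativeInput" else x in
  result
```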