Closed vch9 closed 2 years ago
It comes from:
val line_counts :
filename:string -> points:int list -> counts:int array -> int option list
(** Computes the visited lines for [~filename]. For each line, returns either:
- [None], if there is no point on the line.
- [Some count], where [count] is the number of visits to the least-visited
point on the line. The count may be zero.
- [Some count], where [count] is the number of visits to the least-visited point on the line. The count may be zero.
I believe it would make sense to duplicate this function by changing to:
- [Some count], where [count] is the sum of visits to each points on the line. The count may be zero.
The comment you added on this function makes me wonder if what I called a bug
is ok for you point of view:
This function is "lossy," as OCaml code often has multiple points on one line. However, this is a necessary conversion for line-based coverage report formats, such as Coveralls and Cobertura.
I thought it's better to take the min rather than the sum, as it gives a conservative picture of the coverage rather than an optimistic one.
The real underlying issue is that coverage for an expression-based grammar for OCaml doesn't translate well to line coverage, which is mostly useful for statement-based grammars. In my own projects, I never really relied on Coveralls, Codecov, or other coverage hosting services that use line coverage. I just used them to get an overall approximate picture, and for having their bots post comments, but for precise coverage I always used the actual Bisect HTML report.
Maybe there is reason for having a hosted service that can show expression coverage for FP. I've thought about that several times, too.
If I did rely on line coverage, I'd probably write my code so that lines tended to have (EDIT: at most) one point each.
Are you sure you want the sum here rather than the min? It will create a misleading line coverage report in this and many other cases.
hits
for line4
should be the sum of hits for every point in the line.
Specifically I don't quite see the argument for this. The issue is that the Cobertura format assumes that each line has only one point. That's often not true for OCaml code, and I don't immediately see how it makes sense to collapse all the points into effectively one point when they have independent visitation counts.
To see one way this is absurd, note that if a program visits the line
| n -> if n = 2 then "2" else "n"
only once, it will enter the match case once, and then go to one of the branches once. If you sum those counts, you get 2 visits. But there was only 1 visit, just it first visited one point and then continued by visiting another. It doesn't make sense to report that as two visits to the line — there really was only one visit.
Of course, in this example, the program did not visit one of the branches, so the line coverage will be reported as zero by the current logic anyway. However, the above is at least an argument for using max rather than sum, for an optimistic line coverage report.
In light of the above, maybe it makes sense to add some kind of --optimistic
flag to Coveralls and Cobertura reports, that will replace min by max when calculating visits of lines.
The issue is that the Cobertura format assumes that each line has only one point. That's often not true for OCaml code, and I don't immediately see how it makes sense to collapse all the points into effectively one point when they have independent visitation counts.
Yes, you're absolutely right. The issue I have with the "least visited point" is you can have line=X hits=0
when you know it has been hit at least once, and it has confused users in our project already.
it will enter the match case once, and then go to one of the branches once. If you sum those counts, you get 2 visits. But there was only 1 visit, just it first visited one point and then continued by visiting another. It doesn't make sense to report that as two visits to the line — there really was only one visit.
Yes, I believe this totally block the sum
I am proposing as it will get even more confusing for users. People will not know that the match case (or ifs) is a point in the report.
However, the above is at least an argument for using max rather than sum, for an optimistic line coverage report.
Yes max
should be more suitable. However, you pretty much convinced me that the least visited point is a better idea. The best answer I can give to "confused" users of the cobertura report would be "use the html report" (which we also provide).
Closing this as I agree with your point, and I would rather use the current state instead of the --optimistic
flag. Re-open if you think we need this flag and I'll implement it.
Thanks! I don't think we need it. We can leave this closed until/unless future or experience or someone else can make a case for it :)
We noticed a strange behavior in the Cobertura report (made by myself obviously).
( test ) let () = assert (Foo.Main.to_string 0 = "0"); assert (Foo.Main.to_string 5 = "n"); print_string "Test done"
Line
4
hashit = 0
when it has been hit.hits
for line4
should be the sum of hits for every point in the line.