Coverage statistics based on every source file

vch9 commented 2 years ago

If I understand correctly, the coverage (%) is based on instrumented and visited files during tests' executions. Lets say I have:

- src
+-- main.ml (10 lines)
+-- foo.ml (10 lines)
- test
+-- test_foo.ml

Where test_foo covers only the 10 lines in foo.ml. I would like to have:

$ bisect-ppx-report summary
> 50.0 %

Have you ever considered having this? This might be too complicated on the current state of bisect?

aantron commented 2 years ago

If I understand correctly, the coverage (%) is based on instrumented and visited files during tests' executions. Lets say I have:

Almost correct — the issue is specifically that main.ml is (probably) not linked into test_foo.exe, so it never has a chance to register itself and its points with the runtime, and thus they never get dumped.

This is a long-time limitation of Bisect_ppx. See https://github.com/aantron/bisect_ppx/issues/136 and https://github.com/aantron/bisect_ppx/issues/207.

A partial workaround that works for my relatively small projects is to use the --expect and --do-not-expect reporter options to list the source files I expect to be included in my report. These options can take a directory, list it, and complain if some of the files in that directory are missing. The user can then figure out why the files aren't being linked in, or --do-not-expect them if there is no reasonable way to link them.

       --expect=PATH
           Check that the files at PATH are included in the coverage report.
           This option can be given multiple times. If PATH ends with a path
           separator (slash), it is treated as a directory name. The reporter
           scans the directory recursively, and expects all files in the
           directory to appear in the report. If PATH does not end with a path
           separator, it is treated as the name of a single file, and the
           reporter expects that file to appear in the report. In both cases,
           files expected are limited to those with extensions .ml, .re, .mll,
           and .mly. When matching files, extensions are stripped, including
           nested .cppo extensions.

       --do-not-expect=PATH
           Excludes files from those specified with --expect. This option can
           be given multiple times. If PATH ends with a path separator
           (slash), it is treated as a directory name. All files found
           recursively in the directory are then not required to appear in the
           report. If PATH does not end with a path separator, it is treated
           as the name of a single file, and that file is not required to
           appear in the report.

aantron commented 2 years ago

This is a long-time limitation of Bisect_ppx.

To be a bit more precise, this might be a limiation of build systems, Bisect, or their integration. It's not clear how Bisect could know the set of source files that are present in the project in general, given the complexities of generated files, etc., and everything else the build system knows and Bisect doesn't.

arvidj commented 2 years ago

Slightly related: I recently wanted to compare the coverage of two traces (trace A and trace B) over the same source code. For the reasons discussed above, the traces did not contain the same set of points. To make the coverage statistics comparable I did the following:

Create a null trace 0 by merging A and B and setting the visited count to 0 at each point
Then I created a report for merge(0, A) and one report for merge(0, B)

This ensures that both reports have the same set of points.

This is a bit of a hack, but works well enough for my use-case. To create the null trace, I added an flag --null to the merge command. If you think this has wider application, I can make a PR.

aantron commented 2 years ago

Would the PR contain the --null flag?

I'm not aware of wider application of this (it might arise again as Bisect gets more and more very large project users). However, we can still merge it if you'd like to avoid maintaining a fork (unless you have to anyway, for other reasons).

arvidj commented 2 years ago

Would the PR contain the --null flag?

Yes, that's what I had in mind. Another option to the null flag to merge would be to have another option --points-from to html and summary such that you can do:

bisect-ppx-report html --points-from full.coverage subset.coverage

This would add all points from full.coverage but set their visited count to 0. Then, it would take the visited count from subset.coverage.

So in the example in my previous comment (with the reports A and B), one could write:

bisect-ppx-report merge AB.coverage A.coverage B.coverage
bisect-ppx-report html --points-from AB.coverage A.coverage -o _report_A
bisect-ppx-report html --points-from AB.coverage B.coverage -o _report_B

I don't think we need to have this right now in order to avoid maintaining a fork. I think I need to experiment more on our side to understand to which degree we want to have this feature.

aantron commented 1 year ago

I am going to close this issue for now, as it remains unclear how to best address it. So far, I am only planning to add warnings/hints about this in more places where Bisect displays messages to users and in the README (https://github.com/aantron/bisect_ppx/issues/408#issuecomment-1506754800).

aantron / bisect_ppx

Coverage statistics based on every source file #391