Untested C++ files don't impact code coverage

tsr-boxbot commented 1 year ago

Description of the bug:

When I run bazel coverage, I expect to see 0% for files that aren't hooked into any tests. However, when I run genhtml on bazel-out/_coverage/_coverage_report.dat, those files are omitted from the report entirely.

This seems like a bug to me. Say we're in a code repository with thousands of files. Is a user expected to audit the coverage data to verify a particular file is included in the list?

I expect that when a user adds a file without adding tests, the total number of lines in the coverage report increases while the number of hit lines does not. That lowers the percentage and signals to the user that they need to add tests.
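To make the expected arithmetic concrete, here is a minimal sketch in Python that sums LCOV-style `DA:<line>,<hits>` records. The record contents are made up for illustration (they are not taken from the linked repository), and the parser only looks at `DA` lines.

```python
def line_rate(lcov_text):
    """Return (hit_lines, total_lines) summed over all DA records."""
    hit = total = 0
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("DA:"):
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            total += 1
            if int(fields[1]) > 0:
                hit += 1
    return hit, total


# Hypothetical report: add.cpp is fully exercised by its test.
TESTED_ONLY = """SF:add.cpp
DA:1,1
DA:2,1
DA:3,1
end_of_record
"""

# Hypothetical zero-hit record for an untested file, as I would expect to see it.
UNTESTED = """SF:mul.cpp
DA:1,0
DA:2,0
DA:3,0
end_of_record
"""

if __name__ == "__main__":
    for label, text in (("reported", TESTED_ONLY), ("expected", TESTED_ONLY + UNTESTED)):
        hit, total = line_rate(text)
        print(f"{label}: {hit} of {total} lines ({100.0 * hit / total:.1f}%)")
```

With these made-up records the first case comes out at 3 of 3 lines and the second at 3 of 6, which is the drop I would expect the report to show.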

Which category does this issue belong to?

C++/Objective-C Rules, CLI

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I've created a very small version of the project with a simple cpp file and test to demonstrate the issue; you can find the code at: https://github.com/tsr-boxbot/bazel_code_coverage_problems/tree/42a40294a0f0d24a5efa88b649da69f422b66d5c . A key thing to note here is that we are using clang-13 via a toolchain spec.

$ bazel-6.3.2 coverage //... --combined_report=lcov -c dbg --config=clang_config --instrument_test_targets --instrumentation_filter='^//'
INFO: Build option --instrumentation_filter has changed, discarding analysis cache.
INFO: Analyzed 7 targets (47 packages loaded, 649 targets configured).
INFO: Found 6 targets and 1 test target...
INFO: LCOV coverage report is located at /home/tyler/.cache/bazel/_bazel_tyler/af64a4260e07ac56294cb50354aadc2f/execroot/test/bazel-out/_coverage/_coverage_report.dat
 and execpath is bazel-out/_coverage/_coverage_report.dat
INFO: From Coverage report generation:
Aug 28, 2023 4:46:25 PM com.google.devtools.coverageoutputgenerator.Main getTracefiles
INFO: Found 1 tracefiles.
Aug 28, 2023 4:46:25 PM com.google.devtools.coverageoutputgenerator.Main parseFilesSequentially
INFO: Parsing file bazel-out/k8-dbg/testlogs/add_test/coverage.dat
Aug 28, 2023 4:46:25 PM com.google.devtools.coverageoutputgenerator.Main getGcovInfoFiles
INFO: No gcov info file found.
Aug 28, 2023 4:46:25 PM com.google.devtools.coverageoutputgenerator.Main getGcovJsonInfoFiles
INFO: No gcov json file found.
Aug 28, 2023 4:46:25 PM com.google.devtools.coverageoutputgenerator.Main getProfdataFileOrNull
INFO: No .profdata file found.
INFO: Elapsed time: 4.618s, Critical Path: 4.05s
INFO: 23 processes: 3 internal, 20 linux-sandbox.
INFO: Build completed successfully, 23 total actions
//:add_test                                                              PASSED in 0.3s
  /home/tyler/.cache/bazel/_bazel_tyler/af64a4260e07ac56294cb50354aadc2f/execroot/test/bazel-out/k8-dbg/testlogs/add_test/coverage.dat

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
$ genhtml bazel-out/_coverage/_coverage_report.dat
Reading data file bazel-out/_coverage/_coverage_report.dat
Resolved relative source file path "add.cpp" with CWD to "/home/tyler/git/code_coverage_test/2/bazel_code_coverage_problems/add.cpp".
Found 2 entries.
Found common filename prefix "/home/tyler/git/code_coverage_test/2"
Writing .css and .png files.
Generating output.
Processing file bazel_code_coverage_problems/add.test.cpp
Processing file bazel_code_coverage_problems/add.cpp
Writing directory view page.
Overall coverage rate:
  lines......: 100.0% (6 of 6 lines)
  functions..: 100.0% (7 of 7 functions)

My expectation is that I should see lower than 100% coverage for both lines and functions, because mul.cpp's lines and function(s) are not tested.

Which operating system are you running Bazel on?

Ubuntu 22.04

What is the output of `bazel info release`?

release 6.3.2

If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

N/A

What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD` ?

git@github.com:tsr-boxbot/bazel_code_coverage_problems.git
42a40294a0f0d24a5efa88b649da69f422b66d5c
42a40294a0f0d24a5efa88b649da69f422b66d5c

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

Unsure

Have you found anything relevant by searching the web?

I found several unanswered Stack Overflow questions about this same issue:

https://stackoverflow.com/questions/72399451/how-to-include-all-targets-in-bazel-coverage
https://stackoverflow.com/questions/75692853/bazel-coverage-skips-source-files-with-no-tests-in-coverage-dat-report
https://stackoverflow.com/questions/74309049/bazel-coverage-results-are-incomplete

Possibly related, but not definite dupes of my question:

https://stackoverflow.com/questions/46447218/using-bazel-to-generate-coverage-report
https://stackoverflow.com/questions/46371795/how-to-merge-coverage-from-multiple-dat-files-in-bazel

Any other information, logs, or outputs that you want to share?

I could very well be using --instrumentation_filter incorrectly. My expectation is to use a filter that catches all non-external code in the repository that is a part of a cc_library or cc_binary. If I am using --instrumentation_filter incorrectly then I will file a documentation issue separately. Some of the Stack Overflow questions above hint that instrumentation_filter is the answer, but so far I have been unable to solve my problem with it.

c-mita commented 1 year ago

From the POV of Bazel, coverage collection is a special case of a test execution. That means that during a coverage run, Bazel is only really aware of the dependencies of the test in question.

"Baseline coverage" is the feature where coverage reports are generated for files as though they were never executed. Bazel has very limited support for this. (Technically you can do bazel build //foo:foo_main --collect_code_coverage and Bazel will generate "baseline_coverage.dat" files for every target in the dependency graph of foo_main, but these will not contain any line data, limiting their usefulness.)

The lack of baseline coverage is mentioned here, although I haven't got a concrete proposal for how to implement it yet (the technical side I think should be easy, it's mostly a question of UI): https://github.com/bazelbuild/bazel/discussions/19144

I could very well be using --instrumentation_filter incorrectly. My expectation is to use a filter that catches all non-external code in the repository that is a part of a cc_library or cc_binary. If I am using --instrumentation_filter incorrectly then I will file a documentation issue separately. Some of the Stack Overflow questions above hint that instrumentation_filter is the answer, but so far I have been unable to solve my problem with it.

The instrumentation filter is simply a filter on targets within a build that should be built with code coverage enabled. Because your added mul target is not a dependency of add_test, it won't ever be included.
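A rough model of that behaviour, sketched in Python: the filter is applied only to targets that are already in the test's dependency graph, so no pattern can pull in a target from outside it. The labels and the dependency set are assumptions for illustration (loosely based on the add/mul example), and treating the filter as a single unanchored regex is a simplification, not a description of Bazel's actual implementation.

```python
import re


def instrumented_targets(test_deps, instrumentation_filter):
    """Targets that would be built with coverage: test dependencies matching the filter."""
    pattern = re.compile(instrumentation_filter)
    return sorted(label for label in test_deps if pattern.search(label))


# Hypothetical labels: //:mul exists in the workspace, but nothing under test depends on it.
WORKSPACE_TARGETS = {"//:add", "//:add_test", "//:mul"}
ADD_TEST_DEPS = {"//:add", "//:add_test"}

if __name__ == "__main__":
    print(instrumented_targets(ADD_TEST_DEPS, r"^//"))
    # //:mul matches the pattern but is never a candidate, so it stays uninstrumented.
    print(sorted(WORKSPACE_TARGETS - ADD_TEST_DEPS))
```

In this model, widening the pattern changes nothing for //:mul; only making it a dependency of some test would.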

tsr-boxbot commented 1 year ago

@c-mita thanks for the reply! I shot a comment over in the discussion thread voicing my observation.

Regarding:

(Technically you can do bazel build //foo:foo_main --collect_code_coverage and Bazel will generate "baseline_coverage.dat" files for every target in the dependency graph of foo_main, but these will not contain any line data, limiting their usefulness.)

Is it possible to take all the not-so-useful baseline_coverage.dat files and pass them on the command line alongside the full coverage file? I don't know if genhtml is smart enough to figure it out. I'll give this a shot myself and report back.

tsr-boxbot commented 1 year ago

I tried bazel-6.3.2 build -c dbg --config=clang_config --collect_code_coverage //... in my code base; however, it did not output empty baseline_coverage.dat files.

c-mita commented 1 year ago

They should be under bazel-testlogs. There should be one baseline_coverage.dat file for each top-level target built.

$ bazel build --collect_code_coverage //foobar:foo //foobar:bar
...
$ tree bazel-testlogs
bazel-testlogs
`-- foobar
    |-- bar
    |   `-- baseline_coverage.dat
    `-- foo
        `-- baseline_coverage.dat
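
If it helps to check a larger workspace, the same lookup can be sketched in Python. This assumes the files are named baseline_coverage.dat and that the script runs from the workspace root, where the bazel-testlogs symlink lives; the function name is just for illustration.

```python
import os


def find_baseline_files(testlogs_root):
    """Return sorted paths of every baseline_coverage.dat under testlogs_root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(testlogs_root):
        if "baseline_coverage.dat" in filenames:
            found.append(os.path.join(dirpath, "baseline_coverage.dat"))
    return sorted(found)


if __name__ == "__main__":
    for path in find_baseline_files("bazel-testlogs"):
        print(path)
```

An empty result would suggest the files were not produced for that build, or that they live somewhere other than where this sketch looks.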

The content of foobar/foo/baseline_coverage.dat should be:

SF:[file_name]
end_of_record

There should be one such record for every file_name in instrumented_files, i.e. everything from InstrumentedFilesInfo in your transitive dependencies that's matched by the instrumentation filter.
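
Since those records carry file names but no line data, one thing they can still be used for is spotting files that never made it into the combined report. A minimal Python sketch of that comparison follows; the two inputs are made-up strings in the shape described above, and the helper names are hypothetical.

```python
def source_files(lcov_text):
    """Return the set of paths named by SF: records in LCOV-format text."""
    return {
        line.strip()[3:]
        for line in lcov_text.splitlines()
        if line.strip().startswith("SF:")
    }


def missing_from_report(baseline_text, report_text):
    """Files listed in the baseline that never appear in the coverage report."""
    return sorted(source_files(baseline_text) - source_files(report_text))


# Hypothetical inputs: a baseline naming two files, and a report covering only one.
BASELINE = "SF:add.cpp\nend_of_record\nSF:mul.cpp\nend_of_record\n"
REPORT = "SF:add.cpp\nDA:1,1\nLF:1\nLH:1\nend_of_record\n"

if __name__ == "__main__":
    print(missing_from_report(BASELINE, REPORT))  # prints ['mul.cpp']
```

This only lists the gaps; it cannot turn them into 0% entries, because the baseline records do not say how many lines each file has.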

I believe the logic to set up a default value for --instrumentation_filter doesn't run when you do bazel build, so by default it will match everything.
