Closed: gnzlbg closed this issue 3 years ago
I know I'd love to see this work! Do you know what hoops a prototype would have to jump through? Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?
A prototype would need to generate the LLVM IR for the instrumentation; for that, it needs to generate a mapping between source ranges and performance counters (I would start with just functions). Then it needs to generate and embed the IR in the object files. It would be a good idea to look at how Clang then outputs this into a file, to use the same format (which is documented), and to test that we can read it with llvm-cov on Linux and macOS (I don't know about Windows support, but since Clang is gaining full Windows support, if it's not there it will be there soon).
I think that would be enough for a prototype; from there we can move on to generating instrumentation for conditionals (match/loop/jumps/ifs...) and blocks (how many iterations a loop body executed). We could go down to expressions, but then it would be wise to offer users a way to control how much instrumentation is generated: functions, branches, and loops (which are branches) are typically enough. We should then support skipped regions (for conditional compilation), expansions (for macros, maybe plugins), and dealing with unreachable code (there is a performance counter that is always zero for that).
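As a rough illustration of the granularity described above, here is a hypothetical sketch of where function- and branch-level counters would sit in Rust source (the counter numbers and their placement are invented for illustration, not what rustc would actually emit):

```rust
// Hypothetical counter placement for function- and branch-level coverage;
// the numbering scheme here is invented for illustration.
fn classify(n: i32) -> &'static str {
    // counter #0: function entry
    if n > 0 {
        // counter #1: "then" branch
        "positive"
    } else {
        // no dedicated counter required: the count is derivable as #0 - #1
        "non-positive"
    }
}

fn main() {
    assert_eq!(classify(5), "positive");
    assert_eq!(classify(-1), "non-positive");
    println!("ok");
}
```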
The meat of the work is in the source-to-performance-counter mapping, generating the IR, and generating the conforming output. llvm-cov works at least on macOS and Linux, but IIRC clang has options to generate coverage information even for particular gcov versions. The source code for all this is available, so taking a look wouldn't hurt.
Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?
It works on Mac and Linux; I don't know about Windows. Even if it doesn't work everywhere, this is probably the way to make it work on as many platforms as possible anyway. Clang support on Windows is pretty good already and it is only getting better.
@sujayakar's done some awesome work to prototype this on OSX at least, discovering:

- `-C passes=insert-gcov-profiling` adds the instrumentation pass
- `-C link-args=-lclang_rt.profile_osx` links the profiling runtime, and that library can be located relative to `xcrun --find clang` in a path like `usr/lib/clang/7.3.0/lib/darwin/libclang_rt.profile_osx.a`
- running the instrumented binary produces a `.gcda` file, and there's an `llvm-cov` tool that can be run over the output file to generate information

That should at least get code coverage working on OSX in some respect! This is also a pretty strong case that it may not be too hard to actually get this working for all platforms if LLVM's standard tool suite "just works".
@alexcrichton That's gcov coverage, which is completely different from what @gnzlbg meant in this issue.
The correct pass name is `instrprof`, implemented in lib/Transforms/Instrumentation/InstrProfiling.cpp. See also LDC's LLVM profiling instrumentation.
@sanxiyn is correct.
I just want to add that I don't think we should implement clang-like gcov coverage generation.
LLVM format can be used for many more things (e.g. profile guided optimizations), and LLVM's llvm-cov tool can generate gcov files from it.
afaik, there's a PR open for this: https://github.com/rust-lang/rust/pull/38608
As far as I can tell, #38608 is still gcov coverage and not instrprof.
I believe this is now implemented as `-Z profile` and tracked at https://github.com/rust-lang/rust/issues/42524, so closing in favor of that.
This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.
Would LLVM coverage (`instrprof`) enable usable code coverage of generics-heavy code? None of the existing code-coverage solutions work well with heavy use of generics.
My project (a gameboy emulator) makes very heavy usage of generics and it'd be wonderful to get code coverage working on integration tests but I'm not really sure how.
I think good code coverage support is the single most important piece of tooling missing in Rust.
When I run `cargo test` and everything is green, I get a good feeling, but this feeling actually means nothing at all. To make `cargo test` mean something, it would need to at least tell me how many code-paths of my application have been exercised.
This information does not mean much either (just because a code-path was exercised does not mean that it was exercised for all inputs), but it would mean more than nothing, and it would make writing tests and assessing the value of tests way easier.
IMO adding this kind of capability to `cargo test` (and yes, it should be part of `cargo test`; people should always know their code coverage) would be extremely valuable.
There is only one other piece of tooling infrastructure that would come close to this in value, and that would be an undefined-behavior detector for all Rust code.
This might sound like a rant, but I just want to raise awareness that this is a very important issue at least for me (and, from the other issues being filed, for others as well) because it directly affects the correctness of Rust programs (we don't want them to only be memory safe, but also to have correct logic).
Maybe if `rustfmt`, clippy, and the RLS advance enough during the impl period, the Rust tools team could prioritize these two tools next?
If I don't have precise auto-completion information or my source code is not perfectly formatted, well, I can live with that. But if my program has undefined behavior and I am not catching it because I am only testing 40% of my code-paths then... I am screwed. Does this make sense?
This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.
Hi, can anyone explain what the difference is between "LLVM coverage" (`profile-instr-generate`) and "gcov" (`insert-gcov-profiling`)? I don't know the detailed implementations of these two passes, but will `profile-instr-generate` be more accurate than `insert-gcov-profiling`?
I found @kennytm's answer to my question. Let me put it here in case someone like me gets lost in the overwhelming references: https://github.com/rust-lang/rust/pull/44673#issuecomment-330435323
I've done a bit of prototyping on this to see what would be involved in using `instrprof` for coverage, and it's revealed (to me) that the coverage aspect of this feature is secondary to the "Profile Guided Optimization" feature that `instrprof` was originally written for.
So, here's a vague "plan of action" (with progress so far) to enable PGO and LLVM-assisted coverage:
This is currently implemented in its most basic form as `-Zpgo-gen`, which adds the `-pgo-instr-gen` and `instrprof` passes to LLVM to add instrumentation of all functions and basic blocks (loop bodies, if statements, etc.). There may be other places where it's useful to add instrumentation (e.g. adding vtable-downcast counts to enable `-pgo-icall-prom` passes, which turn dynamic dispatch into static dispatch in fast paths).
:+1:
With the above work complete, the generated binaries will count (in-memory) each time an instrumented code path is hit. To extract this information, the profile-rt library should walk the various `__llvm_prf_*` sections of the binary/memory and dump the values out to a file (Clang dumps to `default.profraw`). The format of this file is (as far as I can see) "documented" in LLVM/Clang's codebase. On the other hand, the current implementation of rustc's profile-rt library is taken wholesale from LLVM and thus already supports this function.
:+1:
With a `default.profraw`, one can create a `profdata` file (using `llvm-profdata merge`). This file can be passed to the `pgo-instr-use` (and other) passes to enable optimisations based on gathered profile information. Again, this is supported in nightly rustc today (through `-Zpgo-use`).
:+1:
LLVM's coverage tooling can use a "Coverage Map" to map between instrumentation counters and blocks of code. The format of the map is well documented at http://llvm.org/docs/CoverageMappingFormat.html and there's code in LLVM (http://llvm.org/doxygen/classllvm_1_1coverage_1_1CoverageMappingWriter.html) for creating the magic format. Note that this may require rustc manually inserting instrumentation (rather than using `-pgo-instr-gen`) since it needs to know which counter corresponds to which code block. The code in Clang (https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CodeGenPGO.cpp) that does this is probably a good place to start.
Note that the LLVM coverage map format supports various useful features, not well handled in GCOV-style coverage:
❌
Creating a `cargo coverage` subcommand (or, possibly better, just creating coverage from `cargo test`) by adding profiling to the `test` cargo profile and running the resulting profile data through LLVM's tooling would be really cool. Especially if the output format is compatible with tools like Coveralls.io and can produce useful graphical outputs for analysis afterwards. Likely requires stabilising `-Zpgo-gen` or at least exposing enough of it that cargo can create a profiled build.
:x:
There are some bits of Rust code for which coverage is non-obvious:

* How do we handle non-instantiated generics? If a generic is never instantiated then no code is created for it, so there are no counters for it. I don't think llvm-cov knows how to handle this in the coverage map at the moment, but it might make sense for it to do so to support this for C++ templates too?
* Similarly, should separate instantiations of the same generic be covered separately? If I test `Foo<i32>` in my UTs but use `Foo<String>` in my real code, have I really tested my code (in this case, `String` needs `drop` so maybe there's a subtlety there)?
* Macro expansions (and `#derive`s) need care too - the expanded code coverage is potentially uninteresting (if you trust the proc-macro) or critical (if you don't).
I think Rust should behave the same as Clang here. And if I remember correctly, non-instantiated templates are reported.
* Similarly, should separate instantiations of the same generic be covered separately? If I test `Foo<i32>` in my UTs but use `Foo<String>` in my real code, have I really tested my code (in this case, String needs `drop` so maybe there's a subtlety there)?
llvm-cov at least has the option to show coverage per instantiation (or merged, via `-show-instantiations`). So I think it makes sense to provide this as well.
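To make the instantiation question concrete, here is a small sketch: one generic function used at two types. With per-instantiation reporting (as `llvm-cov -show-instantiations` does for C++ templates), the two uses below would be counted separately; merged reporting would fold them into one:

```rust
use std::fmt::Debug;

// One generic function, two monomorphized instantiations.
// Per-instantiation coverage would report wrap::<i32> and
// wrap::<String> as separate entities.
fn wrap<T: Debug>(value: T) -> String {
    format!("{:?}", value)
}

fn main() {
    assert_eq!(wrap(1i32), "1");                  // exercises wrap::<i32>
    assert_eq!(wrap(String::from("a")), "\"a\""); // exercises wrap::<String>
    println!("ok");
}
```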
Great write-up @bossmc! I'll take a stab at adding LLVM coverage map generation, unless it's already being worked on elsewhere?
Whilst I'm happy to try and figure out a solution on my own, a little direction from the compiler team would no doubt ensure that I arrive at something that's acceptable far sooner! Would anyone be willing to mentor this?
Cc @michaelwoerister
I don't have time to mentor this but feel free to ask specific questions!
@bossmc Are you sure that `-Zpgo-gen` (now stabilised as `-C profile-generate`) adds LLVM's `-pgo-instr-gen` and `-instrprof` passes? My understanding is that it "merely" instruments via gcov, which is the very issue under discussion here? Have I missed something?
`-C profile-generate` is exactly the same as Clang's `-fprofile-generate`.
Thanks @michaelwoerister, that's what I thought—the difference being, AIUI, that `-fprofile-generate` instruments the IR whereas `-fprofile-instr-generate` (which this issue seeks) effectively instruments the AST (via source mapping).
Maybe debuginfo can help there?
Unfortunately, for the reasons @kennytm gave in the comment to which @mssun linked above, debuginfo is not sufficient.
@eggyal yes, `profile-generate` enables two passes:

* `pgo-instr-gen`, which heuristically adds http://llvm.org/docs/LangRef.html#llvm-instrprof-increment-intrinsic (and related intrinsics) to the LLVM IR, basically by adding them to the beginning of code blocks.
* `instrprof`, which lowers the above intrinsics to "real code". I assume these are intrinsics so the implementation of profile counting can be smart (threadsafe in threaded programs, using native types where needed, etc.).

Together (with the profile-rt library) these allow generation and subsequent use of profiling data during compilation. Unfortunately, together these two passes are insufficient for code-coverage usage, since the LLVM IR is too far away from the AST. Part 4 of my write-up discusses writing the coverage mapping code that maps AST nodes to instrumentation intrinsics - this is what then enables frontend-guided coverage.
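Conceptually, what the `instrprof` lowering produces is just an array of per-region counters bumped at runtime. This sketch models that in plain Rust (the real lowering uses LLVM intrinsics and dedicated linker sections, not a static array like this):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hand-rolled stand-in for the counters that instrumentation would emit:
// one slot per instrumented region, incremented on entry.
static COUNTERS: [AtomicU64; 2] = [AtomicU64::new(0), AtomicU64::new(0)];

fn instrumented(n: i32) -> bool {
    COUNTERS[0].fetch_add(1, Ordering::Relaxed); // region 0: function entry
    if n > 0 {
        COUNTERS[1].fetch_add(1, Ordering::Relaxed); // region 1: "then" block
        true
    } else {
        false // no counter here: derivable as region 0 minus region 1
    }
}

fn main() {
    for n in [-1, 2, 3] {
        instrumented(n);
    }
    let entry = COUNTERS[0].load(Ordering::Relaxed);
    let then = COUNTERS[1].load(Ordering::Relaxed);
    assert_eq!(entry, 3);
    assert_eq!(then, 2);
    assert_eq!(entry - then, 1); // the "else" count, computed rather than stored
    println!("ok");
}
```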
I suspect that adding the coverage map will require disabling the `pgo-instr-gen` pass and adding the intrinsics manually (this seems to be what Clang calls `profile-instr-generate`?), because there's Rust-specific knowledge that can/should be applied (e.g. generic expansions, semantically equivalent AST changes, etc.) and because there's no way to guess the profiling entries that will be generated by `pgo-instr-gen`.
https://github.com/rust-lang/rust/pull/48346/files#diff-2030f049f8df2454e3720ea9403bc14aR446 is where we started adding the `pgo-instr-gen` pass.
I am very eager to see this feature work for Rust. I was the person who designed and originally implemented Clang's profile-instr-generate feature (for both PGO and code coverage). I'm currently working on a project that is implemented in Rust and I miss the ability to get good code coverage data. I'm not yet familiar with the internal workings of the Rust compiler, but I'd be happy to consult with anyone who is working on it. I can start by confirming that to get the best results, the instrumentation should be added in the Rust frontend by walking the AST, prior to any internal lowering, so that the AST corresponds closely to the source constructs. Adding instrumentation in an LLVM pass can work OK for PGO but it's not what you want for code coverage.
Thank you both @bossmc and @bob-wilson — all very helpful! I've spent some time over the past week getting to grips with how instrprof is implemented in Clang, and have begun prototyping some broad structures of a Rust implementation.
I had pretty much come to exactly the conclusion you mention, @bob-wilson: in order to closely align coverage with the source, instrumentation will need to be added before the AST is lowered (or at least, before HIR is lowered to MIR—this seems slightly better to me as macros will then have been expanded). However, in order to maintain backend-independence, this presumably means that new instrumentation intrinsics will need to be introduced into the MIR (and possibly the HIR)?
As an intermediate step before working on that goal, I am currently thinking that I will first instrument the MIR on lowering to LLVM IR—and thereby prove that I have both instrumentation and coverage map generation working at that level first. But perhaps that would be unnecessarily arduous and I would be best to start with the AST/HIR?
Otherwise I do think that I have a reasonable feel right now for my next steps, but am more than happy to work on this with anyone else who'd like to contribute! Also very grateful to those who've offered their input, especially since this is my first foray into the internals of rustc.
I think the instrumentation and coverage map should be generated from the ASTs after macro expansion and before lowering to HIR. Adding the instrumentation is relatively easier than generating the coverage mapping, so I wouldn't bother doing that on MIR first -- you would probably end up throwing most of it away. I do agree with you that it would be best to start with the instrumentation, and adding instrumentation intrinsics to HIR and MIR seems like the right thing to do.
Another thing that I've been pondering is exactly what should be excluded from the coverage map? Clang excludes system headers, but perhaps we should exclude all dependencies from outside the package/workspace being built?
Can anyone confirm the semantics of `syntax::source_map::SourceFile::is_imported(&self) -> bool`?
I think (though I suspect @bob-wilson will know better and can feel free to correct me) that the coverage format has the capability to represent "macros" (remote code) sensibly (https://llvm.org/docs/CoverageMappingFormat.html#file-id) so maybe we'd want to apply instrumentation directly to the parse tree?
@bossmc: Aye, expansion regions are perfect for macros—and they can be fully generated from the Span info in the HIR, whereas the parse tree would not have the expanded code to reference (thus requiring subsequent modification of any found expansion regions to complete the referencing).
Ah yes, I'm being slow :) - so long as the expanded code knows it came from a macro then HIR is the correct point indeed.
Based on the RustC guide, I had the impression that HIR has a number of lowering changes that diverge from a 1-1 mapping to the source code. Trying to map back to the exact source locations for the original constructs seems like it may be difficult. The RustC guide also describes macro expansion as happening prior to lowering to HIR. Is that out of date? It would definitely be best to insert the instrumentation and build the coverage mapping at a point where macros have been expanded, but I was hoping that could be done before other lowering transformations.
Perhaps one of the seasoned rustc developers could chip in on this, as I'm not entirely certain—but after digging through the source for a bit it seems to me that MIR is perfect for instrumentation and source-mapping: I suspect that the coverage regions for a given function are exactly the MIR `Statement` descendants of its `Body` that are `StatementKind::Assign`, and their source locations can readily be obtained from their Span data (yielding either a code region or an expansion region); skipped regions might even be identifiable by `Statement` descendants that are `StatementKind::Nop`—although perhaps some such regions will not have any presence whatsoever in the MIR?... which then might only leave gap regions (that I confess I don't yet entirely understand).
which then might only leave gap regions (that I confess I don't yet entirely understand)
Edit: see below.

I understood it to be for an empty statement, e.g. `if (condition) {/*gap*/}`?

At least that's what I'm gathering from CoverageMapping.h:

/// A GapRegion is like a CodeRegion, but its count is only set as the
/// line execution count when its the only region in the line.
Although you may very well be able to implement some kind of code coverage support at the MIR level, that would be a different approach than what we did in Clang. One key idea from Clang's implementation is that both the instrumentation and the coverage map are directly tied to the source constructs. As one example, take an "if" statement. The instrumentation is done by inserting a counter at the start of the "then" block. If there is an "else" block, there's no need to put a counter there because you can calculate the execution count from the parent count and the "then" block count. This also holds true in the coverage map, where the goal is to precisely map the execution counts back to source locations.

So why should this be done before any lowering? As an example, take one thing that I saw in the RustC Guide: "if let" is lowered to HIR as a "match". A match statement can have any number of arms, so it generally needs to be instrumented differently than an "if" statement. When lowering from an "if let", presumably there is only one arm (or two, if there was an "else") but even if there is a way to recognize that, it would need to be a special case and I suspect it would get quite complicated. Could you just go ahead and instrument it as a "match"? Sure, but then you're likely to run into challenges mapping back to the exact source locations.

I can give an example of a gap region that is somewhat relevant here. Say that you put a blank line between the "if let" condition and the start of the "then" block. What execution count should be displayed for that blank line? Clang treats this as a gap region from the end of the condition to the start of the "then", and it chooses to use the "then" block count for that region. I haven't given it a lot of thought, but I think the answer may be different for a "match" statement. It would be strange for that blank line to show the execution count of the first arm of the match.
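The lowering described above is easy to demonstrate in source form: the two functions below are semantically identical, which is precisely why a post-lowering instrumenter would see them the same way even though a user might expect different coverage presentation for each:

```rust
// `if let` and its (hand-written) `match` desugaring. After lowering,
// both forms look like a two-armed match, so distinguishing them for
// coverage presentation needs pre-lowering information.
fn describe(opt: Option<i32>) -> &'static str {
    if let Some(_) = opt { "some" } else { "none" }
}

fn describe_desugared(opt: Option<i32>) -> &'static str {
    match opt {
        Some(_) => "some",
        _ => "none",
    }
}

fn main() {
    assert_eq!(describe(Some(1)), describe_desugared(Some(1)));
    assert_eq!(describe(None), describe_desugared(None));
    println!("ok");
}
```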
That's really helpful, thanks @bob-wilson.
I'm definitely not averse to injecting the instrumentation earlier (it was what I had earlier thought would be needed), but I'm still not completely convinced that MIR lacks the necessary information to produce the desired result. Taking the example you mention, of `if` being lowered to a `match`, the generated MIR `TerminatorKind::SwitchInt` will have the relevant number of `.values`/`.targets`—and a counter expression can be used in place of an outright counter if appropriate, irrespective of whether the original source was an `if` or a `match`. So far from being a special case, it actually enables `match` expressions with fewer arms to be handled with counter expressions in exactly the same way as an `if`. Nor do I think there's any issue mapping back to source locations, as the containing `Terminator` structure has `.source_info` that contains the generating Span.
Part of my comment with that example is that I suspect you might want different behavior for the gap region after the "if let" condition than for the gap region after a match condition with 1 arm, and you may not be able to distinguish them easily after lowering.
The other missing part of this discussion relates to using these profiles for PGO in the context where someone wants to save a profile as part of their project (e.g., in a case where collecting a good profile is labor intensive). We had a design goal that the profile should remain usable in that scenario even if you had updated your compiler version so that the internal lowering changed. We did that by tying the profile structure to the source code structure as reflected in the pre-lowering ASTs. I don't know if Rust will ever support that style of instrumentation-based PGO, but I think it is still a valuable feature for some styles of development and it would make sense to me to implement the coverage profiling in a way that would align with that in case someone wants to support it in the future.
More generally, I think you can probably follow Clang's lead in many aspects of this work. It's an approach that has been tried and works well in practice. If there's a compelling reason to do it differently in Rust, then it might make sense, but otherwise, my advice is to follow Clang's approach.
Is there even a need to leave out the counter on the `else` block? Isn't that one of the reasons that Clang's coverage does not work reliably when exceptions are thrown?
@pJunger: It's not "leaving out" the counter, but rather providing means for deducing its value with zero runtime overhead.
@bob-wilson: I'm not sure I'm following the thinking re gap region differences following `if let` and `match`—would you be able to elaborate with an example? You're no doubt right about the PGO side of things (especially when rustc's internal structures are still quite fluid) and the principle of following Clang generally... I'll give it some more thought! My only inkling for starting from MIR is that it has essentially done a lot of the grunt work already, and starting from the AST would be duplicating much of that work.
Leaving out the "else" counter reduces the runtime overhead of the instrumentation and produces smaller profile files. There might be some 2nd order effects related to throwing exceptions, but the primary trouble with exceptions is that we thought it would be too expensive to insert a counter after every function call. The issue here is that throwing an exception can essentially cause a difference between the execution count going into a function call and the count coming out of it. The brute force approach to detect that would be to put a counter after every call. You might be able to do something clever with automatically generated exception handlers, but I'm not aware of anyone who has looked into that.
@eggyal : Consider "if let condition" followed by a blank line vs. "match condition" followed by a blank line. What execution count should be displayed for that blank line? For the "if", Clang shows the "then" count. The corresponding behavior for the "match" (or a match lowered from "if let") would be to show the count for the first arm of the match. That just seems wrong to me.
Okay, I understand... although wouldn't this be more a coverage tool issue rather than an instrumentation issue? Indeed, shouldn't the coverage tool show the execution count of any lines between if/match arms as being either skipped or that of the containing `if { }`/`match { }` region itself—i.e. without any need for explicit gap regions?
This is definitely outside my area of expertise, but one observation: the `if let` vs `match` distinction is captured in the MatchSource field of each `hir::ExprKind::Match` value.

Yes, it's a coverage tool issue. There are a lot of cases like this where there is no clear right answer and the best compromises require understanding the semantics of the code. (I.e., you want the compiler to figure it out instead of relying on some pattern matching in an IDE editor.) The instrumentation part is relatively easier than the coverage mapping, and you definitely want the instrumentation and coverage mapping to be generated from the same ASTs, so be sure to think about the coverage tool issues when deciding where to insert instrumentation.
Regarding the gap regions, I'm not the expert on that part, but I remember that there was quite a lot of iteration trying to come up with solutions that most people were happy with. It's really not easy. For example, for blank lines between match arms, you don't want them to show up as being unexecuted, but using the count from the containing region can give very surprising results, e.g., if none of the match arms ever execute but the blank lines between them have large non-zero counts from the enclosing region. Most people seem to expect that those blank lines will be included as part of the code region right before them, but then you run into problems with gaps after control flow.
@rkruppe Clang does not generate coverage instrumentation from LLVM IR. That's way too low level to do a good job. For gcov-style instrumentation, we used to rely on debug info to get back to source locations, and that really doesn't work well in practice. Debug info tends to be optimized to work well for single stepping in a debugger, and it's not so good for precisely identifying code coverage regions.
@bob-wilson Obviously this is better for runtime performance & data sizes.
I'm just asking because in the automotive industry every decision & condition has to be covered (including those that throw/panic), so taking over the design from Clang 1:1 would make this instrumentation useless for that. (And this is a bit sad, because otherwise you could interpolate condition/decision coverage from those counters and compete with Bullseye Coverage and the likes.)
@bob-wilson
@rkruppe Clang does not generate coverage instrumentation from LLVM IR. That's way too low level to do a good job. For gcov-style instrumentation, we used to rely on debug info to get back to source locations, and that really doesn't work well in practice. Debug info tends to be optimized to work well for single stepping in a debugger, and it's not so good for precisely identifying code coverage regions.
Yes, I draw the analogy between generating instrumentation from MIR and generating it from LLVM IR to suggest that generating it from MIR won't be any better.
Regarding the gap regions, I'm not the expert on that part, but I remember that there was quite a lot of iteration trying to come up with solutions that most people were happy with. It's really not easy. For example, for blank lines between match arms, you don't want them to show up as being unexecuted, but using the count from the containing region can give very surprising results, e.g., if none of the match arms ever execute but the blank lines between them have large non-zero counts from the enclosing region. Most people seem to expect that those blank lines will be included as part of the code region right before them, but then you run into problems with gaps after control flow.
Not that I propose addressing it, but this does rather sound like a UI issue, since the execution counts relate to regions rather than lines but (presumably for historical reasons) the tools only display line counts... these problems seem to stem from that disparity? Anyway, I almost have both instrumentation and mapping generation working from MIR and hope from that to glean a much better understanding of these problems! Will then work on moving to the AST instead.
@bob-wilson: I'm delighted to say that I now recognise the folly of trying to generate these mappings from the MIR: it's not practical to comprehend program flow through the BasicBlocks well enough to establish the code regions. Nevertheless I'm pleased that I undertook the exercise to understand the problem better—thank you for patiently indulging me whilst I caught up with you!
By the way, maybe this goes without saying, but I definitely recommend following the Clang implementation as closely as makes sense. For the instrumentation code, see lib/CodeGen/CodeGenPGO.{h,cpp}. The coverage mapping code is in lib/CodeGen/CoverageMappingGen.{h,cpp}.
I guess one difference will/should be that `return` and `?` need to be handled?
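To illustrate why `?` needs special handling: it introduces an implicit early-return edge with no explicit branch in the source, so coverage needs a counter (or counter expression) for the hidden `Err` path:

```rust
use std::num::ParseIntError;

// The `?` on `parse()` hides an early return: the Err path exits the
// function without reaching `Ok(n * 2)`, so an instrumenter must count
// that edge even though no `if`/`match` appears in the source.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?; // implicit early-exit edge on Err
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_and_double("21").unwrap(), 42);
    assert!(parse_and_double("x").is_err());
    println!("ok");
}
```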
There are ways to more or less easily obtain code coverage information from rust binaries, some of which are in widespread use (e.g. travis-cargo + gcov + coveralls.io). However, these are either platform specific (gcov/kcov are linux only), or incomplete (coverage for documentation tests is not collected).
It would be better if rustc would be able to instrument rust binaries and tests using LLVM Coverage Instrumentation. This would allow code coverage information to work portably across the supported platforms, as well as produce reliable coverage information.
Ideally, such support would be integrated in an easy-to-use `cargo coverage` subcommand and the Rust Language Server, such that IDEs can use this information to guide the user while writing tests.