rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Binaries generate malformed coverage profile when built with `-Z instrument-coverage -C link-dead-code -O` #79175

Closed Mrmaxmeier closed 3 years ago

Mrmaxmeier commented 3 years ago

llvm-cov refuses to load coverage data for this code when running rustc with optimizations enabled:

fn bar() {
    loop {}
}

pub trait Trait {
    fn foo(&self) {
        bar();
    }
}

impl Trait for u8 {}

fn main() {
    println!("hi")
}

Here's how I'm generating the coverage data:

#!/bin/sh
LLVM_TOOLCHAIN=$HOME/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin/

rm -rf main target default.profdata default.profraw

# RUSTFLAGS="-Z instrument-coverage" cargo run --verbose --release

rustc main.rs -O -o main -Z instrument-coverage
./main

$LLVM_TOOLCHAIN/llvm-profdata merge -sparse default.profraw -o default.profdata
$LLVM_TOOLCHAIN/llvm-cov show --instr-profile=default.profdata main

And what I'm observing:

+ rustc main.rs -O -o main -Z instrument-coverage
+ ./main
hi
+ /home/mrmaxmeier/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin//llvm-profdata merge -sparse default.profraw -o default.profdata
+ /home/mrmaxmeier/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin//llvm-cov show --instr-profile=default.profdata main
error: main: Failed to load coverage: Malformed instrumentation profile data

rustc --version --verbose:

rustc 1.50.0-nightly (c919f490b 2020-11-17)
binary: rustc
commit-hash: c919f490bbcd2b29b74016101f7ec71aaa24bdbb
commit-date: 2020-11-17
host: x86_64-unknown-linux-gnu
release: 1.50.0-nightly

cc @richkadel #79121

richkadel commented 3 years ago

Great bug report. I'll take a look.

Thanks!

richkadel commented 3 years ago

Partial update here...

I did some experimentation, making small changes to the code, and turning on and off the -O flag.

I was able to create a modified MCVE that also appears to change the LLVM IR only minimally.

Here is your sample code with a couple of additional lines, commented out. If I uncomment those lines, the example works as expected:

fn bar() {
    loop {}
}

pub trait Trait {
    fn foo(&self) {
     // if true {
            bar();
     // }
    }
}

impl Trait for u8 {}

fn main() {
    println!("hi")
}

If I uncomment these lines (by replacing slashes with spaces, just to minimize the changes to code locations propagated to lower-level representations), there are some small but notable changes:

  1. The symbol definitions for two symbols used by LLVM profiling are gone. Note, the removed variables would have been used by the LLVM instrprof.increment intrinsic to increment counters for code regions in the function Trait::foo():
@__profc__RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_ = private global [3 x i64] zeroinitializer, section "__llvm_prf_cnts", align 8
@__profd__RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_ = private global { i64, i64, i64*, i8*, i8*, i32, [2 x i16] } { i64 -5234861019717404860, i64 -7566766773726293094, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @__profc__RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_, i32 0, i32 0), i8* bitcast (void (i8*)* @_RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_ to i8*), i8* null, i32 3, [2 x i16] zeroinitializer }, section "__llvm_prf_data", align 8
  2. The function definition for Trait::foo() is still present, and nearly unchanged, but the 3 lines that increment the counter for Trait::foo() (which used the removed variables) are also removed.
  %pgocount1 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @__profc__RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_, i64 0, i64 1), align 8
  %1 = add i64 %pgocount1, 1
  store i64 %1, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @__profc__RNvYhNtCs4fqI2P2rA04_11issue_791755Trait3fooB5_, i64 0, i64 1), align 8

Those are the only significant changes.

In both the working and non-working examples, everything else is the same (with the exception of a couple of locally-scoped variable symbols, like %pgocount vs. %pgocount1, which can be ignored), including the fact that Trait::foo() is incrementing counters for bar() in both cases.

Note, bar() has an LLVM IR function definition in both cases as well, but it's never called. I assume Rust inlined the body of bar() into Trait::foo(), which would explain how the counters ended up there.

The @__profc__... and @__profd__... variables for bar() include function pointers to bar(), which may explain why the definition for function bar() cannot be removed.

Intermediate conclusion

I don't have a definitive answer yet, but I'm still investigating. I did reproduce the llvm-cov error message (granted, the message is very vague), without -O, after making a code change in an attempt to add uncalled closures to the coverage map. (Unused closures are normally not passed to codegen.)

I suspect (still to be proven) that the coverage map in the LLVM IR for your original example still includes references to counters from Trait::foo(), but the __prof*__ variables are no longer defined. If I'm right, both failing test cases have the same underlying cause.
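
To make the suspected failure mode concrete, here is a toy model in Rust. It is only a sketch of the hypothesis above, not LLVM's actual data layout: CoverageEntry, counters_referenced, and dangling_entries are hypothetical names invented for illustration. The idea is that a coverage-map entry that refers to counters is only usable if matching counter variables are still defined.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for one coverage-map entry: the function
/// it describes and how many counters its code regions refer to.
/// (This is NOT the real LLVM coverage mapping format.)
struct CoverageEntry {
    function: &'static str,
    counters_referenced: usize,
}

/// Returns the functions whose coverage-map entry refers to counters even
/// though no counter variables are defined for that function any more.
fn dangling_entries(
    map: &[CoverageEntry],
    defined_counters: &HashSet<&str>,
) -> Vec<&'static str> {
    map.iter()
        .filter(|e| e.counters_referenced > 0 && !defined_counters.contains(e.function))
        .map(|e| e.function)
        .collect()
}
```

In this model, the failing build would correspond to a map that still lists Trait::foo with 3 counters while the set of defined counters no longer contains Trait::foo.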

cc: @tmandry

richkadel commented 3 years ago

Quick note: PR #79109 includes several improvements. An important one is that we no longer automatically set -Clink-dead-code. (It's no longer recommended or needed for -Z instrument-coverage.)

I tried your sample with the modifications in PR #79109, and I no longer get the coverage failure with -O, unless I also add -Clink-dead-code.

I think we can mark this Issue resolved, once that PR lands.

tmandry commented 3 years ago

I think we should still track this somewhere, but I can rename the issue to include -C link-dead-code once #79109 lands which would make it much lower priority.

richkadel commented 3 years ago

Note, I suspect there are other combinations of compiler options and optimizations that could break coverage instrumentation, and they may not be easy to predict. The documented examples should work, though, and I'll try to make note of other options we expect to work as well (including -O, if confirmed for larger samples). I know cargo build --release also works.

richkadel commented 3 years ago

I can rename the issue to include -C link-dead-code once #79109 lands which would make it much lower priority.

@tmandry - Heads up, since #79109 has landed, can you add -C link-dead-code to the title, as you suggested?

Thanks!

briansmith commented 3 years ago

(It's no longer recommended or needed for -Z instrument-coverage.)

This is good to know, especially for people migrating from other coverage mechanisms. I didn't find it in the documentation in https://doc.rust-lang.org/nightly/unstable-book/compiler-flags/source-based-code-coverage.html. We should add such a statement to that doc.

briansmith commented 3 years ago

Also https://github.com/rust-lang/rust/issues/64685#issuecomment-737695767:

I implemented coverage for unused functions in a different way, which is actually more complete than anything I was getting from -C link-dead-code. The -C link-dead-code didn't help much before, and now it's redundant.

There is more discussion about this topic in that issue.

I'm guessing that a lot of people can work around this issue by simply removing -C link-dead-code from their code coverage scripts.
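
For scripts that assemble their flags programmatically, the workaround could look roughly like the sketch below. strip_link_dead_code is a hypothetical helper, not part of any existing tool. It assumes the flags are a plain whitespace-separated string (such as the value of RUSTFLAGS) and only handles the two bare spellings that appear in this thread, -C link-dead-code and -Clink-dead-code.

```rust
/// Removes `-C link-dead-code` / `-Clink-dead-code` from a whitespace-separated
/// flag string and returns the remaining flags joined by single spaces.
/// Assumption: no quoting or escaping is involved, and the flag carries no
/// explicit `=value` suffix.
fn strip_link_dead_code(flags: &str) -> String {
    let tokens: Vec<&str> = flags.split_whitespace().collect();
    let mut kept: Vec<&str> = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let tok = tokens[i];
        if tok == "-Clink-dead-code" {
            // Joined spelling: drop this one token.
            i += 1;
        } else if tok == "-C" && tokens.get(i + 1) == Some(&"link-dead-code") {
            // Spaced spelling: drop both `-C` and its argument.
            i += 2;
        } else {
            kept.push(tok);
            i += 1;
        }
    }
    kept.join(" ")
}
```

Other -C options, such as -C opt-level=3, are left untouched because the helper only drops a -C token when it is immediately followed by link-dead-code.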

richkadel commented 3 years ago

We should add such a statement to that doc.

The document does include the following statement:

Note that some compiler options, combined with -Zinstrument-coverage, can produce LLVM IR and/or linked binaries that are incompatible with LLVM coverage maps. For example, coverage requires references to actual functions in LLVM IR. If any covered function is optimized out, the coverage tools may not be able to process the coverage results. If you need to pass additional options, with coverage enabled, test them early, to confirm you will get the coverage results you expect.

I don't know if there's a strong reason to add a statement about this specific option. I wouldn't object if someone wants to make that change though.

Amanieu commented 3 years ago

This program generates a malformed coverage error from llvm-cov when built with -O -Zinstrument-coverage:

fn foo() {}
fn bar() {}

fn do_stuff(x: bool) {
    if x {
        foo()
    } else {
        bar()
    }
}

fn main() {
    do_stuff(false);
}

llvm-cov works correctly when the program is built without -O.

I think this is the same root cause, but I can open a separate issue if you prefer.

Amanieu commented 3 years ago

I think this might be because of the order in which the LLVM instrumentation pass is executed relative to the other LLVM passes.

Clang runs the "Frontend instrumentation-based coverage lowering" pass very early, before any optimization passes, while rustc runs it at the end, after all the optimizations.
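
A rough way to check this ordering mechanically is to look at where -instrprof sits relative to -inline in a "Pass Arguments:" line. The Rust sketch below does that. pass_index and instrprof_before_inlining are hypothetical helper names, and using -inline as the reference point is just one convenient choice for illustration.

```rust
/// Index of the first occurrence of `pass` among the `-name` tokens of a
/// "Pass Arguments:" line, or None if the pass is not listed.
fn pass_index(pass_arguments: &str, pass: &str) -> Option<usize> {
    pass_arguments
        .split_whitespace()
        // Skip the "Pass" and "Arguments:" words; keep only pass names.
        .filter(|tok| tok.starts_with('-'))
        .position(|tok| tok == pass)
}

/// Some(true) if `-instrprof` is scheduled before `-inline`, Some(false) if
/// it comes after, and None if either pass is missing from the line.
fn instrprof_before_inlining(pass_arguments: &str) -> Option<bool> {
    let instr = pass_index(pass_arguments, "-instrprof")?;
    let inline = pass_index(pass_arguments, "-inline")?;
    Some(instr < inline)
}
```

Fed the module-level "Pass Arguments:" line from each of the two dumps in this comment, this should return Some(false) for rustc and Some(true) for clang.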

rustc

$ rustc -O -Zinstrument-coverage test.rs --emit llvm-ir -Z print-llvm-passes
Pass Arguments:  -tti -targetlibinfo -tbaa -scoped-noalias-aa -assumption-cache-tracker -ee-instrument -simplifycfg -domtree -sroa -early-cse -lower-expect
Target Transform Information
Target Library Information
Type-Based Alias Analysis
Scoped NoAlias Alias Analysis
Assumption Cache Tracker
  FunctionPass Manager
    Instrument function entry/exit with calls to e.g. mcount() (pre inlining)
    Simplify the CFG
    Dominator Tree Construction
    SROA
    Early CSE
    Lower 'expect' Intrinsics
Pass Arguments:  -tti -targetlibinfo -tbaa -scoped-noalias-aa -assumption-cache-tracker -profile-summary-info -annotation2metadata -forceattrs -inferattrs -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -inline -openmpopt -function-attrs -domtree -sroa -basic-aa -aa -memoryssa -early-cse-memssa -speculative-execution -aa -lazy-value-info -jump-threading -correlated-propagation -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -libcalls-shrinkwrap -loops -postdomtree -branch-prob -block-freq -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -pgo-memop-opt -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -memoryssa -lazy-branch-prob -lazy-block-freq -licm -loop-unswitch -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-idiom -indvars -loop-deletion -loop-unroll -sroa -aa -mldst-motion -phi-values -aa -memdep -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -gvn -sccp -demanded-bits -bdce -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -lazy-value-info -jump-threading -correlated-propagation -postdomtree -adce -basic-aa -aa -memoryssa -memcpyopt -dse -loops -loop-simplify -lcssa-verification -lcssa -aa -scalar-evolution -lazy-branch-prob -lazy-block-freq -licm -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -barrier -elim-avail-extern -basiccg -rpo-function-attrs -globalopt -globaldce -basiccg -globals-aa -domtree -float2int 
-lower-constant-intrinsics -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -postdomtree -branch-prob -block-freq -scalar-evolution -basic-aa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -lazy-branch-prob -lazy-block-freq -loop-load-elim -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -domtree -vector-combine -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -memoryssa -loop-simplify -lcssa-verification -lcssa -scalar-evolution -lazy-branch-prob -lazy-block-freq -licm -opt-remark-emitter -transform-warning -alignment-from-assumptions -strip-dead-prototypes -globaldce -constmerge -mergefunc -cg-profile -domtree -loops -postdomtree -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -div-rem-pairs -simplifycfg -instrprof -annotation-remarks
Target Transform Information
Target Library Information
Type-Based Alias Analysis
Scoped NoAlias Alias Analysis
Assumption Cache Tracker
Profile summary info
  ModulePass Manager
    Annotation2Metadata
    Force set function attributes
    Infer set function attributes
    Interprocedural Sparse Conditional Constant Propagation
      FunctionPass Manager
        Dominator Tree Construction
    Called Value Propagation
    Global Variable Optimizer
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Promote Memory to Register
    Dead Argument Elimination
    FunctionPass Manager
      Dominator Tree Construction
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Natural Loop Information
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Simplify the CFG
    CallGraph Construction
    Globals Alias Analysis
    Call Graph SCC Pass Manager
      Remove unused exception handling info
      Function Integration/Inlining
      OpenMP specific optimizations
      Deduce function attributes
      FunctionPass Manager
        Dominator Tree Construction
        SROA
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Memory SSA
        Early CSE w/ MemorySSA
        Speculatively execute instructions if target has divergent branches
        Function Alias Analysis Results
        Lazy Value Information Analysis
        Jump Threading
        Value Propagation
        Simplify the CFG
        Dominator Tree Construction
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Conditionally eliminate dead library calls
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        PGOMemOPSize
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Tail Call Elimination
        Simplify the CFG
        Reassociate expressions
        Dominator Tree Construction
        Natural Loop Information
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Scalar Evolution Analysis
        Loop Pass Manager
          Rotate Loops
        Memory SSA
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Loop Pass Manager
          Loop Invariant Code Motion
          Unswitch loops
        Simplify the CFG
        Dominator Tree Construction
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Scalar Evolution Analysis
        Loop Pass Manager
          Recognize loop idioms
          Induction Variable Simplification
          Delete dead loops
          Unroll loops
        SROA
        Function Alias Analysis Results
        MergedLoadStoreMotion
        Phi Values Analysis
        Function Alias Analysis Results
        Memory Dependence Analysis
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Global Value Numbering
        Sparse Conditional Constant Propagation
        Demanded bits analysis
        Bit-Tracking Dead Code Elimination
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Lazy Value Information Analysis
        Jump Threading
        Value Propagation
        Post-Dominator Tree Construction
        Aggressive Dead Code Elimination
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Memory SSA
        MemCpy Optimization
        Dead Store Elimination
        Natural Loop Information
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Function Alias Analysis Results
        Scalar Evolution Analysis
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Loop Pass Manager
          Loop Invariant Code Motion
        Simplify the CFG
        Dominator Tree Construction
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
    A No-Op Barrier Pass
    Eliminate Available Externally Globals
    CallGraph Construction
    Deduce function attributes in RPO
    Global Variable Optimizer
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
    Dead Global Elimination
    CallGraph Construction
    Globals Alias Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Float to int
      Lower constant intrinsics
      Dominator Tree Construction
      Natural Loop Information
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Scalar Evolution Analysis
      Loop Pass Manager
        Rotate Loops
      Loop Access Analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Loop Distribution
      Post-Dominator Tree Construction
      Branch Probability Analysis
      Block Frequency Analysis
      Scalar Evolution Analysis
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Loop Access Analysis
      Demanded bits analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Inject TLI Mappings
      Loop Vectorization
      Canonicalize natural loops
      Scalar Evolution Analysis
      Function Alias Analysis Results
      Loop Access Analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Loop Load Elimination
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Simplify the CFG
      Dominator Tree Construction
      Optimize scalar/vector ops
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Natural Loop Information
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Scalar Evolution Analysis
      Loop Pass Manager
        Unroll loops
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Memory SSA
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Scalar Evolution Analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Loop Pass Manager
        Loop Invariant Code Motion
      Optimization Remark Emitter
      Warn about non-applied transformations
      Alignment from assumptions
    Strip Unused Function Prototypes
    Dead Global Elimination
    Merge Duplicate Global Constants
    Merge Functions
    Call Graph Profile
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Natural Loop Information
      Post-Dominator Tree Construction
      Branch Probability Analysis
      Block Frequency Analysis
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Scalar Evolution Analysis
      Block Frequency Analysis
      Loop Pass Manager
        Loop Sink
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Remove redundant instructions
      Hoist/decompose integer division and remainder
      Simplify the CFG
    Frontend instrumentation-based coverage lowering
    FunctionPass Manager
      Annotation Remarks
Pass Arguments:  -domtree
  FunctionPass Manager
    Dominator Tree Construction
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Post-Dominator Tree Construction
    Branch Probability Analysis
    Block Frequency Analysis
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Post-Dominator Tree Construction
    Branch Probability Analysis
    Block Frequency Analysis
Pass Arguments:  -targetlibinfo -domtree -loops -lazy-branch-prob -lazy-block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Lazy Branch Probability Analysis
    Lazy Block Frequency Analysis

clang

$ clang test.c -fprofile-instr-generate -fcoverage-mapping -S -o ctest.ll -emit-llvm -O3 -mllvm -debug-pass=Structure
Pass Arguments:  -tti -targetlibinfo -tbaa -scoped-noalias -assumption-cache-tracker -ee-instrument -simplifycfg -domtree -sroa -early-cse -lower-expect
Target Transform Information
Target Library Information
Type-Based Alias Analysis
Scoped NoAlias Alias Analysis
Assumption Cache Tracker
  FunctionPass Manager
    Instrument function entry/exit with calls to e.g. mcount() (pre inlining)
    Simplify the CFG
    Dominator Tree Construction
    SROA
    Early CSE
    Lower 'expect' Intrinsics
Pass Arguments:  -tti -targetlibinfo -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -instrprof -forceattrs -inferattrs -domtree -callsite-splitting -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -inline -openmpopt -functionattrs -argpromotion -domtree -sroa -basic-aa -aa -memoryssa -early-cse-memssa -aa -lazy-value-info -jump-threading -correlated-propagation -simplifycfg -domtree -aggressive-instcombine -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -libcalls-shrinkwrap -loops -postdomtree -branch-prob -block-freq -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -pgo-memop-opt -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -memoryssa -licm -loop-unswitch -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll -mldst-motion -phi-values -aa -memdep -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -gvn -phi-values -basic-aa -aa -memdep -memcpyopt -sccp -demanded-bits -bdce -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -lazy-value-info -jump-threading -correlated-propagation -basic-aa -aa -phi-values -memdep -dse -aa -memoryssa -loops -loop-simplify -lcssa-verification -lcssa -scalar-evolution -licm -postdomtree -adce -simplifycfg -domtree -basic-aa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -barrier -elim-avail-extern -basiccg -rpo-functionattrs -globalopt -globaldce -basiccg -globals-aa -domtree -float2int -lower-constant-intrinsics 
-domtree -loops -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -postdomtree -branch-prob -block-freq -scalar-evolution -basic-aa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -lazy-branch-prob -lazy-block-freq -loop-load-elim -basic-aa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -domtree -loops -scalar-evolution -basic-aa -aa -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -inject-tli-mappings -slp-vectorizer -vector-combine -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -memoryssa -loop-simplify -lcssa-verification -lcssa -scalar-evolution -licm -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -transform-warning -alignment-from-assumptions -strip-dead-prototypes -globaldce -constmerge -cg-profile -domtree -loops -postdomtree -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basic-aa -aa -scalar-evolution -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -div-rem-pairs -simplifycfg
Target Transform Information
Target Library Information
Type-Based Alias Analysis
Scoped NoAlias Alias Analysis
Assumption Cache Tracker
Profile summary info
  ModulePass Manager
    Frontend instrumentation-based coverage lowering
    Force set function attributes
    Infer set function attributes
    FunctionPass Manager
      Dominator Tree Construction
      Call-site splitting
    Interprocedural Sparse Conditional Constant Propagation
      FunctionPass Manager
        Dominator Tree Construction
    Called Value Propagation
    Global Variable Optimizer
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Promote Memory to Register
    Dead Argument Elimination
    FunctionPass Manager
      Dominator Tree Construction
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Natural Loop Information
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Simplify the CFG
    CallGraph Construction
    Globals Alias Analysis
    Call Graph SCC Pass Manager
      Remove unused exception handling info
      Function Integration/Inlining
      OpenMP specific optimizations
      Deduce function attributes
      Promote 'by reference' arguments to scalars
      FunctionPass Manager
        Dominator Tree Construction
        SROA
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Memory SSA
        Early CSE w/ MemorySSA
        Speculatively execute instructions if target has divergent branches
        Function Alias Analysis Results
        Lazy Value Information Analysis
        Jump Threading
        Value Propagation
        Simplify the CFG
        Dominator Tree Construction
        Combine pattern based expressions
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Conditionally eliminate dead library calls
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        PGOMemOPSize
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Tail Call Elimination
        Simplify the CFG
        Reassociate expressions
        Dominator Tree Construction
        Natural Loop Information
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Scalar Evolution Analysis
        Loop Pass Manager
          Rotate Loops
        Memory SSA
        Loop Pass Manager
          Loop Invariant Code Motion
          Unswitch loops
        Simplify the CFG
        Dominator Tree Construction
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Scalar Evolution Analysis
        Loop Pass Manager
          Induction Variable Simplification
          Recognize loop idioms
          Delete dead loops
          Unroll loops
        MergedLoadStoreMotion
        Phi Values Analysis
        Function Alias Analysis Results
        Memory Dependence Analysis
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Global Value Numbering
        Phi Values Analysis
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Memory Dependence Analysis
        MemCpy Optimization
        Sparse Conditional Constant Propagation
        Demanded bits analysis
        Bit-Tracking Dead Code Elimination
        Function Alias Analysis Results
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
        Lazy Value Information Analysis
        Jump Threading
        Value Propagation
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Phi Values Analysis
        Memory Dependence Analysis
        Dead Store Elimination
        Function Alias Analysis Results
        Memory SSA
        Natural Loop Information
        Canonicalize natural loops
        LCSSA Verifier
        Loop-Closed SSA Form Pass
        Scalar Evolution Analysis
        Loop Pass Manager
          Loop Invariant Code Motion
        Post-Dominator Tree Construction
        Aggressive Dead Code Elimination
        Simplify the CFG
        Dominator Tree Construction
        Basic Alias Analysis (stateless AA impl)
        Function Alias Analysis Results
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
        Optimization Remark Emitter
        Combine redundant instructions
    A No-Op Barrier Pass
    Eliminate Available Externally Globals
    CallGraph Construction
    Deduce function attributes in RPO
    Global Variable Optimizer
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Post-Dominator Tree Construction
        Branch Probability Analysis
        Block Frequency Analysis
    Dead Global Elimination
    CallGraph Construction
    Globals Alias Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Float to int
      Lower constant intrinsics
      Dominator Tree Construction
      Natural Loop Information
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Scalar Evolution Analysis
      Loop Pass Manager
        Rotate Loops
      Loop Access Analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Loop Distribution
      Post-Dominator Tree Construction
      Branch Probability Analysis
      Block Frequency Analysis
      Scalar Evolution Analysis
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Loop Access Analysis
      Demanded bits analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Inject TLI Mappings
      Loop Vectorization
      Canonicalize natural loops
      Scalar Evolution Analysis
      Function Alias Analysis Results
      Loop Access Analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Loop Load Elimination
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Simplify the CFG
      Dominator Tree Construction
      Natural Loop Information
      Scalar Evolution Analysis
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Demanded bits analysis
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Inject TLI Mappings
      SLP Vectorizer
      Optimize scalar/vector ops
      Optimization Remark Emitter
      Combine redundant instructions
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Scalar Evolution Analysis
      Loop Pass Manager
        Unroll loops
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Combine redundant instructions
      Memory SSA
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Scalar Evolution Analysis
      Loop Pass Manager
        Loop Invariant Code Motion
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Warn about non-applied transformations
      Alignment from assumptions
    Strip Unused Function Prototypes
    Dead Global Elimination
    Merge Duplicate Global Constants
    Call Graph Profile
      FunctionPass Manager
        Dominator Tree Construction
        Natural Loop Information
        Lazy Branch Probability Analysis
        Lazy Block Frequency Analysis
    FunctionPass Manager
      Dominator Tree Construction
      Natural Loop Information
      Post-Dominator Tree Construction
      Branch Probability Analysis
      Block Frequency Analysis
      Canonicalize natural loops
      LCSSA Verifier
      Loop-Closed SSA Form Pass
      Basic Alias Analysis (stateless AA impl)
      Function Alias Analysis Results
      Scalar Evolution Analysis
      Block Frequency Analysis
      Loop Pass Manager
        Loop Sink
      Lazy Branch Probability Analysis
      Lazy Block Frequency Analysis
      Optimization Remark Emitter
      Remove redundant instructions
      Hoist/decompose integer division and remainder
      Simplify the CFG
    Print Module IR
Pass Arguments:  -domtree
  FunctionPass Manager
    Dominator Tree Construction
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Post-Dominator Tree Construction
    Branch Probability Analysis
    Block Frequency Analysis
Pass Arguments:  -targetlibinfo -domtree -loops -postdomtree -branch-prob -block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Post-Dominator Tree Construction
    Branch Probability Analysis
    Block Frequency Analysis
Pass Arguments:  -targetlibinfo -domtree -loops -lazy-branch-prob -lazy-block-freq
Target Library Information
  FunctionPass Manager
    Dominator Tree Construction
    Natural Loop Information
    Lazy Branch Probability Analysis
    Lazy Block Frequency Analysis
Pass Arguments:  -tti
Target Transform Information
  ModulePass Manager

Amanieu commented 3 years ago

@richkadel I've confirmed that reordering the passes fixes this issue.

richkadel commented 3 years ago

Thanks @Amanieu.

It's not clear to me from your posts whether you tested this with the changes from #83307. From the timing of your post, I assume so, but I'm also testing this (without your suggested fix) just to be sure, since my PR fixed -O incompatibilities in a lot of other tests.

#83307 only landed Thursday morning, and as far as I can tell, rustup update nightly hasn't worked since Thursday without --force, so most people wouldn't see the fix in nightly yet. But it is available at tip-of-tree, if that's how you're testing.

Amanieu commented 3 years ago

Actually you're right, it seems this issue was already fixed by #83307. Still, changing the pass ordering may help with the issues you are seeing with LLVM optimizations breaking coverage results.

Changing the pass order to match Clang may allow you to remove some of the hacks here, here and here.

richkadel commented 3 years ago

> Changing the pass order to match Clang may allow you to remove some of the hacks here, here and here.

Hmm, that's an interesting theory and probably worth trying.

How did you change the order? Do you have a patch you can share?

richkadel commented 3 years ago

Can we close this issue as "Fixed"?

Amanieu commented 3 years ago

Here's the patch I'm using: https://github.com/rust-lang/rust/compare/master...Amanieu:instrprof-order

Looking at the generated LLVM IR, it seems that this is fixed by #83307 because the LLVM IR now marks these functions as hidden instead of internal, which preserves them long enough to reach the instrumentation pass after all the optimization passes. Closing as fixed.
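
For anyone who wants to check the hidden-versus-internal observation on their own build, here is a minimal sketch that scans textual LLVM IR (for example, the `.ll` file produced by `rustc --emit=llvm-ir`) and reports how each function definition is marked. The names `Marker`, `classify_define` and `scan_ir` are hypothetical helpers invented for this illustration; they are not part of rustc or LLVM. The parsing is deliberately naive: it only looks at the keywords between `define` and the `@symbol`.

```rust
/// How a function definition in textual LLVM IR is marked.
/// (Hypothetical helper type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum Marker {
    /// The definition carries `internal` linkage.
    Internal,
    /// The definition carries `hidden` visibility.
    Hidden,
    /// Neither keyword appears before the symbol name.
    Other,
}

/// Classify a single line of LLVM IR.
/// Returns `None` unless the line is a `define ... @name(...)` line.
pub fn classify_define(line: &str) -> Option<(String, Marker)> {
    // Only function definitions are of interest; `declare` lines are skipped.
    let rest = line.trim_start().strip_prefix("define ")?;
    // Everything before '@' holds linkage, visibility, and the return type.
    let at = rest.find('@')?;
    let (attrs, tail) = rest.split_at(at);
    // The symbol name runs from just after '@' up to the parameter list.
    // Assumption: the symbol name itself contains no '('.
    let name_end = tail.find('(')?;
    let name = tail[1..name_end].trim_matches('"').to_string();

    let marker = if attrs.split_whitespace().any(|w| w == "internal") {
        Marker::Internal
    } else if attrs.split_whitespace().any(|w| w == "hidden") {
        Marker::Hidden
    } else {
        Marker::Other
    };
    Some((name, marker))
}

/// Collect the marker of every function definition in a block of IR text.
pub fn scan_ir(ir: &str) -> Vec<(String, Marker)> {
    ir.lines().filter_map(classify_define).collect()
}
```

Running `scan_ir` over the IR emitted before and after #83307 should show whether the otherwise-unused functions moved from `Marker::Internal` to `Marker::Hidden`, assuming the IR is emitted with the same flags as the failing build.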