xd009642 / tarpaulin

A code coverage tool for Rust projects
https://crates.io/crates/cargo-tarpaulin
Apache License 2.0

memory leak when merging reports #1639

Open flavio opened 1 day ago

flavio commented 1 day ago

Describe the bug

We're using tarpaulin to collect coverage of the end-to-end tests of kwctl, a Rust CLI tool.

The tests are running fine and produce many profraw files. I've seen we are generating 138(!!!) files, maybe we're doing something stupid?!

It looks like there's a memory leak when these files are aggregated. The "Merging coverage reports" step takes a long time and, while running, the memory usage increases significantly. On my system I've witnessed tarpaulin consume up to 28 GB of memory.

When running inside of a GitHub Action, tarpaulin is killed because it's consuming too much memory.

To Reproduce

We're using tarpaulin 0.31.2, with Rust 1.82.0, on x86_64 Linux.

The issue can be reproduced in this way:

  1. Check out the kwctl code
  2. Ensure docker daemon is running. This is needed to start the registry image locally.
  3. Run make coverage-e2e-tests or just cargo tarpaulin --verbose --skip-clean --engine=llvm --all-features --implicit-test-threads --test e2e --out xml --out html --output-dir coverage/e2e-tests

You can also prepend GNU Time to these commands to get a rough idea of how much memory is being used:

/usr/bin/time -v make coverage-e2e-tests

Expected behavior

Memory usage should not be this high.

xd009642 commented 1 day ago

Each executable that runs generates a profraw file, so the count depends on how Rust's test harness decides to split things into binaries, and any spawned processes built as part of the tests will generate one as well.

As for the memory usage, unfortunately Rust flags are applied to every crate in your dependency tree, and all of them will be instrumented for coverage information as a result. This can result in an enormous amount of bloat, with every dependency instrumented with counters that often just stay at 0 if it's library functionality you never use...

Still there is likely some room for me to minimise this some more... Out of curiosity have you tried with cargo-llvm-cov and seen if it hits similarly high memory usage?
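Since each executable and spawned process writes its own profraw file, it can help to count them before the merge step runs. The following is a minimal sketch using only the Rust standard library. The function name `count_profraw` is hypothetical, and the directory to scan is an assumption: the thread does not say where tarpaulin writes these files, so pass whichever directory holds them in your setup.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively count files with a `.profraw` extension under `dir`.
/// Hypothetical helper for illustration; not part of tarpaulin.
/// Note: `is_dir` follows symlinks, so a symlink loop would recurse forever.
fn count_profraw(dir: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories and add their counts.
            count += count_profraw(&path)?;
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("profraw") {
            count += 1;
        }
    }
    Ok(count)
}
```

Calling `count_profraw(Path::new("target"))` after a test run would report how many files the merge step has to process, which makes it easier to tell whether a number like 138 comes from the test binaries themselves or from processes they spawn.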

flavio commented 1 day ago

Still there is likely some room for me to minimise this some more... Out of curiosity have you tried with cargo-llvm-cov and seen if it hits similarly high memory usage?

I've tried cargo-llvm-cov, its memory consumption is significantly lower:

    Finished report saved to /home/flavio/hacking/kubernetes/kubewarden/kwctl/target/llvm-cov/html
    Command being timed: "cargo llvm-cov --html"
    User time (seconds): 1710.16
    System time (seconds): 34.59
    Percent of CPU this job got: 590%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 4:55.32
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3972312
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 995
    Minor (reclaiming a frame) page faults: 5570003
    Voluntary context switches: 86879
    Involuntary context switches: 423985
    Swaps: 0
    File system inputs: 136
    File system outputs: 12062096
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
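To compare runs like the one above, the peak memory figure can be pulled out of the GNU Time output and converted to a more readable unit. This is a rough sketch: the function names `max_rss_kib` and `kib_to_gib` are hypothetical, and it assumes the label text matches the `/usr/bin/time -v` output shown here, with the "kbytes" value treated as units of 1024 bytes.

```rust
/// Extract the "Maximum resident set size (kbytes)" value from the text
/// printed by `/usr/bin/time -v`. Returns `None` if the line is missing
/// or the value is not a valid integer.
fn max_rss_kib(time_output: &str) -> Option<u64> {
    const LABEL: &str = "Maximum resident set size (kbytes):";
    time_output.lines().find_map(|line| {
        line.trim()
            .strip_prefix(LABEL)
            .and_then(|rest| rest.trim().parse::<u64>().ok())
    })
}

/// Convert KiB to GiB (1 GiB = 1024 * 1024 KiB).
fn kib_to_gib(kib: u64) -> f64 {
    kib as f64 / (1024.0 * 1024.0)
}
```

Applied to the output above, `max_rss_kib` yields 3972312, which `kib_to_gib` converts to roughly 3.8 GiB, against the up to 28 GB observed for tarpaulin on the same project.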