sourcefrog opened this issue 2 years ago
So some thoughts:
Coverage stats done per test would currently need a run per test, which would likely be slower. There might be a way to do it by implementing a custom test runner that calls the profiler built-ins to get the stats with a single-threaded runner, but that feels like a lot of extra work and another tool entirely. It could be worked out statically to a degree, but things like dynamic dispatch, generics, etc. make that also pretty complex.
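For what it's worth, here's a very rough sketch of that "custom runner + profiler built-ins" idea, just to show the shape of it: run tests one at a time and flush a separate .profraw per test. It assumes the binary is built with `-C instrument-coverage` so the compiler-rt profile runtime (which defines these C symbols) is linked in, and `tests` is a hypothetical list of (name, function) pairs from a custom harness, not an existing API.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Symbols provided by the LLVM profile runtime when building with
// `-C instrument-coverage`.
extern "C" {
    fn __llvm_profile_set_filename(name: *const c_char);
    fn __llvm_profile_write_file() -> i32;
    fn __llvm_profile_reset_counters();
}

fn run_with_per_test_profiles(tests: &[(&str, fn())]) {
    // Keep the filename strings alive; the runtime may hold the raw pointer.
    let mut filenames: Vec<CString> = Vec::new();
    for (name, test_fn) in tests {
        unsafe { __llvm_profile_reset_counters() }; // drop counts from earlier tests
        test_fn();                                  // run one test, single-threaded
        let path = CString::new(format!("{name}.profraw")).unwrap();
        unsafe {
            __llvm_profile_set_filename(path.as_ptr()); // one output file per test
            __llvm_profile_write_file();                // flush this test's counters
        }
        filenames.push(path);
    }
}
```

Even with that working you'd still need to merge and map each per-test profile afterwards, so I agree it's effectively another tool.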
With `-C instrument-coverage` and processes being spawned, you'll end up with multiple profile data files that need merging. Without specifying a naming pattern for the output file that adds something like a pid or timestamp, spawned processes will remove each other's results files.
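The usual way around that, as far as I know, is to put `%p` (process id) and/or `%m` (binary signature) in the profile filename pattern. A minimal, illustrative sketch of a runner spawning `cargo test` that way (the paths and structure are made up, not anyone's actual code):

```rust
use std::process::Command;

// Spawn `cargo test` with a profile-file pattern containing %p (process id)
// and %m (binary signature), so concurrently spawned instrumented processes
// write distinct .profraw files instead of overwriting a shared default.
fn run_instrumented_tests(target_dir: &str) -> std::io::Result<std::process::ExitStatus> {
    Command::new("cargo")
        .arg("test")
        .env("RUSTFLAGS", "-C instrument-coverage")
        .env(
            "LLVM_PROFILE_FILE",
            format!("{target_dir}/coverage/%p-%m.profraw"),
        )
        .status()
}
```

The resulting .profraw files still need an explicit merge step afterwards.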
Tarpaulin is working on adding `-C instrument-coverage` support as an alternative collector, which will handle some of these difficulties (which users may not be aware of) for them, as well as generating different reporting formats and working with third-party reporting tools like coveralls and codecov.
Generally, I think keeping the collection of coverage stats to the users and using that to filter the mutants is a good first step. That means it can be plugged into existing setups that may use grcov, kcov, tarpaulin, cargo-llvm-cov, `-C instrument-coverage` directly, or GNATcoverage. The majority of these tools emit lcov reports, cobertura reports, or both (I think GNATcoverage is an outlier in terms of this, but I haven't used it personally). And for lcov there is a parsing library already: https://crates.io/crates/lcov
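As a rough sketch of what using that crate could look like, here's how the zero-hit functions might be pulled out of a report. I'm going from the `lcov` crate's `Reader`/`Record` API as I remember it, so treat the exact variant and field names as an assumption:

```rust
use lcov::{Reader, Record};

// Collect the names of functions that an lcov report says were never hit.
fn untested_functions(path: &str) -> Result<Vec<String>, Box<dyn std::error::Error>> {
    let mut untested = Vec::new();
    for record in Reader::open_file(path)? {
        // FNDA records carry a per-function hit count.
        if let Record::FunctionData { name, count } = record? {
            if count == 0 {
                untested.push(name);
            }
        }
    }
    Ok(untested)
}
```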
Unfortunately, I think the option to make it an easy "always on" thing won't work for a large number of projects that would need bespoke setup. As things start to stabilise and grow in maturity this should become possible, but I think there'll always be a selection of users with less conventional coverage needs. And for these users, being able to provide a pre-generated coverage report to cargo-mutants would probably be the preferred UX.
One example of something I've been planning is on-device embedded coverage using probe-rs, or an embedded no_std llvm profiler runtime and the embedded Rust defmt tools. I wouldn't expect this to make it into the Rust compiler at any point as it feels too bespoke, but if I got it working, then using cargo-mutants on embedded projects would be pretty cool.
Just my 2¢ :grin:
Thanks @xd009642.
I can't currently think of any practical way to do this, so I'm going to close the bug for now.
I thought about this some more after adding nextest support (#85), which does run one test at a time (more or less) and so would be a foundation for collecting coverage one test at a time.
I agree that getting coverage working well on any tree seems a bit fiddly today, so this might be hard to make work out of the box.
For the case originally suggested, of just entirely skipping uncovered code, it seems like the best thing would be for users to either add tests for that code, or manually mark it skipped in cargo-mutants. However, perhaps they want to parallelize working towards better tests using both coverage and mutants, rather than one after the other.
I think it could make sense to have an option like `--skip-spans` that avoids generating any mutants for the specified line-col ranges. (It's basically the inverse of `--in-diff`.) Then you could potentially feed that from coverage output. It seems like sometimes the mapping of coverage to source location is a bit noisy and heuristic, but this would at least approximately suppress most mutants from uncovered code.
Also, if this just accepted a format-independent list of `{file, start: (line, col), end: (line, col)}` then people could convert from whatever coverage or other format they have. We could later, as a convenience, accept some well-known formats.
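To make that concrete, one possible shape for such a list, plus the containment check, could be something like the following. This is purely hypothetical, not an existing cargo-mutants interface, and it assumes serde for reading whatever on-disk format is chosen:

```rust
use serde::Deserialize;
use std::path::{Path, PathBuf};

// A span of source to skip, in (line, column) coordinates.
#[derive(Debug, Deserialize)]
struct SkipSpan {
    file: PathBuf,
    start: (u32, u32), // (line, col)
    end: (u32, u32),   // (line, col)
}

// A mutant would be suppressed if its span falls entirely inside a skip span.
fn is_skipped(spans: &[SkipSpan], file: &Path, m_start: (u32, u32), m_end: (u32, u32)) -> bool {
    spans
        .iter()
        .any(|s| s.file.as_path() == file && s.start <= m_start && m_end <= s.end)
}
```

Whether suppression should require full containment or just overlap is an open question, especially given how noisy the coverage-to-source mapping can be.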
If we do run one test at a time, perhaps using nextest, and they emit coverage, then we can collect a map from test name to lines covered by that test. (Again, with the caveat that the coverage data is not 100% exact, and that some kinds of test might not collect coverage well.)
By inverting this map we could see which tests could potentially catch a bug in some given line, and then run only those tests. For very large crates this might give a significant improvement in performance, especially if they already expect to be tested under Nextest and so already pay the one-test-at-a-time performance cost.
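A sketch of what building and inverting that map could look like, assuming per-test coverage has already been reduced to a set of (file, line) pairs per test (how that reduction happens is the hard part, and is glossed over here):

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::path::PathBuf;

type Line = (PathBuf, u32);

// Invert "test name -> lines it covers" into "line -> tests that cover it".
fn tests_by_line(per_test: &BTreeMap<String, BTreeSet<Line>>) -> BTreeMap<Line, Vec<String>> {
    let mut by_line: BTreeMap<Line, Vec<String>> = BTreeMap::new();
    for (test, lines) in per_test {
        for line in lines {
            by_line.entry(line.clone()).or_default().push(test.clone());
        }
    }
    by_line
}
```

Then, for a mutant on a given line, the lookup in that map gives the candidate test list to run; a missing entry means no test reaches the line at all, which is the "skip it or flag it" case discussed above.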
So, just a small comment on some playing around with ideas in this area I'm working on. Recently I overhauled tarpaulin's reporting to better get function/method names, and to do so in a way that matches cargo-mutants. I then generated an lcov coverage report (as that format currently has function names), grabbed all the functions with 0 hits, and put them in a `.cargo/mutants.toml` `exclude_re` field, and that successfully filtered out mutations for functions that were untested. So I do have a workable version of this feature, given a bit of script glue between tarpaulin and cargo-mutants (sketched after the generated config below).
Abridged version of the lcov coverage report:

```
TN:
SF:/home/xd009642/personal/tarpaulin/meta/mutants_tester/src/lib.rs
FN:4,add
FN:19,Foo::five
FN:33,<impl Shiterator for Marker>::next
FN:39,<impl Foo for Marker>::four
FN:45,<impl Foo2 for Marker>::five
FN:51,<impl Display for Wrapper<T>>::fmt
FN:57,<impl Display for Wrapper<T>>::boo
FN:64,Wrapper<T>::unwrap
FN:70,Marker::marked
FN:76,nonsense
FN:77,nonsense::inner
FN:91,tests::it_works
FNF:12
FNDA:1,add
FNDA:0,Foo::five
FNDA:0,<impl Shiterator for Marker>::next
FNDA:0,<impl Foo for Marker>::four
FNDA:0,<impl Foo2 for Marker>::five
FNDA:0,<impl Display for Wrapper<T>>::fmt
FNDA:0,<impl Display for Wrapper<T>>::boo
FNDA:0,Wrapper<T>::unwrap
FNDA:0,Marker::marked
FNDA:0,nonsense
FNDA:0,nonsense::inner
FNDA:1,tests::it_works
```
Generated mutants.toml:

```toml
exclude_re = [
    "Foo::five",
    "<impl Shiterator for Marker>::next",
    "<impl Foo for Marker>::four",
    "<impl Foo2 for Marker>::five",
    "<impl Display for Wrapper<T>>::fmt",
    "<impl Display for Wrapper<T>>::boo",
    "Wrapper<T>::unwrap",
    "Marker::marked",
    "nonsense",
    "nonsense::inner"
]
```
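The "script glue" between that report and the config could be as small as the following: take the zero-hit function names (for example from the `untested_functions` sketch earlier) and write them out as `exclude_re`. The TOML is written by hand here just to keep the sketch dependency-free; none of this is part of tarpaulin or cargo-mutants:

```rust
use std::fs::File;
use std::io::Write;

// Write uncovered function names into .cargo/mutants.toml as exclude_re entries.
fn write_exclude_re(names: &[String]) -> std::io::Result<()> {
    let mut out = File::create(".cargo/mutants.toml")?;
    writeln!(out, "exclude_re = [")?;
    for name in names {
        // exclude_re entries are regexes, so names containing regex
        // metacharacters may need escaping in a real script.
        writeln!(out, "    \"{name}\",")?;
    }
    writeln!(out, "]")
}
```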
Discussed in https://github.com/sourcefrog/cargo-mutants/discussions/23
Yeah, good idea.

mutagen optionally does something like this according to its documentation, but I have not looked at the implementation.

We could take it a step further by understanding which tests run which function under test. Functions not reached by any test are apparently just not tested. Functions that are reached by some tests we can mutate, and then run only the relevant tests. This would potentially be dramatically faster on some trees.
That said, I think there are a few things that might make this annoying to implement reliably, though perhaps my preconceptions are out of date. In my experience, getting coverage files out of Rust has historically been a bit platform-dependent and fiddly to set up. And, historically, the output was in a platform-dependent format that required external preprocessing. Both of these are in tension with my goal of a very easy start with cargo-mutants.
However, there is now https://blog.rust-lang.org/inside-rust/2020/11/12/source-based-code-coverage.html providing `-Z instrument-coverage`, which is moving towards stabilization as `-C instrument-coverage`. So if this ends up with a way to just directly get a platform-independent coverage representation out of `cargo`, this might be pretty feasible.

Coverage may still raise some edge cases if the test suite starts subprocesses, potentially in different directories, as both cargo-mutants and cargo-tarpaulin seem to do. Will we still collect all the aggregate coverage info? But we could still offer it for trees where it does work well. And maybe it will be fine.
There might also be a hairy bit about mapping from a function name back to the right `cargo test` invocation to hit it. But that can probably also be done: if nothing else, perhaps by just running the test binary directly...

Possibly this could be done with https://github.com/taiki-e/cargo-llvm-cov
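For example, a tool could shell out to cargo-llvm-cov to produce an lcov report and then feed that into the kind of parsing sketched earlier. The exact invocation below is an assumption based on cargo-llvm-cov's documented `--lcov`/`--output-path` flags:

```rust
use std::process::Command;

// Ask cargo-llvm-cov to run the tests and emit an lcov report at `output`.
fn generate_lcov_report(output: &str) -> std::io::Result<std::process::ExitStatus> {
    Command::new("cargo")
        .args(["llvm-cov", "--lcov", "--output-path", output])
        .status()
}
```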