rust-fuzz / honggfuzz-rs

Fuzz your Rust code with Google-developed Honggfuzz !
https://crates.io/crates/honggfuzz
Apache License 2.0
448 stars 40 forks source link

refactor: use docopt, anyhow, fs-err, which crates to simplify and st… #51

Closed drahnr closed 3 years ago

drahnr commented 3 years ago

…ructure

This remedies some of the internal TODOs and yields better error messages.

Motivation is, to allow additional verifyable arguments to enhance practical usability without having to re-read various help texts on which param gets passed to which process.

drahnr commented 3 years ago

So docopt does not allow an unparsed chunk of trailing params as it seems :facepalm: going for clap next then.

Works

drahnr commented 3 years ago

Not sure what causes the linkage error __sanitizer_cov_trace_const_cmp8

drahnr commented 3 years ago

This is ready for review :)

PaulGrandperrin commented 3 years ago

Hello @drahnr, This is a big PR with lots of changes! Too big for me to review at once. I'll start with the first commit which already contains a few different things:

drahnr commented 3 years ago
* **docopt** : according to https://github.com/docopt/docopt.rs  this crate is unmaintained and the author recommends using `clap` or `structopt`. I'm already using `structopt` in other projects and I really like it but I thought that it might be a little overkill for such a simple use-case. For this simple project, I try very hard to only add dependencies when really useful and when the crate enjoys a healthy state of maintenance.
  I don't know how much `structopt` would add to the compile-time, but I fear that pulling `serde` and co will add a lot for little benefit.

Mostly went for it since I am in the same situation with docopt, and I like the fact that one writes the help message essentially, the rest is derived. It's an improvement over the current state of arg parsing either way. Given today's caching and the time needed to fuzz code vs build time, imho that argument is not really valid for a fuzzing wrapper :upside_down_face: I'd be in favor of merging it, it can be simplified in a few places before doing so - and if wanted somebody can replace it with smth that keeps a simpler approach if desired at some point in the future. Args is a decent input representation.

* **anyhow** : I like this crate and maybe it's worth adding! I'll add a few comments in the code later, but I saw many changes making the status code of the child process being printed out to `stdout` instead of being properly "forwarded" to the parent process exit code. My goal here was to make the `honggfuzz-rs`  wrapper as thin and transparent as possible (to minimize surprises and also maintenance).

In that case, an explicit encoding with thiserror and error variants might be the better approach. I would rather have it a bit thicker and provide more output as needed.

* **fs-err** : The crate idea is nice but it'd still rather stick to the `std` for the reasons cited above.

I disagree. fs-err adds very little overhead yet if something goes wrong, yields a significant improvement in the error message. It always pays off.

* **which**  is nice :)

:heavy_check_mark:

* `BuildType::ProfileWithGrcov` and `BuildType::Debug` have been refactored in a way that does not preserve the semantics of the previous code. Now `BuildType::Debug` will compile in `--release` mode which is not correct.

Err, I'll fix that right away. EDIT: I thought I fixed that, either way, postponed, see below.

* This commit also includes a lot of format changes and other minor changes which I guess might partially come from `cargo fmt` and/or `cargo clippy`. It would be easier to review them in separate and later commits, or better, PRs.

I tried to exclude most of these, but it seems I missed a few.

* It seems your PR is based on an obsolete version of `master` because some of the latest changes I made have been deleted by your commit somehow (see [a6cb56d](https://github.com/rust-fuzz/honggfuzz-rs/commit/a6cb56d450751695dd4fe259845800fec24eac94))

Yeah, I think that happend during a rebase.

Going to split this into smaller atomic PRs then.

PaulGrandperrin commented 3 years ago

Mostly went for it since I am in the same situation with docopt, and I like the fact that one writes the help message essentially, the rest is derived.

I don't get it, why not using structopt instead? It is the recommended approach and de-facto standard

Given today's caching and the time needed to fuzz code vs build time, imho that argument is not really valid for a fuzzing wrapper

I was thinking about the cargo build/cargo check time. The vast majority of the time, the devs and the CI will launch these commands without actually starting a fuzzing session. If I am not mistaken, adding dependencies will add to the developer's cycle duration and the CI load. Or it might incentivize them to remove the fuzzing setup code from their check/build path.

In that case, an explicit encoding with thiserror and error variants might be the better approach. I would rather have it a bit thicker and provide more output as needed.

Indeed but this part of the project is not a library and is basically a small script file with an easy stop-when-there-is-an error approach to error handling. I don't feel it is needed to formalize too much error paths. There is little to gain but code complexity.

I disagree. fs-err adds very little overhead yet if something goes wrong, yields a significant improvement in the error message. It always pays off.

I don't feel that way. Adding dependencies is always a liability too:

I would love if the fs-err devs upstreamed their work to std. Moreover, I would already use their crate for projects where difficult-to-debug file errors are a common issue but this is not the case here. std general maintenance quality cannot be understated.

Going to split this into smaller atomic PRs then.

That would be greatly appreciated, thx :)

drahnr commented 3 years ago

Mostly went for it since I am in the same situation with docopt, and I like the fact that one writes the help message essentially, the rest is derived.

I don't get it, why not using structopt instead? It is the recommended approach and de-facto standard

Given today's caching and the time needed to fuzz code vs build time, imho that argument is not really valid for a fuzzing wrapper

I was thinking about the cargo build/cargo check time. The vast majority of the time, the devs and the CI will launch these commands without actually starting a fuzzing session. If I am not mistaken, adding dependencies will add to the developer's cycle duration and the CI load. Or it might incentivize them to remove the fuzzing setup code from their check/build path.

In that case, an explicit encoding with thiserror and error variants might be the better approach. I would rather have it a bit thicker and provide more output as needed.

Indeed but this part of the project is not a library and is basically a small script file with an easy stop-when-there-is-an error approach to error handling. I don't feel it is needed to formalize too much error paths. There is little to gain but code complexity.

I disagree. fs-err adds very little overhead yet if something goes wrong, yields a significant improvement in the error message. It always pays off.

I don't feel that way. Adding dependencies is always a liability too:

* new versions break stuff

* bugs

It's a CLI wrapper, as such adding a few proven and 1.x versions don't imply having to chase the latest and greatest. If you can cleanup code and make it clearer and DRYer, I personally always prefer a crate or macro. But that might be just me :shrug:

* need to periodically check that the project is still maintained

Point taken.

* security issues (intentional or not) and trust are of importance for the kind of people using a fuzzer

Is that really relevant here? It's a CLI wrapper tool of a fuzzer - exposure is really minimal :)

* build time

Build time is a factor, but the dimensions here are really tiny, and as said, it's a fuzzer. If you want to avoid the dependencies for the lib splitting it into two indepedent crates can cut down the compilation time for the lib (which is what's going to be run repeatedly, rather than the binary wrapper) would yield some gain without limiting oneself on the crates to be used. I am happy to refactor the codebase in that direction if such a change would be welcomed :)

I would love if the fs-err devs upstreamed their work to std. Moreover, I would already use their crate for projects where difficult-to-debug file errors are a common issue but this is not the case here.

Puh, that's a broad statement and I disagree. A single occurence of missing permission and being faced without a path already is worth the dependency.

std general maintenance quality cannot be understated.

Not saying that at all, I am just looking at the gain/pain ratio and that's (imho) pretty good for fs-err.

Going to split this into smaller atomic PRs then.

That would be greatly appreciated, thx :)

Split it into a bunch of atomic commits, the change of anyhow and docopt is a bit intertwined to that's one commit right now. Could you note which commits you definitely want, and which ones are a maybe so I don't have to open 7 PRs just to then end up rebasing n-1 of them :)

Happy weekend, and thanks for you time reviewing this!

drahnr commented 3 years ago

I'll replace the docopt argparse with structopt at some point this week :crossed_fingers:

PaulGrandperrin commented 3 years ago

I haven't had the time to look at your other commits yet but I can tell you more about my "philosophy" and plans about the CLI. I was planning to redesign the CLI interface to make it a little more ergonomic. Right now the arguments to the wrapper are given on the command line and the arguments to the upstream fuzzer are given through an environment variable. I like that clear separation and I'd like to keep it but I'd like to make it nicer.

My plan was to change from this

HFUZZ_RUN_ARGS="-v -N 10000000 --run_time 120 --exit_upon_crash" cargo hfuzz run example

to this

cargo hfuzz run example -- -v -N 10000000 --run_time 120 --exit_upon_crash

where the delimiter would be --. This is the more standard way that I know for doing that king of things.

drahnr commented 3 years ago

I haven't had the time to look at your other commits yet but I can tell you more about my "philosophy" and plans about the CLI. I was planning to redesign the CLI interface to make it a little more ergonomic. Right now the arguments to the wrapper are given on the command line and the arguments to the upstream fuzzer are given through an environment variable. I like that clear separation and I'd like to keep it but I'd like to make it nicer.

My plan was to change from this

HFUZZ_RUN_ARGS="-v -N 10000000 --run_time 120 --exit_upon_crash" cargo hfuzz run example

Aware, and this is a rather annoying property of the wrapper right now, since every usage of honggfuzz (the lib) requires the same readme repeatedly. So I am all for getting rid of the requirement to pass the options that way and provide an alternative, standard path.

This is the more standard way that I know for doing that king of things.

I fully agree. That's what the docopt changes also would have impl'd (mostly due to a lack of any different impl).


I split this into two PRs, #54 and #53 without docopt since you are already working on the structopt impl - I am happy to review that as needed :) or take over if you don't have the time. The mentioned PRs should go in before the full structopt impl or splitting the crate into two.