Suggestion: Use a fuzzer / quickcheck for the auto tests?

porglezomp commented 6 years ago

I see there's:

// TODO tweak these functions to generate edge cases (zero, infinity, NaN) more often

I believe quickcheck or proptest gives us a way to do this, and shrinking would probably make the failing cases easier to understand.
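
To make the TODO concrete, here is a minimal sketch of a generator that returns edge cases far more often than uniform sampling would. It deliberately avoids quickcheck and proptest so that it stays self-contained. The names `XorShift32`, `EDGE_CASES` and `gen_f32`, and the one-in-four bias, are assumptions made for illustration and are not taken from the existing test generator.

```rust
/// Tiny xorshift32 PRNG so the sketch needs no external crates.
/// (Hypothetical helper, not part of the existing test generator.)
pub struct XorShift32(u32);

impl XorShift32 {
    pub fn new(seed: u32) -> Self {
        // xorshift must not start from 0, or it stays at 0 forever.
        XorShift32(if seed == 0 { 0x9E37_79B9 } else { seed })
    }

    pub fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Inputs that uniform sampling almost never produces.
pub const EDGE_CASES: [f32; 8] = [
    0.0,
    -0.0,
    f32::INFINITY,
    f32::NEG_INFINITY,
    f32::NAN,
    f32::MIN_POSITIVE,
    f32::MAX,
    f32::MIN,
];

/// Returns an edge case roughly one time in four (an arbitrary choice),
/// otherwise a float built from 32 random bits.
pub fn gen_f32(rng: &mut XorShift32) -> f32 {
    if rng.next_u32() % 4 == 0 {
        let index = rng.next_u32() as usize % EDGE_CASES.len();
        EDGE_CASES[index]
    } else {
        f32::from_bits(rng.next_u32())
    }
}
```

Building the non-edge values from raw bits rather than from a numeric range also covers subnormals and NaN payloads, at the cost of a distribution that is uniform over bit patterns instead of over values.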

On another front, using a fuzzer to generate the cases would give us coverage-guided testing, which might do a better job of exploring all the code paths.

It might also be possible to get Seer to help find those code paths. Since we aren't really doing allocation or using pointers, it should be able to do a pretty good job with these.

After I get my current PR working, I'd be interested in trying to make some of these improvements, if any sound good to you.

porglezomp commented 6 years ago

Also, we can probably exhaustively test the one-argument cases, although we wouldn't want to do it often.

japaric commented 6 years ago

The thing is that we have to compare against the MUSL implementation, otherwise we can't check for exact equality (it's usually not a good idea to check for exact equality when dealing with floats; the only reason we can do that here is that we are using the exact same implementation as MUSL's).
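
As a rough illustration of what "exact equality" can mean here, the sketch below compares two results bit for bit. The helper name `same_f32` is hypothetical, and treating any NaN as equal to any other NaN is an assumption about how payloads should be handled, not something this thread specifies.

```rust
/// Exact comparison of an expected and an actual result.
/// Assumption: any NaN matches any other NaN, regardless of payload or sign.
pub fn same_f32(expected: f32, actual: f32) -> bool {
    if expected.is_nan() || actual.is_nan() {
        return expected.is_nan() && actual.is_nan();
    }
    // Comparing the raw bits distinguishes 0.0 from -0.0, which `==` does not.
    expected.to_bits() == actual.to_bits()
}
```

With this helper, `same_f32(0.0, -0.0)` is false even though `0.0 == -0.0` is true, so a port that returns the wrong sign of zero would still be caught.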

Cross, which we are using to test other architectures via emulation, only supports -gnu versions of non-x86 architectures. If we compute the expected output in the test runner (which is what #[quickcheck] would do) then we would be comparing against gnu libm and that may produce test failures even on correct ports of MUSL implementations (due to different handling of edge cases, etc.).

test-generator produces the expected outputs using the x86_64-musl target regardless of which architecture is being tested. This way we are always testing against the MUSL implementation.

So we can't directly use quickcheck. I think however that we could use quickcheck's StdGen / Arbitrary to improve the generation of floats.

Using a fuzzer or Seer to generate test cases makes sense, but I think it would be way too costly to run on each PR (it would probably break Travis' time limit). We could however run it offline / locally and commit the test cases to the repository.

porglezomp commented 6 years ago

Fun data point: Running the standard Rust f32::sin on all 2^32 bit patterns only takes 27 seconds, so it'd be possible to exhaustively test all of the fn(f32) -> f32 functions on every single input (obviously only on architectures where you can directly run the MUSL implementation).
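
A minimal sketch of what such an exhaustive sweep could look like follows. The function name `first_mismatch` is hypothetical, `reference` stands in for whatever produces the MUSL result, and treating all NaNs as equal is an assumption.

```rust
use std::ops::RangeInclusive;

/// Runs `candidate` and `reference` on every f32 bit pattern in `bits` and
/// returns the bits of the first input on which they disagree, if any.
/// Assumption: two NaN results count as equal regardless of payload.
pub fn first_mismatch(
    bits: RangeInclusive<u32>,
    reference: fn(f32) -> f32,
    candidate: fn(f32) -> f32,
) -> Option<u32> {
    for b in bits {
        let x = f32::from_bits(b);
        let want = reference(x);
        let got = candidate(x);
        let both_nan = want.is_nan() && got.is_nan();
        if !both_nan && want.to_bits() != got.to_bits() {
            return Some(b);
        }
    }
    None
}
```

Passing `0..=u32::MAX` as the range covers every possible input. Taking the range as a parameter also lets the sweep be split into chunks, for example to run it in parallel or to check only a slice of the inputs on CI.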

rust-lang / libm

Suggestion: Use a fuzzer / quickcheck for the auto tests? #82