Add fuzzer discovered tests for polygonToCells

isaacbrodsky commented 1 year ago

Adds test cases discovered via fuzzer for polygonToCells. These test cases hit some previously uncovered error cases.

These tests may be somewhat fragile because they aren't built specifically to trigger the offending behavior, but rather do so empirically. Subtle changes could result in the test cases no longer (or not in the first place) exercising the appropriate behavior.

coveralls commented 1 year ago

Coverage: 98.911% (+0.3%) from 98.657% when pulling 38637b32e9094644e1bfecdcfed0d09b26b13ea2 on isaacbrodsky:polygon-to-cells-coverage into 3d4bdc394130b6a4a75194337994d00224bfb232 on uber:master.

nrabinowitz commented 1 year ago

It feels to me like this is going to be a common case - the fuzzer finds uncovered code, or inputs that crash, and we want an easy way to replicate the fuzzer run in test. I don't think it's a great option to copy a bunch of the fuzzer code with small modifications for each fuzzer we want to replicate this way, but I can see there's no particularly easy way to add assertions to the fuzzer code (beyond "returns 0"). Some ideas:

At a minimum, can we import code like populateGeoLoop from the fuzzer, or move it to a shared utility?
Is it possible to refactor the fuzzer code slightly to move more shared code out, or to allow assertions to be injected? One idea here would be to make an inner function for the fuzzer that took an additional *result struct, which it could populate with either function output or error codes from the functions under test, and the tests could reuse those inner functions and make assertions on result.

I'm not sure what the best option is in C, but if we assume that "run fuzzer in test with a few basic assertions" is a common requirement, then we should design the fuzzers to make this simpler.

isaacbrodsky commented 1 year ago

It feels to me like this is going to be a common case - the fuzzer finds uncovered code, or inputs that crash, and we want an easy way to replicate the fuzzer run in test. I don't think it's a great option to copy a bunch of the fuzzer code with small modifications for each fuzzer we want to replicate this way, but I can see there's no particularly easy way to add assertions to the fuzzer code (beyond "returns 0"). Some ideas:
* At a minimum, can we import code like `populateGeoLoop` from the fuzzer, or move it to a shared utility?

* Is it possible to refactor the fuzzer code slightly to move more shared code out, or to allow assertions to be injected? One idea here would be to make an inner function for the fuzzer that took an additional `*result` struct, which it could populate with either function output or error codes from the functions under test, and the tests could reuse those inner functions and make assertions on `result`.
I'm not sure what the best option is in C, but if we assume that "run fuzzer in test with a few basic assertions" is a common requirement, then we should design the fuzzers to make this simpler.

I see two possibilities:

Run the fuzzer directly using the test case. This is supportable today (although we're missing an option for asserting on the error code of the function.) Coverage will show that the test case was exercised and UBSan/valgrind/etc will also be applied.
Decode the fuzzer discovered scenario to one that we can express directly, rather than just reloading the binary blobs. The intention is to address an issue we see in this PR - one of the tests doesn't actually fail on Windows. Alternately, we could consider the lack of failure itself an issue and hold the test constant while finding out why the Windows build doesn't fail on that input.

In general I'm not enthused about directly copying cases from the fuzzer except when they're very simple, because of the opacity and potential brittleness of the test.

isaacbrodsky commented 10 months ago

This PR is on hold after #785, as the new polygonToCells algorithm is slated to replace the old one.

uber / h3

Add fuzzer discovered tests for polygonToCells #742