uber / h3

Hexagonal hierarchical geospatial indexing system
https://h3geo.org
Apache License 2.0
4.84k stars 459 forks source link

Add fuzzer discovered tests for polygonToCells #742

Open isaacbrodsky opened 1 year ago

isaacbrodsky commented 1 year ago

Adds test cases discovered via fuzzer for polygonToCells. These test cases hit some previously uncovered error cases.

These tests may be somewhat fragile because they aren't built specifically to trigger the offending behavior, but rather do so empirically. Subtle changes could result in the test cases no longer (or not in the first place) exercising the appropriate behavior.

coveralls commented 1 year ago

Coverage Status

Coverage: 98.911% (+0.3%) from 98.657% when pulling 38637b32e9094644e1bfecdcfed0d09b26b13ea2 on isaacbrodsky:polygon-to-cells-coverage into 3d4bdc394130b6a4a75194337994d00224bfb232 on uber:master.

nrabinowitz commented 1 year ago

It feels to me like this is going to be a common case - the fuzzer finds uncovered code, or inputs that crash, and we want an easy way to replicate the fuzzer run in test. I don't think it's a great option to copy a bunch of the fuzzer code with small modifications for each fuzzer we want to replicate this way, but I can see there's no particularly easy way to add assertions to the fuzzer code (beyond "returns 0"). Some ideas:

I'm not sure what the best option is in C, but if we assume that "run fuzzer in test with a few basic assertions" is a common requirement, then we should design the fuzzers to make this simpler.

isaacbrodsky commented 1 year ago

It feels to me like this is going to be a common case - the fuzzer finds uncovered code, or inputs that crash, and we want an easy way to replicate the fuzzer run in test. I don't think it's a great option to copy a bunch of the fuzzer code with small modifications for each fuzzer we want to replicate this way, but I can see there's no particularly easy way to add assertions to the fuzzer code (beyond "returns 0"). Some ideas:

* At a minimum, can we import code like `populateGeoLoop` from the fuzzer, or move it to a shared utility?

* Is it possible to refactor the fuzzer code slightly to move more shared code out, or to allow assertions to be injected? One idea here would be to make an inner function for the fuzzer that took an additional `*result` struct, which it could populate with either function output or error codes from the functions under test, and the tests could reuse those inner functions and make assertions on `result`.

I'm not sure what the best option is in C, but if we assume that "run fuzzer in test with a few basic assertions" is a common requirement, then we should design the fuzzers to make this simpler.

I see two possibilities:

In general I'm not enthused about directly copying cases from the fuzzer except when they're very simple, because of the opacity and potential brittleness of the test.

isaacbrodsky commented 10 months ago

This PR is on hold after #785, as the new polygonToCells algorithm is slated to replace the old one.