better fuzzing - Githubissues

pascalkuthe commented 2 years ago

While working on #66 I fuzzed the hash implementation. I noticed that the fuzzer took extremely long to find any results, because chunk boundaries are quite rare (so huge strings are required). By using cfg(fuzzing) in addition to cfg(test) for enabling a much smaller fuzz size, chunk sizes are much more common and the fuzzer finds problems much quicker. For example with the change the bug from #67 is quickly detected. I have removed the medium.txt case from the fuzzer because it slows the fuzzing to a crawl due to the huge amount of chunks. I think the main point of the file was to test chunk boundaries anyway which now also occur for small.txt and randomly generated text.

cessen commented 2 years ago

I'd like to keep the fuzzing code running as similar to a release build as possible. My rationale is that memory safety issues are often subtle, and although it's unlikely that different tree constants would cause any issues, it's not impossible.

And as I mentioned in #66, the randomized tests for the Hash impl belong in the property tests anyway, which already build with the smaller tree constants.

pascalkuthe commented 2 years ago

I'd like to keep the fuzzing code running as similar to a release build as possible. My rationale is that memory safety issues are often subtle, and although it's unlikely that different tree constants would cause any issues, it's not impossible.

And as I mentioned in #66, the randomized tests for the Hash impl belong in the property tests anyway, which already build with the smaller tree constants.

The reason I submitted this as a PR at all was because it was able to catch #67 which can cause memory issues in release builds too, the fuzzer was just unable to find the case (or did not run long enough). It still took a while even with this fuzzer so a proptest would not have found that bug. Maybe we could put this behind a feature flag so we can fuzz both ways?

cessen commented 2 years ago

I think I need to come back to this PR later. My covid brain + jet lag is leaving me in a weird emotional state, where I just really really want to keep the separation between fuzz testing and property testing very strict. But I don't think I'm evaluating things rationally right now. I'll revisit this when I have my head on straight again.

cessen commented 1 year ago

Okay, so now that my head is on straight again, I agree that this is fine as a fuzz test. However, I do still want the main fuzz testing to use release tree constants, for the reasons I outlined above.

Maybe we could put this behind a feature flag so we can fuzz both ways?

I think something along these lines makes sense. I'm wondering if the fuzz testing framework allows using different feature flags for different fuzz executables? That would be ideal, because then we wouldn't even have to think about it: the fuzz tests that need the smaller tree constants will always use them, and the rest will always use the release constants.

Also: it looks like this PR includes the changes from #67? Can you split that out, so we can handle them separately? Edit: I mean, make this the PR that adds the fuzz testing, and leave #67 to just add the fix.

pascalkuthe commented 1 year ago

Yeah I added the change from #67 originally because I assumed that you required fuzzing to pass for merging a PR (and fuzzing fails almost imminently with that fix with the smaller chunk size). I assumed that fix would get merged first. I have dropped the commit now.

I have now put the smaller chunk size behind a feature flag and restored the old fuzz target. I added my modified versin as a new fuzz target that requires the small_chunks feature flag. To run this new fuzz target you need to use:

cargo +nightly fuzz run mutation_small_chunks --features="small_chunks"

Its a bit ugly but at least cargo will refuse to run the fuzzer if you don't specify this feature flag (while keeping the old fuzz target unchanged).

cessen commented 1 year ago

Once the typos are fixed, this looks ready to merge. Thanks!

pascalkuthe commented 1 year ago

Once the typos are fixed, this looks ready to merge. Thanks!

I have fixed the typo and added the period. Should be ready to now :)

cessen commented 1 year ago

Awesome, thanks!

I think that leaves just the lines iterator PR. That one's going to take a bit, because I'd like to make sure I fully understand the implementation.

pascalkuthe commented 1 year ago

Awesome, thanks!

I think that leaves just the lines iterator PR. That one's going to take a bit, because I'd like to make sure I fully understand the implementation.

I am currently in the process of fixing documentation there and moving the lines iterator back.

I get that the lines iterator is super hard to review. If that might help with the review I would be happy to discuss implementation details over on matrix aswell. It might help to just casually chat about some details of the implementation. Not as a replacement for actual review comments of course just so that I can help with understanding the implementation faster and you have an easier time reviewing it. Ofourse if you rather prefer not to do that that'ss perfectly fine too, I just wanted to offer since I usually find these discussions helpful

cessen commented 1 year ago

Ah, thanks for the offer! I'll let you know if I think that'll be helpful. But I haven't properly dived into the code yet, so I'll wait until at least trying first. :-)

cessen / ropey

better fuzzing #68