Closed pascalkuthe closed 1 year ago
I'd like to keep the fuzzing code running as similar to a release build as possible. My rationale is that memory safety issues are often subtle, and although it's unlikely that different tree constants would cause any issues, it's not impossible.
And as I mentioned in #66, the randomized tests for the Hash
impl belong in the property tests anyway, which already build with the smaller tree constants.
I'd like to keep the fuzzing code running as similar to a release build as possible. My rationale is that memory safety issues are often subtle, and although it's unlikely that different tree constants would cause any issues, it's not impossible.
And as I mentioned in #66, the randomized tests for the
Hash
impl belong in the property tests anyway, which already build with the smaller tree constants.
The reason I submitted this as a PR at all was because it was able to catch #67 which can cause memory issues in release builds too, the fuzzer was just unable to find the case (or did not run long enough). It still took a while even with this fuzzer so a proptest would not have found that bug. Maybe we could put this behind a feature flag so we can fuzz both ways?
I think I need to come back to this PR later. My covid brain + jet lag is leaving me in a weird emotional state, where I just really really want to keep the separation between fuzz testing and property testing very strict. But I don't think I'm evaluating things rationally right now. I'll revisit this when I have my head on straight again.
Okay, so now that my head is on straight again, I agree that this is fine as a fuzz test. However, I do still want the main fuzz testing to use release tree constants, for the reasons I outlined above.
Maybe we could put this behind a feature flag so we can fuzz both ways?
I think something along these lines makes sense. I'm wondering if the fuzz testing framework allows using different feature flags for different fuzz executables? That would be ideal, because then we wouldn't even have to think about it: the fuzz tests that need the smaller tree constants will always use them, and the rest will always use the release constants.
Also: it looks like this PR includes the changes from #67? Can you split that out, so we can handle them separately? Edit: I mean, make this the PR that adds the fuzz testing, and leave #67 to just add the fix.
Yeah I added the change from #67 originally because I assumed that you required fuzzing to pass for merging a PR (and fuzzing fails almost imminently with that fix with the smaller chunk size). I assumed that fix would get merged first. I have dropped the commit now.
I have now put the smaller chunk size behind a feature flag and restored the old fuzz target. I added my modified versin as a new fuzz target that requires the small_chunks
feature flag. To run this new fuzz target you need to use:
cargo +nightly fuzz run mutation_small_chunks --features="small_chunks"
Its a bit ugly but at least cargo will refuse to run the fuzzer if you don't specify this feature flag (while keeping the old fuzz target unchanged).
Once the typos are fixed, this looks ready to merge. Thanks!
Once the typos are fixed, this looks ready to merge. Thanks!
I have fixed the typo and added the period. Should be ready to now :)
Awesome, thanks!
I think that leaves just the lines iterator PR. That one's going to take a bit, because I'd like to make sure I fully understand the implementation.
Awesome, thanks!
I think that leaves just the lines iterator PR. That one's going to take a bit, because I'd like to make sure I fully understand the implementation.
I am currently in the process of fixing documentation there and moving the lines iterator back.
I get that the lines iterator is super hard to review. If that might help with the review I would be happy to discuss implementation details over on matrix aswell. It might help to just casually chat about some details of the implementation. Not as a replacement for actual review comments of course just so that I can help with understanding the implementation faster and you have an easier time reviewing it. Ofourse if you rather prefer not to do that that'ss perfectly fine too, I just wanted to offer since I usually find these discussions helpful
Ah, thanks for the offer! I'll let you know if I think that'll be helpful. But I haven't properly dived into the code yet, so I'll wait until at least trying first. :-)
While working on #66 I fuzzed the hash implementation. I noticed that the fuzzer took extremely long to find any results, because chunk boundaries are quite rare (so huge strings are required). By using
cfg(fuzzing)
in addition tocfg(test)
for enabling a much smaller fuzz size, chunk sizes are much more common and the fuzzer finds problems much quicker. For example with the change the bug from #67 is quickly detected. I have removed themedium.txt
case from the fuzzer because it slows the fuzzing to a crawl due to the huge amount of chunks. I think the main point of the file was to test chunk boundaries anyway which now also occur forsmall.txt
and randomly generated text.