nothings / stb

stb single-file public domain libraries for C/C++
https://twitter.com/nothings
Other
26.31k stars 7.69k forks source link

Adds small valid .pic file to fuzzer seed corpus; varies req_comp when fuzz-testing #1455

Open NBickford-NV opened 1 year ago

NBickford-NV commented 1 year ago

Hi stb maintainers!

The goal of this pull request is to modify the fuzzer so that it can catch bugs like #1452. If I've put this together correctly, the CIFuzz automation should fail after finding the PIC crash in #1452 fairly quickly, then succeed once a fix for #1452 has been merged in.

I think there's two reasons why fuzz-testing didn't find issue #1452:

  1. The nullptr dereference within stbi__convert_format only occurs if req_comp is not 0 or 4. Previously, the fuzzing function always set req_comp to 4.
  2. There were no PIC files in the fuzzer seed corpus. The fuzzer may succeed in guessing the first PIC magic number, buy my guess is the second PIC magic number (PICT) 84 bytes later is hard for the fuzzer to guess.

So, this pull request

  1. Uses the last byte of the input (N bytes long with N > 1) to choose the value of req_comp to test with, and passes the initial N-1 bytes to stbi_load_from_memory as the image to read. (Using the last byte of the input as both the last byte of the image and the byte that determines req_comp might be a better idea; I'm not 100% sure about the best approach here.)
  2. Adds a 1x1-pixel PIC file I made to the stb_image tests, and includes it in the fuzzer seed corpus. (This file contains an all-0 comment field, and 1 8bpp uncompressed packet that writes 0s to the RGB fields of the pixel.)

Testing locally, libFuzzer can now find the crash in #1452 in about a second.

Thanks!

NBickford-NV commented 1 year ago

Looks like this worked — there's the PIC nullptr dereference! https://github.com/nothings/stb/actions/runs/4270488208/jobs/7434315794#step:5:982

================== Job 0 exited with exit code 77 ============
Dictionary: 16 entries
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1337
INFO: Loaded 1 modules   (3670 inline 8-bit counters): 3670 [0x663e80, 0x664cd6), 
INFO: Loaded 1 PC tables (3670 PCs): 3670 [0x60d840,0x61bda0), 
INFO:     8606 files found in /github/workspace/cifuzz-corpus/stbi_read_fuzzer
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
INFO: seed corpus: files: 8606 min: 1b max: 1989777b total: 11532072b rss: 41Mb
#4096   pulse  cov: 1104 ft: 2811 corp: 694/12729b exec/s: 151 rss: 232Mb
AddressSanitizer:DEADLYSIGNAL
=================================================================
==39==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000005a6206 bp 0x7ffe94771a10 sp 0x7ffe94771[970](https://github.com/nothings/stb/actions/runs/4270488208/jobs/7434315794#step:5:971) T0)
==39==The signal is caused by a READ memory access.
==39==Hint: address points to the zero page.
SCARINESS: 10 (null-deref)
    #0 0x5a6206 in stbi__convert_format(unsigned char*, int, int, unsigned int, unsigned int) /src/stb/tests/../stb_image.h:1785:52
    #1 0x58a9ea in stbi__pic_load(stbi__context*, int*, int*, int*, int, stbi__result_info*) /src/stb/tests/../stb_image.h:6535:11
    #2 0x57e1cd in stbi__load_main(stbi__context*, int*, int*, int*, int, stbi__result_info*, int) /src/stb/tests/../stb_image.h:1159:35
    #3 0x56cbef in stbi__load_and_postprocess_8bit(stbi__context*, int*, int*, int*, int) /src/stb/tests/../stb_image.h:1261:19
    #4 0x57cab7 in stbi_load_from_memory /src/stb/tests/../stb_image.h:1431:11
    #5 0x57cab7 in LLVMFuzzerTestOneInput /src/stb/tests/stbi_read_fuzzer.c:29:26

I was a bit worried about this because I saw a timeout while another run was loading the corpus - looks like there are some JPEG files that specify fewer than 20 million texels (from the limit in stbi_read_fuzzer.c), but take more than 25 seconds to load (likely 60-70 seconds, based on some slow units locally; maybe the 80000000 constant could be reduced to around 10000000?): https://github.com/nothings/stb/actions/runs/4270185040/jobs/7433830850#step:5:148

One interesting thing is that I think it found the nullptr dereference while analyzing the existing ossfuzz corpus. Maybe ossfuzz found a way in the past to synthesize a file with the two PIC magic numbers, and my hypothesis in point 2 about the second magic number being hard for ossfuzz's fuzzers to guess was wrong? (In other words, maybe the only change needed was allowing the requested channels to vary in stbi_read_fuzzer.c?)