DanBloomberg / leptonica

Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The official github repository for Leptonica is: danbloomberg/leptonica. See leptonica.org for more documentation.
Other
1.79k stars 391 forks source link

Out-of-bounds read in numaFindLocForThreshold #627

Closed Teemperor closed 1 year ago

Teemperor commented 2 years ago

Explanation of what I think goes wrong (see //NOTE: comments)

    n = numaGetCount(na);
    fa = numaGetFArray(na, L_NOCOPY);
    pval = fa[0];
    // NOTE: 1. loop terminates with i == length(fa)
    for (i = 1; i < n; i++) {
        val = fa[i];
        index = L_MIN(i + skip, n - 1);
        jval = fa[index];
        if (val < pval && jval < pval)  /* near the top if not there */
            break;
        pval = val;
    }

        /* Look for the low point in the valley */
    start = i; // NOTE: start = i = length(fa)
    pval = fa[start]; // NOTE: out of bounds read (off by one)
    for (i = start + 1; i < n; i++) {
        val = fa[i];
        if (val <= pval) {  /* going down */
            pval = val;
        } 

To reproduce:

  1. Get oss-fuzz
  2. Patch oss-fuzz repo with this diff. This is necessary as Clang optimizes out the read (GCC however doesn't)
    diff --git a/projects/leptonica/build.sh b/projects/leptonica/build.sh
    index ee51bdc..0bee357 100755
    --- a/projects/leptonica/build.sh
    +++ b/projects/leptonica/build.sh
    @@ -1,4 +1,6 @@
    #!/bin/bash -eu
    +export CFLAGS="$CFLAGS -O0 -fno-builtin -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION"
    +export CXXFLAGS="$CXXFLAGS -O0 -fno-builtin -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION"
    # Copyright 2018 Google Inc.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
  3. Build the binarize_fuzzer.
  4. Run ./binarize_fuzzer poc_input over the attached fuzzer input from the zip.

Sanitizer output below:

oss-fuzz/build/out/leptonica/binarize_fuzzer results/crash-802920ff79f29e3bb16e725fe17039ec4e558a96
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 485387694
INFO: Loaded 1 modules   (106335 inline 8-bit counters): 106335 [0x1a27f20, 0x1a41e7f), 
INFO: Loaded 1 PC tables (106335 PCs): 106335 [0x1a41e80,0x1be1470), 
oss-fuzz/build/out/leptonica/binarize_fuzzer: Running 1 inputs 1 time(s) each.
Running: results/[leptonica_oob_poc.zip](https://github.com/DanBloomberg/leptonica/files/8903734/leptonica_oob_poc.zip)
=================================================================
==3574404==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x619000000e80 at pc 0x0000006438e7 bp 0x7ffcb92ce430 sp 0x7ffcb92ce428
READ of size 4 at 0x619000000e80 thread T0
    #0 0x6438e6 in numaFindLocForThreshold /src/leptonica/src/numafunc2.c:2623:12
    #1 0x568ed8 in pixThresholdByHisto /src/leptonica/src/binarize.c:1183:5
    #2 0x55d953 in LLVMFuzzerTestOneInput /src/leptonica/prog/fuzzing/binarize_fuzzer.cc:48:2
    #3 0x455423 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #4 0x4410b2 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #5 0x4468fc in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #6 0x46f4f2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #7 0x7f069a62928f  (/usr/lib/libc.so.6+0x2928f)
    #8 0x7f069a629349 in __libc_start_main (/usr/lib/libc.so.6+0x29349)
    #9 0x41f8dd in _start (/home/fuzzer/oss-fuzz/build/out/leptonica/binarize_fuzzer+0x41f8dd)

0x619000000e80 is located 0 bytes to the right of 1024-byte region [0x619000000a80,0x619000000e80)
allocated by thread T0 here:
    #0 0x523e92 in __interceptor_calloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:138:3
    #1 0x5f672a in numaCreate /src/leptonica/src/numabasic.c:201:35
    #2 0x62de88 in numaTransform /src/leptonica/src/numafunc2.c:418:16
    #3 0x568e44 in pixThresholdByHisto /src/leptonica/src/binarize.c:1179:11
    #4 0x55d953 in LLVMFuzzerTestOneInput /src/leptonica/prog/fuzzing/binarize_fuzzer.cc:48:2
    #5 0x455423 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #6 0x4410b2 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #7 0x4468fc in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #8 0x46f4f2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #9 0x7f069a62928f  (/usr/lib/libc.so.6+0x2928f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /src/leptonica/src/numafunc2.c:2623:12 in numaFindLocForThreshold
Shadow bytes around the buggy address:
  0x0c327fff8180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff81a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff81b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff81c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c327fff81d0:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff81e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff81f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3574404==ABORTING
DanBloomberg commented 2 years ago

Thank you for finding the problem and reporting it in such detail.

Looking at the function, and its use (only in pixThresholdByHisto), there are other problems besides indexing beyond the array. pixThresholdByHisto() is supposed to return 0 for the threshold if it can't find a good one. But numaFindLocForThreshold() is not designed to return 0 on failure to find a minimum value. And there are other places in numaFindLocForThreshold() where array indexing can go bad.

I will fix it within two days.

DanBloomberg commented 2 years ago

(well, 2 weeks, anyway :-) Fixed.

amitdo commented 1 year ago

Dan, you forgot to close this issue.

DanBloomberg commented 1 year ago

you beat me to it by 30 seconds :-)

stweil commented 1 year ago

I'm so sorry. :-)