zserge / jsmn

Jsmn is a world fastest JSON parser/tokenizer. This is the official repo replacing the old one at Bitbucket
MIT License
3.72k stars 783 forks source link

heap buffer overflow on some inputs #125

Closed hongxuchen closed 5 years ago

hongxuchen commented 6 years ago

We found a heap buffer overflow bug on jsondump with our fuzzing tool FOT's semantic related functionality, with the following commands (memory leak of jsondump we think is not a big issue):

export ASAN_OPTIONS=detect_leaks=0
CFLAGS="-O3 -g -fsanitize=address" LDFLAGS="-fsanitize=address" make jsondump

One simple test case is as follows, which is mutated based on library.json.

{
  "name": "jsmn",
  "keywords": "json",
  "description": "Minimalistic JSON parser/tokenizer in C. It : "jsmn",
can be easily integrated into resource-limited or embedded projects",
  "repository":
  {
    "type": "git",
    "url": "https://github.com/zserge/jsmn.git"
  },
  "frameworks": "*",
  "platforms": "*",
  "examples": [
    "example/*.c"
  ],
  "exclude": "test"
}

The error output is:

=================================================================
==28237==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x615000000280 at pc 0x00000050cad3 bp 0x7ffd1eb95810 sp 0x7ffd1eb95808
READ of size 4 at 0x615000000280 thread T0
    #0 0x50cad2 in dump /home/hongxu/tests/jsmn/example/jsondump.c:33:9
    #1 0x50c9d5 in dump /home/hongxu/tests/jsmn/example/jsondump.c:44:9
    #2 0x50bed3 in main /home/hongxu/tests/jsmn/example/jsondump.c:120:4
    #3 0x7faee422e1c0 in __libc_start_main /build/glibc-itYbWN/glibc-2.26/csu/../csu/libc-start.c:308
    #4 0x41c5d9 in _start (/home/hongxu/tests/jsmn/jsondump+0x41c5d9)

0x615000000280 is located 0 bytes to the right of 512-byte region [0x615000000080,0x615000000280)
allocated by thread T0 here:
    #0 0x4d2045 in realloc (/home/hongxu/tests/jsmn/jsondump+0x4d2045)
    #1 0x50bdbc in realloc_it /home/hongxu/tests/jsmn/example/jsondump.c:15:12
    #2 0x50bdbc in main /home/hongxu/tests/jsmn/example/jsondump.c:113
    #3 0x7faee422e1c0 in __libc_start_main /build/glibc-itYbWN/glibc-2.26/csu/../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/hongxu/tests/jsmn/example/jsondump.c:33:9 in dump
Shadow bytes around the buggy address:
  0x0c2a7fff8000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff8010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2a7fff8020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2a7fff8030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2a7fff8040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c2a7fff8050:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a7fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==28237==ABORTING

There are other input tests that can trigger this bug with similar root cause (*.input.txt is the input file, and *.err.txt is the error output). w01_000011,sig:6,Splice:3:16,src:w02_000052.err.txt w01_000011,sig:6,Splice:3:16,src:w02_000052.input.txt w02_000003,sig:6,Havoc:21:35,src:w01_000052.err.txt w02_000003,sig:6,Havoc:21:35,src:w01_000052.input.txt

pt300 commented 6 years ago

From my session of code reading and debugging I came to the conclusion that mistakes lie on both sides but example program is the one at fault. Out of bounds reads are caused by dump function assuming that each key has a value token next to it. That causes loop on line 43 to read more than it's supposed to as t->size only states how many keys object holds, yet loop assumes it states how many key/value pairs object has. Simple fix is to add condition like if(last->size != 0 && (last->type == JSMN_STRING || last->type == JSMN_PRIMITIVE)) before printing the value. In my case that made valgrind happy again. For the library itself I think '\"' and '\'' should be added as possible terminators of primitive in non strict mode.

pt300 commented 6 years ago

Also, that's out of bounds read and not buffer overflow.

hongxuchen commented 6 years ago

@pt300 Basically I just reported the error according to the sanitizer and hadn't had time to dig into root cause. I think your analysis leads to the bug fix.

Ulmo commented 6 years ago

I experienced a bug which I just reported at #137 which I think might be an example of this out of bounds read.

pt300 commented 6 years ago

If you read the comments you'd realise that's not the case

Ulmo commented 6 years ago

Are you sure?

pt300 commented 6 years ago

Yes, I am sure. This problem is related to the example program assuming that each key in object has a value, which isn't the case in the buggy json OP provided. In your case it's something different and I'm not sure what exactly is it yet.

BenBE commented 6 years ago

I did some analysis with AFL (well, I just compiled for AFL with ASAN enabled and waited for things to crash) and didn't have to wait for very long (actually it crashed within seconds).

From the example inputs to jsondump (which I used for my tests) I gathered the following examples for crashes (one crashing input per line):

{0}
{{}}
{0 0}0
{}:0
{""0}0
pt300 commented 6 years ago

@BenBE if these crashes are unrelated to this issue then just create new one with more informations, otherwise what you wrote is not helping here.

BenBE commented 6 years ago

@pt300 Looking at the output of those sample inputs it seems very likely they are related. Comparing with the ASAN output @HongxuChen posted there's little doubt they aren't related. Just posted a list of minimal samples to ease reproduction of the issue. This also gives a list of other, related inputs to check when fixing the issue.

Other bugs might be hidden due to this parsing issue.

pt300 commented 6 years ago

I've checked your inputs and yes they are caused by the same thing. And so fix proposed by me still prevents these crashes. I'll try to make a pull request I guess.

pt300 commented 5 years ago

Fix was added with conversion of jsmn to .h only library