lloyd / yajl

A fast streaming JSON parsing library in C.
http://lloyd.github.com/yajl
ISC License
2.15k stars 435 forks source link

Fix CVE-2022-24795 #240

Open ppisar opened 2 years ago

ppisar commented 2 years ago

There was an integer overflow in yajl_buf_ensure_available() leading to allocating less memory than requested. Then data were written past the allocated heap buffer in yajl_buf_append(), the only caller of yajl_buf_ensure_available(). Another result of the overflow was an infinite loop without a return from yajl_buf_ensure_available().

yajl-ruby project, which bundles yajl, fixed it https://github.com/brianmario/yajl-ruby/pull/211 by checking for the integer overflow, fortifying buffer allocations, and report the failures to a caller. But then the caller yajl_buf_append() skips a memory write if yajl_buf_ensure_available() failed leading to a data corruption.

A yajl fork mainter recommended calling memory allocation callbacks with the large memory request and let them to handle it. But that has the problem that it's not possible pass the overely large size to the callbacks.

This patch catches the integer overflow and terminates the process with abort().

https://github.com/lloyd/yajl/issues/239 https://github.com/brianmario/yajl-ruby/security/advisories/GHSA-jj47-x69x-mxrm

robohack commented 2 years ago

Note that there is no such thing as an "overly large size" here -- the left shift guarantees a maximum allowable buffer allocation size of 2^((sizeof(int)*CHAR_BIT)-1) bytes, and otherwise it "overflows" to zero and stays at zero.

So, either way the newly calculated "need" size is passed to the yaf->realloc function, and either it is a valid non-zero number, or it is zero; and the latter condition is sufficient for the realloc function to detect the overflow.

Calling abort() for a failure which can already be detected and handled by the realloc function introduces a hard failure that then cannot be avoided and handled differently by the realloc function, and that alone would cause YAJL to be unusable in several embedded systems projects I've used it on in the past. Use of abort() in general purpose libraries should be forbidden except via <assert.h.>.

As for the infinite loop, yes, that's definitely a bug too (and would actually prevent any heap corruption!). It is easily fixed by adding the following term to the condition for the while loop doing the shift: need > 0 &&

(Without the new term the loop probably exhibits Undefined Behaviour, though UBSAN hasn't caught it in past testing I've done.)

ppisar commented 2 years ago

V Thu, Apr 07, 2022 at 11:07:19AM -0700, Greg A. Woods napsal(a):

Note that there is no such thing as an "overly large size" here -- the left shift guarantees a maximum allowable buffer allocation size of 2^((sizeof(int)*CHAR_BIT)-1) bytes, and otherwise it "overflows" to zero and stays at zero.

So, either way the newly calculated "need" size is passed to the yaf->realloc function, and either it is a valid non-zero number, or it is zero; Yes.

and the latter condition is sufficient for the realloc function to detect the overflow.

No.

0 is a valid input for realloc(). POSIX claims:

If  the  size  of the space requested is zero, the behavior shall be
implementation-defined

Hence you cannot expect a realloc callback to recognize 0 as a mistake of yajl.

Calling abort() for a failure which can already be detected and handled by the realloc function introduces a hard failure that then cannot be avoided and handled differently by the realloc function, and that alone would cause YAJL to be unusable in several embedded systems projects I've used it on in the past. Use of abort() in general purpose libraries should be forbidden except via <assert.h.>.

I fully agree. But in case of yajl's memory callbacks, there is is no way of reporting a failure from the callback to yajl.

I only inspected libbson which used to bundle yajl and it does abort().

-- Petr

robohack commented 2 years ago

It's not realloc(3) that needs to detect the problem, but rather yaf->(realloc*)(), as called by YA_REALLOC(). Zero is never a valid input for YA_REALLOC(). The initial buffer size starts at YAJL_BUF_INIT_SIZE.

robohack commented 2 years ago

BTW, the allocator wrapper functions can report an error to the calling application and escape with setjmp(), and if they also implement a pool-style allocator cleanup can even be done by freeing the whole pool. This is one possible purpose of the extra context pointer.

JanZerebecki commented 2 years ago

It's not realloc(3) that needs to detect the problem, but rather yaf->(realloc*)(), as called by YA_REALLOC().

That would be an API incompatible change, as it currently does not need to. So for avoiding the problem with current code abort() seems to be preferable.

( For a new version that breaks compatibility by using a new yaf with a different name, your suggestion might work, but I think instead making it explicit to the caller and passing the error via the return path instead of a callback and setjmp might reduce the chanche for errors when using the library. )

robohack commented 2 years ago

I think given the fact the default in YAJL is not to do any error reporting on memory allocation failures (other than via optional calls to assert()) it's not really an API change to allow the user to add actual error handling by providing a more sophisticated YA_REALLOC(). The API has always allowed for such enhancements (and I have no doubt this feature has been made use of).

In any case, calling abort() from the library is equally an intrusive and arguably unnecessary behaviour change. At the very least you need to use assert() instead of a raw abort so as to allow the library to be compiled with or without such checks, as the user desires.

YAJL's API could be re-designed to make more advanced memory allocation error handling easier, but that's way beyond the scope of fixing this little infinite loop problem.