tukaani-project / xz

XZ Utils
https://tukaani.org/xz/
Other
532 stars 95 forks source link

[Bug]: Undefined behavior: NULL + 0 #36

Closed QrczakMK closed 1 year ago

QrczakMK commented 1 year ago

Describe the bug

When given empty buffers with next_in == NULL and avail_in == 0 (or same for out), liblzma sometimes computes NULL + 0, which has undefined behavior in C (it is defined in C++).

This gets detected by ubsan.

The following places are affected (possibly more):

https://github.com/tukaani-project/xz/blob/913ddc5572b9455fa0cf299be2e35c708840e922/src/liblzma/common/common.c#L291-L297 The first 3 lines can be guarded with if (in_pos != 0), the last 3 lines can be guarded with if (out_pos != 0).

https://github.com/tukaani-project/xz/blob/75f1a6c26df4ce329da0882786403e3ccf5cd898/src/liblzma/common/block_encoder.c#L81 Write in_start != 0 ? in + in_start : in.

Version

5.4.1

Operating System

Linux

Relevant log output

No response

Larhzu commented 1 year ago

I have read about this a bit now. Sounds like it probably needs to be fixed. Quite a few functions have to be reviewed to spot all such cases as there definitely are more than those you already found. XZ Embedded needs to be reviewed too.

At least with a trivial test program, the method in in_start != 0 ? in + in_start : in; seems to be optimized to the same code as in + in_start, at least with modern GCC and Clang. So there won't be an extra branch in reality.

To me this seems like a bug in the standard that could have been fixed by adding an extra sentence to explicitly allow null-pointer + 0. Based on search engine results, it seems that it was decided that it's better to change hundreds of codebases instead, hopefully spotting every problematic case. Feels a bit similar to the memcpy(NULL, NULL, 0) issue that was (hopefully) fixed in XZ Utils in 2019.

Thanks for reporting this!

Larhzu commented 1 year ago

It should be fixed now. Thanks!

XZ Embedded has the same problem. The initial plan is to fix it by changing the API documentation to say that the input and output buffer pointers must not be NULL even for empty buffers. First the code in the Linux kernel has to be checked if NULLs are used in xzdec* calls.