google / sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer
Other
11.49k stars 1.03k forks source link

MSAN memcmp false positive ? #1389

Open catenacyber opened 3 years ago

catenacyber commented 3 years ago

From https://github.com/google/oss-fuzz/issues/4538 cc @morehouse

Inspired by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=26226

Reproducer program is

#include <string.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int a[5];
    int b[5];

    a[0] = 0;
    b[0] = 1;
    if (memcmp(a, b, sizeof(int)+1)) {
        printf("diff\n");
    }
    return 0;
}

Compiled with clang++ -fsanitize=memory -fsanitize=fuzzer-no-link m.c

Run issues

Uninitialized bytes in MemcmpInterceptorCommon at offset 4 inside [0x7ffe57d25350, 5)
==72412==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x44cbce in memcmp /src/llvm-project/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:873:10
    #1 0x49a3f7 in main (/src/a.out+0x49a3f7)
    #2 0x7f24b40ac83f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2083f)
    #3 0x41c368 in _start (/src/a.out+0x41c368)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /src/llvm-project/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:873:10 in memcmp
Exiting

I understand that MSAN issues this warning because memcmp will try to read more than the uninitialized bytes at once. But this cannot change the behavior of the program. So, is this a false positive ?

eugenis commented 3 years ago

@zygoloid points out that

http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_431 Here's text from the C committee, on a somewhat-related issue: "The original intention of the atomics design was to allow memcmp and memcpy (with suitable synchronization) be a common implementation for all _Atomic objects. This practice, however, would lead to undefined behavior for implementations that have padding bits for integer representations." So the C committee thinks that memcmp involving padding bits results in undefined behavior.

On the other hand, C11 in 7.24.1/3 states that

For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

I'm inclined to say that your program is valid (or at least not clearly invalid) and memcpy in MSan needs to be relaxed.

yugr commented 3 years ago

As a temp workaround I think you can use export MSAN_OPTIONS=strict_memcmp=false.

catenacyber commented 3 years ago

Thanks, I will try this

uecker commented 2 years ago

The C standard specified that all bytes are compared in memcmp. It also still notes that atomic_exchange_compare corresponds to a memcmp/memcpy pair. Padding bytes in structs take unspecified values on writes. If you want to know what the C standard says, please look at the C standard and not at historical documents. If the DR did not lead to a change to the C standard, then this is an indication that the committee did not think a change was necessary (or there was no consensus). The current normative wording is what counts.