RoaringBitmap / CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
http://roaringbitmap.org/
Other
1.56k stars 267 forks source link

add fuzz testing #122

Open lemire opened 6 years ago

lemire commented 6 years ago

http://lcamtuf.coredump.cx/afl/

K2 commented 6 years ago

I took a quick stab at this and started up 48 instances in a parallel afl fuzz.

I'm using the afl-fuzz mode only right now, the perf can be improved a bit, lots of options with M/ASAN and HARDEN are possible. Some of the newer features work best with llvm-svn and I've updated a few things there, some errors in my libcxx I can look again later.

I chose the portability test case since there are 2 pre-existing test bin's and I would think the serialization path here is one of our higher priority items, made use of the bitmapwithruns.bin/bitmapwithoutruns.bin files even though they are a bit large (would be better for much smaller versions) I didn't want to originate too much churn since we have tests for these already.

I took out the cmocka as it seemed to be conflicting due to it's signal handlers and added a few tweaks to accept a command line argument, as AFL works typically with file I/O.

There's a few output's I'll triage them and see which ones are real, I might need to tweak some of the memory ulimit's since that can cause false-positives here.

/*
 * afl_format_portability_unit.c
 *
 */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <roaring/roaring.h>

#include "config.h"

char filename[1024];

long filesize(FILE* fp) { 
    fseek(fp, 0L, SEEK_END);
    return ftell(fp);
}

char* readfile(FILE* fp) {
    long bytes = filesize(fp);
    char* buf = malloc(bytes);
    if(buf == NULL) return NULL;

    rewind(fp);

    int cnt = fread(buf, 1, bytes, fp);
    if(bytes != cnt){
        free(buf);
        return NULL;
    }
    return buf;
}

int compare(char* x, char* y, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        if (x[i] != y[i]) {
            return i + 1;
        }
    }
    return 0;
}

int test_deserialize(FILE* fp) {
    char* input_buffer = readfile(fp);

    if(input_buffer == NULL) return -1; 

    roaring_bitmap_t* bitmap =
        roaring_bitmap_portable_deserialize(input_buffer);

    if(bitmap == NULL) {
        free(input_buffer);
        return -2;
    }

    size_t expected_size = roaring_bitmap_portable_size_in_bytes(bitmap);

    char* output_buffer = malloc(expected_size);
    size_t actual_size =
        roaring_bitmap_portable_serialize(bitmap, output_buffer);

    // maybe we should crash here to catch these
    if(actual_size != expected_size) {
        free(input_buffer);
        free(output_buffer);
        return -3;
    }

    int compare_result = compare(input_buffer, output_buffer, actual_size);

    free(output_buffer);
    free(input_buffer);
    fclose(fp);
    roaring_bitmap_free(bitmap);
    return compare_result;
}

int main(int argc, char **argv) {
    if(argc < 2)
        return 1;

    snprintf(filename, sizeof(filename)-1, "%s", argv[argc-1]);

    FILE* fp = fopen(filename, "rb");

    if(fp == NULL) 
        return 2;

    return test_deserialize(fp);
}

I've got a couple of these boxes I can suspend some of my XMR mining on ;) for a bit, the execution speed would improve with smaller test cases, however it's still around 1500 or so that's over 4Million tests per minute.

image

lemire commented 6 years ago

+1

K2 commented 6 years ago

Here's a set of crashes that still work on the latest commit, their all minimized/reduced from a larger set.

I've got the 64 cores on this all week so we will see if there are any additional ones. I've also added CodeCoverage.cmake from the cmake-modules git and will rebuild the above test case with that so we can keep it 100%.

If anybody wan'ts to work out a PR for it go ahead I'm out for a bit but will be monitoring fuzzing remote and may have some time towards Easter.

crashes.zip

lemire commented 6 years ago

Ok. So these are binary inputs that make the library crash? Ok. I'll see about adding them to our tests.

lemire commented 6 years ago

Done: https://github.com/RoaringBitmap/CRoaring/blob/master/tests/robust_deserialization_unit.c

Please use roaring_bitmap_portable_deserialize_safe because roaring_bitmap_portable_deserialize assumes you have a valid input.

Your 6 cases are now going to be part of our unit tests from now on.

lemire commented 6 years ago

I'll issue a new release shortly.

lemire commented 6 years ago

Done. Please test against v0.2.38.

K2 commented 6 years ago

Re-running now, all the inputs and using robust/safe call.

I did get one SAN trigger from clang during setup.

null.zip

Binary content read.
CRoaring/src/containers/run.c:682:12: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:43:28: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior CRoaring/src/containers/run.c:682:12 in
Running out of bytes while reading a run container.
Null bitmap loaded.
lemire commented 6 years ago

I cannot verify any problem. I added crashproneinput7.bin to our test banks, this is your file: https://github.com/RoaringBitmap/CRoaring/tree/master/tests/testdata

It gets tested: https://github.com/RoaringBitmap/CRoaring/blob/master/tests/robust_deserialization_unit.c#L160

Everything passes.

K2 commented 6 years ago

Yeah, it was one of clang's analyzer's there's a chance it was a false.

I ran everything for the past day nothing so far, I'll view the coverage data to see how well these tests had progressed.

K2 commented 6 years ago

I noticed a few AV from some of the recent merges, I'll make some more passes with the new tests, hopefully we already coverage :) ... I was thinking @lemire what do you think about sending in a request to https://github.com/google/oss-fuzz/ that would give us CI on the fuzz side of the house?

lemire commented 6 years ago

what do you think about sending in a request to https://github.com/google/oss-fuzz/ that would give us CI on the fuzz side of the house?

I am always favorable to more testing. What does it entail?

lemire commented 3 years ago

@AE1020 We might want to close this issue off since it seems you have contributed some fuzz testing.

K2 commented 2 years ago

@lemire I've gone through and setup an integration into oss-fuzz, I'll submit the PR to them if you think this is a good testcase.

#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include "roaring/roaring.h"

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size){
    roaring_statistics_t stats;
    bool answer = true;
    roaring_bitmap_t* bitmap = roaring_bitmap_portable_deserialize_safe(data, size);
    if(bitmap) {
        uint64_t card1 = roaring_bitmap_get_cardinality(bitmap);
        roaring_bitmap_statistics(bitmap, &stats);
        unsigned universe_size = stats.max_value + 1;
        roaring_bitmap_t *inverted = roaring_bitmap_flip(bitmap, 0U, universe_size);
        if(inverted) {
            roaring_bitmap_t *double_inverted = roaring_bitmap_flip(inverted, 0U, universe_size);
            if(double_inverted)
            {
                answer = (roaring_bitmap_get_cardinality(inverted) + roaring_bitmap_get_cardinality(bitmap) == universe_size);
                if (answer) answer = roaring_bitmap_equals(bitmap, double_inverted);
                if (!answer) {
                    printf("Bad flip\n\nbitmap1:\n");
                    roaring_bitmap_printf_describe(bitmap);  // debug
                    printf("\n\nflipped:\n");
                    roaring_bitmap_printf_describe(inverted);  // debug
                }
                roaring_bitmap_free(double_inverted);
            }
            roaring_bitmap_free(inverted);
        }
        roaring_bitmap_free(bitmap);
    }
    return 0;
}

Some initial run's gave me some crashes if you want to look;


==106639==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62100001dcf8 at pc 0x0000005d1bad bp 0x7ffc944c3ff0 sp 0x7ffc944c3fe8
WRITE of size 2 at 0x62100001dcf8 thread T0
    #0 0x5d1bac in array_container_negation_range /src/croaring/src/containers/mixed_negation.c:148:35
    #1 0x587d3e in container_not_range /src/croaring/include/roaring/containers/containers.h:2109:17
    #2 0x587d3e in insert_flipped_container /src/croaring/src/roaring.c:1949:13
    #3 0x58750b in roaring_bitmap_flip /src/croaring/src/roaring.c
    #4 0x55bc6b in LLVMFuzzerTestOneInput /src/croaring_fuzzer.c:15:38
    #5 0x456b13 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) cxa_noexception.cpp
    #6 0x4562fa in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) cxa_noexception.cpp
    #7 0x457b9b in fuzzer::Fuzzer::MutateAndTestOne() cxa_noexception.cpp
    #8 0x458685 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile> >&) cxa_noexception.cpp
    #9 0x447dbd in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) cxa_noexception.cpp
    #10 0x470dc2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #11 0x7f1e0db70fcf in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #12 0x7f1e0db7107c in __libc_start_main csu/../csu/libc-start.c:409:3
    #13 0x41f69d in _start (/home/files/git/oss-fuzz/build/out/croaring/croaring_fuzzer+0x41f69d)

We can integrate the test code into our repository as well.

lemire commented 2 years ago

@K2 We definitely want it but before implementing fuzz testing, we’d want to clear the bugs found so we get a clean check at least initially.

Can you give me the reproducible test case matching the error you are reporting. We should start by fixing it.

So if you can just open issues corresponding to the bugs you have exposed…

K2 commented 2 years ago

I did the PR for oss-fuzz on an initial test case, I do think we should be working on the bugs from these various API as we see from issue #336 and others may be found as we tighten up issues found by OSS-FUZZ also.

K2 commented 2 years ago

@lemire I mentioned you in https://github.com/google/oss-fuzz/pull/7144 for approval to integrate into oss-fuzz with the minimal fuzzer we went over a few weeks ago.

lemire commented 2 years ago

Apologies. I did comment now.