IlyaGrebnov / libsais

libsais is a library for linear-time suffix array, longest common prefix array and Burrows-Wheeler transform construction based on the induced sorting algorithm.
Apache License 2.0

Overlapping parameters are passed to memcpy #3

Closed. akiutoslahti closed this issue 3 years ago

akiutoslahti commented 3 years ago

I was playing around with libsais and the Silesia corpus when I found that overlapping source and destination ranges can be passed to memcpy. This happens in libsais_reconstruct_compacted_lms_suffixes_32s_2k_omp, at libsais/src/libsais.c:3912.

Whether this causes problems seems to depend on the environment. In an environment where calls to memcpy are redirected to memmove, it does not seem to cause problems and is only detected with AddressSanitizer. However, in an environment where memcpy is actually used, it causes a corrupted SA and/or a segmentation fault, and is also detected by Valgrind.
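For readers unfamiliar with the distinction: the C standard leaves memcpy undefined when the source and destination ranges overlap, while memmove is specified to behave as if the data were first copied to a temporary buffer. The sketch below is only an illustration of that difference and is not taken from libsais. The helper names regions_overlap and copy_within are hypothetical, and the buffer layout is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of libsais): returns 1 if the byte ranges
   [a, a + n) and [b, b + n) share at least one byte, 0 otherwise. */
static int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return n != 0 && pa < pb + n && pb < pa + n;
}

/* Hypothetical helper (not part of libsais): copies `count` elements from
   buf[src..] to buf[dst..] inside one buffer of `len` elements.
   memmove is used because the two ranges may overlap; calling memcpy
   here would be undefined behavior whenever they do. */
static void copy_within(int32_t *buf, size_t len,
                        size_t dst, size_t src, size_t count)
{
    /* Both ranges must lie entirely inside the buffer. */
    assert(dst <= len && count <= len - dst);
    assert(src <= len && count <= len - src);
    memmove(buf + dst, buf + src, count * sizeof buf[0]);
}
```

A debug build could, for instance, call regions_overlap before any remaining memcpy and assert that it returns 0, which would surface this class of bug without relying on a sanitizer.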

I was able to isolate the part of the payload that causes this error and wrote a minimal reproducer for it. The payload is a 32 KiB block extracted from the concatenation of the Silesia corpus files in alphabetical order.

How to use the reproducer:
Compile with: gcc -o reproducer -g -fsanitize=address reproducer.c libsais.c
Run with: ./reproducer payload

If there is any additional information I could provide, please feel free to ask.

IlyaGrebnov commented 3 years ago

Hi @akiutoslahti, thank you for reporting this issue! I committed a fix for it by switching from memcpy to memmove, as you recommended. I did some extensive fuzzing with clang libFuzzer and, surprisingly, was never able to hit this issue :(

akiutoslahti commented 3 years ago

@IlyaGrebnov No problem, happy to be of help! :)