IlyaGrebnov / libsais

libsais is a library for linear-time construction of the suffix array, longest common prefix array, and Burrows-Wheeler transform, based on the induced sorting algorithm.
Apache License 2.0

libsais64 crash with 3.9 GB file #27

Open sb98052 opened 2 weeks ago

sb98052 commented 2 weeks ago

I am getting a crash when using libsais64 on an input file of size 3.9GB. The code works for small files. For debugging, I am running it with extra space equal to the space allocated for the suffix array. This is the stack trace.

==1692879==ERROR: AddressSanitizer: SEGV on unknown address 0x7f39c8447178 (pc 0x00000029855f bp 0x00000595e6ee sp 0x7ffde4489540 T0)
==1692879==The signal is caused by a READ memory access.
SCARINESS: 20 (wild-addr-read)
    #0 0x29855f in libsais_partial_sorting_scan_left_to_right_32s_6k lln/third-party/libsais/src/libsais.c:2340
    #1 0x29855f in libsais_partial_sorting_scan_left_to_right_32s_6k_omp ///lln/third-party/libsais/src/libsais.c:2769
    #2 0x29855f in libsais_induce_partial_order_32s_6k_omp lln/third-party/libsais/src/libsais.c:3824
    #3 0x29855f in libsais_main_32s_recursion lln/third-party/libsais/src/libsais.c:6293
    #4 0x29a41f in libsais_main_32s_recursion lln/third-party/libsais/src/libsais.c:6354
    #5 0x29e074 in libsais_main_32s_entry lln/third-party/libsais/src/libsais.c:6477
    #6 0x29e074 in libsais_main_int lln/third-party/libsais/src/libsais.c:6541
    #7 0x29e074 in libsais_int lln/third-party/libsais/src/libsais.c:6646
    #8 0x2ab2ed in libsais64_main_32s_recursion lln/third-party/libsais/src/libsais64.c:6318
    #9 0x2a9a86 in libsais64_main_32s_entry lln/third-party/libsais/src/libsais64.c:6533
    #10 0x2a9a86 in libsais64_main_8u lln/third-party/libsais/src/libsais64.c:6558
    #11 0x2a9a86 in libsais64_main lln/third-party/libsais/src/libsais64.c:6583
    #12 0x2a8734 in libsais64 lln/third-party/libsais/src/libsais64.c:6684
    #13 0x291136 in create_lite_context_for_data64 lln/lln.c:521
    #14 0x2902a5 in main lln/lln-bin.c:175
    #15 0x7f4a4682c656 in __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #16 0x7f4a4682c717 in __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:409:3
    #17 0x28fd20 in _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:116

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lln/third-party/libsais/src/libsais.c:2340 in libsais_partial_sorting_scan_left_to_right_32s_6k

I would be happy to provide more inputs or try changes.
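
For anyone trying to reproduce the setup described above, here is a rough sketch of how the suffix array buffer could be sized when the extra space equals the suffix array's own length. It assumes the suffix array passed to libsais64 must hold n + fs entries of type int64_t, where fs is the extra-space argument. The helper name sa_buffer_bytes is hypothetical and not part of libsais; it does the arithmetic with overflow checks and does not call the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libsais).
 * Computes the number of bytes needed for a suffix array buffer that holds
 * n entries plus fs extra entries, each stored as an int64_t.
 * Returns false for negative inputs or if the arithmetic would overflow. */
static bool sa_buffer_bytes(int64_t n, int64_t fs, uint64_t *out_bytes)
{
    assert(out_bytes != NULL);

    if (n < 0 || fs < 0) {
        return false;
    }
    if (fs > INT64_MAX - n) {
        return false; /* n + fs does not fit in int64_t */
    }

    uint64_t entries = (uint64_t)n + (uint64_t)fs;
    if (entries > UINT64_MAX / sizeof(int64_t)) {
        return false; /* byte count does not fit in uint64_t */
    }

    *out_bytes = entries * sizeof(int64_t);
    return true;
}
```

Taking 3.9 GB as 3.9 × 10^9 bytes purely for illustration, n = fs = 3900000000 gives 62,400,000,000 bytes for the suffix array buffer alone, on top of the input text. Checking that this allocation actually succeeded before calling the library helps separate an allocation failure from a fault inside the sorting code.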

IlyaGrebnov commented 2 weeks ago

@sb98052 Yes, it looks like there’s a bug in libsais. If you can share the input file along with the compiler version and settings, that would be ideal. The line itself points to the likely cause, but having the file will help me confirm and validate the fix.

IlyaGrebnov commented 2 weeks ago

Hi @sb98052, could you provide the input file along with the compiler settings? I’m having a bit of difficulty reproducing the issue on my side. Thank you very much!

sb98052 commented 2 weeks ago

Thank you. I'm working on putting together a repro for you.

IlyaGrebnov commented 1 week ago

Hi @sb98052, would you be able to provide the input file along with the compiler settings?