Closed wupengcheng6819 closed 2 years ago
The issue is likely due to 32-bit arithmetic overflow (somewhere in the look-ahead / prefetch logic). Note that the maximum theoretical size would be 2GB - 1 (due to the EOF / sentinel symbol), but I recommend switching to libsais64 a few KB before the 2GB limit. For my own compressors (bsc and bsc-m03) I limit the block size to 2047 MB.
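A minimal sketch of the size check suggested above, assuming a caller that decides between the 32-bit and 64-bit builds before sorting. The names `MAX_32BIT_BLOCK` and `use_64bit_version` are hypothetical and not part of libsais; the 2047 MB threshold is the block-size limit mentioned for bsc, used here only as one conservative choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold: 2047 MB, the block-size limit quoted above for bsc. */
#define MAX_32BIT_BLOCK ((uint64_t)2047 * 1024 * 1024)

/* Hypothetical helper: returns 1 when the input is large enough that the
   64-bit version (libsais64) should be used instead of the 32-bit one. */
static int use_64bit_version(uint64_t input_size)
{
    /* Sanity check: the threshold must stay below the 2GB - 1 theoretical maximum. */
    assert(MAX_32BIT_BLOCK < (uint64_t)INT32_MAX);
    return input_size > MAX_32BIT_BLOCK;
}
```

With this threshold, a file of exactly 2047 MB still goes through the 32-bit path, and anything larger is routed to the 64-bit version.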
As it turns out, the program crashes only when extra space is given. In my case, the file size is 2047MB, which runs fine with 0 extra space. It crashes when the extra space is around 1GB, and the debugger output is:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004098bd in libsais_compact_unique_and_nonunique_lms_suffixes_32s ()
Thank you. Now I know where the problem is: the n + fs value is overflowing a signed 32-bit integer. I will add a fix for this in the next few days by capping the fs parameter to the correct range. That said, for large files (>100MB) you typically do not need any extra free space, as libsais should be able to carve out enough unused space inside the suffix array itself.
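To illustrate the overflow described above, here is a small sketch with the numbers from the report (a 2047MB input and about 1GB of extra space). Both helpers, `sum_overflows_int32` and `cap_free_space`, are hypothetical and not part of libsais; the bound used here is simply the largest fs that keeps n + fs within a signed 32-bit integer, and the actual fix in the library may choose a different cap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check: evaluate n + fs in 64 bits to see whether the
   same sum would overflow a signed 32-bit integer. */
static int sum_overflows_int32(int32_t n, int32_t fs)
{
    return (int64_t)n + (int64_t)fs > (int64_t)INT32_MAX;
}

/* Hypothetical helper: clamp fs so that n + fs <= INT32_MAX. */
static int32_t cap_free_space(int32_t n, int32_t fs)
{
    assert(n >= 0 && fs >= 0);
    int32_t max_fs = INT32_MAX - n;   /* cannot overflow because n >= 0 */
    return fs > max_fs ? max_fs : fs;
}
```

For n = 2047 * 1024 * 1024 and fs = 1024 * 1024 * 1024 the 32-bit sum would overflow, and the clamp leaves just under 1MB of extra space (1048575 entries).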
Fixed in 2.6.5 (Capped free space parameter to avoid crashing due to 32-bit integer overflow).
Thanks for the fix, it worked like a charm!
As a long-time user looking to switch from libdivsufsort, I noticed that libsais crashes as the file size approaches 2GB (with or without extra space), while divsufsort does not as long as the file size is strictly under 2G (2*1024*1024*1024). What is the maximum size that works without switching to the 64-bit version?