HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Stack overflow seg fault in calculate_clair3_full_alignment #265

Closed by ymcki 8 months ago

ymcki commented 9 months ago

I ran a high-coverage sample and got a segfault in clair3.py CallVariantsFromCffi while running Clair3 1.0.4.

Running under a debug build of Python (python3.9-debug) let me pinpoint the crash at clair3_full_alignment.c:656, which is a local array declaration: HAP read_hap_array[reads_num];

At the time of the crash, the value of reads_num was 387271.

typedef struct HAP {
    size_t read_index;
    size_t haplotype;
} HAP;

The definition of HAP shows that it occupies 16 bytes on my system, so read_hap_array's memory footprint is 387271 × 16 = 6,196,336 bytes. Since the default stack size on my system is 8 MB, I suspect that this allocation, together with the other allocations in calculate_clair3_full_alignment, overflows the stack and causes the segfault.
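The arithmetic above can be checked with a small C sketch. The struct is the one quoted in this issue. The helper names hap_array_bytes and fits_in_stack are hypothetical and not part of Clair3, and the comparison relies on the POSIX getrlimit call, so it assumes a POSIX system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/* Same layout as the HAP struct quoted in this issue. */
typedef struct HAP {
    size_t read_index;
    size_t haplotype;
} HAP;

/* Hypothetical helper: bytes needed for an array of reads_num HAP entries. */
size_t hap_array_bytes(size_t reads_num) {
    assert(reads_num <= SIZE_MAX / sizeof(HAP)); /* guard against overflow */
    return reads_num * sizeof(HAP);
}

/* Hypothetical helper: returns 1 if `bytes` plus `headroom` fits under the
 * soft stack limit, 0 if it does not, and -1 if the limit cannot be read. */
int fits_in_stack(size_t bytes, size_t headroom) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return 1; /* unlimited stack: nothing to compare against */
    }
    if ((rlim_t)bytes > rl.rlim_cur) {
        return 0;
    }
    return (rl.rlim_cur - (rlim_t)bytes) >= (rlim_t)headroom ? 1 : 0;
}
```

With an 8-byte size_t, hap_array_bytes(387271) returns 6196336, about 5.9 MiB of an 8 MiB (8,388,608-byte) stack. That leaves roughly 2.1 MiB for everything else the call chain places on the stack.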

Then I ran 'ulimit -s 16384' to double my stack size to 16 MB, and Clair3 ran to completion without error.

My understanding is that this bug persists in 1.0.5. It would be great if a future version of Clair3 could check whether reads_num exceeds a certain threshold and, if so, report a stack overflow error and remind users to increase their stack size, instead of failing with a mysterious segfault.
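A minimal sketch of the kind of check requested here, assuming the array is moved from the stack to the heap and its size is capped. The function name alloc_read_hap_array and the cap HYPOTHETICAL_MAX_READS are illustrative only, and this is not necessarily how Clair3 addressed the problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct HAP {
    size_t read_index;
    size_t haplotype;
} HAP;

/* Hypothetical cap, chosen for illustration; not a value used by Clair3. */
#define HYPOTHETICAL_MAX_READS ((size_t)1000000)

/* Hypothetical replacement for `HAP read_hap_array[reads_num];`.
 * The array is zero-initialized on the heap, so its size is no longer
 * bounded by the stack limit. On failure the function returns NULL and
 * stores a readable reason in *err. The caller must free() the result. */
HAP *alloc_read_hap_array(size_t reads_num, const char **err) {
    assert(err != NULL);
    *err = NULL;
    if (reads_num == 0) {
        *err = "reads_num is zero";
        return NULL;
    }
    if (reads_num > HYPOTHETICAL_MAX_READS) {
        *err = "reads_num exceeds the configured cap";
        return NULL;
    }
    HAP *arr = calloc(reads_num, sizeof *arr);
    if (arr == NULL) {
        *err = "out of memory while allocating read_hap_array";
    }
    return arr;
}
```

The design choice here is to fail with an explicit message rather than crash: a caller can print *err and exit cleanly, and the reads_num value from this report (387271) is allocated without touching the stack limit.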

aquaskyline commented 9 months ago

Many thanks for this report. We will try to reproduce the segfault. If we succeed, we will add caps and checks around that line in the next version.

aquaskyline commented 8 months ago

Fixed in v1.0.6.