DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

`hisat-3n-table` reports a segmentation fault for certain CIGAR strings. #324

Closed y9c closed 2 years ago

y9c commented 2 years ago

When I run the hisat-3n-table command on reads like the one below, it reports a segmentation fault (core dumped).

A00639:857:HK3GMDRXY:2:2116:7943:30561_AGGGGGGGTCGC     0       3       75937579        60      24M1160411N3M   *       0       0       ATCGGGGAGGTGGGCGGGGCCGGGGGT     FFFFFFFFFFFFFFFFFFFFFFFFFFF     AS:i:-54        NH:i:1  XM:i:9  NM:i:9  MD:Z:3A3G0A0A3A1A0A4A0A2A1      YZ:A:-  Yf:i:1  ZS:i:0  XN:i:0  XO:i:0  XG:i:0  XS:A:+

I think the long intron in the CIGAR string (24M1160411N3M) might cause this error.
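To make the CIGAR string easier to read, here is a minimal Python sketch that breaks it into operations and reports the read length, the reference span, and the longest `N` (skipped region / intron) operation. The helper names `parse_cigar` and `summarize` are hypothetical and are not part of HISAT-3N; the operation classes follow the SAM specification.

```python
import re

# One CIGAR operation: a run length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs (hypothetical helper)."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings that contain anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def summarize(cigar):
    """Return (query_length, reference_span, longest_intron) for a CIGAR."""
    ops = parse_cigar(cigar)
    # M, I, S, = and X consume bases of the read.
    query_len = sum(n for n, op in ops if op in "MIS=X")
    # M, D, N, = and X consume positions on the reference.
    ref_span = sum(n for n, op in ops if op in "MDN=X")
    # N marks a skipped reference region, i.e. an intron for RNA-seq reads.
    longest_intron = max((n for n, op in ops if op == "N"), default=0)
    return query_len, ref_span, longest_intron


print(summarize("24M1160411N3M"))  # (27, 1160438, 1160411)
```

For the read above, the 27 aligned bases are spread over a reference span of 1,160,438 positions, almost all of which is the single `N` operation.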

y9c commented 2 years ago

This line was generated by hisat-3n. It is weird that such a long intron (1160411) is reported in the alignment, because the default maximum intron length is only 500000.
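A quick way to see how many alignments exceed that limit is to scan the SAM file for `N` operations longer than 500000. This is only a rough sketch: the function name `long_intron_records` and the constant `DEFAULT_MAX_INTRON` are made up for illustration, and the threshold is simply the default value quoted above.

```python
import re

# Matches every N (skipped region) operation in a CIGAR string.
INTRON = re.compile(r"(\d+)N")

# Assumption: the default maximum intron length quoted in this thread.
DEFAULT_MAX_INTRON = 500_000


def long_intron_records(sam_lines, max_intron=DEFAULT_MAX_INTRON):
    """Yield (read_name, cigar, longest_intron) for records over the limit."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        cigar = fields[5]  # CIGAR is the sixth SAM column
        introns = [int(n) for n in INTRON.findall(cigar)]
        if introns and max(introns) > max_intron:
            yield fields[0], cigar, max(introns)
```

Usage could look like `with open("sample.sam") as fh: hits = list(long_intron_records(fh))`, where `sample.sam` is a placeholder file name.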

imzhangyun commented 2 years ago

Hello Chang,

Thank you for using HISAT-3N. How did you run HISAT-3N for the alignment? Did you change the score function or anything else? Could you provide your command line so we can inspect HISAT-3N or HISAT2?

Thank you, Yun (Leo)

y9c commented 2 years ago

Hi @imzhangyun ,

I used the default settings of hisat2-3n with some extra parameters, which are:

-p 24  --summary-file sample_sum --new-summary -q -U sample_name --base-change C,T

imzhangyun commented 2 years ago

Hello Chang,

I guess the super long intron may be caused by the --ss database used during index building. I just updated hisat-3n and hisat-3n-table. Please pull the newest code and run make again. You may still get some alignment results with long introns (over 500,000), but hisat-3n-table will not crash any more.

Thanks, Leo

y9c commented 2 years ago

Thank you very much. Will test it.