smithlabcode / abismal

Abismal is a mapper of FASTQ bisulfite-converted short reads (between 50 and 1000 bases) to a FASTA reference genome.
GNU General Public License v3.0

Segmentation fault (core dumped) #5

Closed Hannah1746 closed 2 years ago

Hannah1746 commented 3 years ago

I am trying to use methpipe and can't seem to get it to work for me.

I first run: abismalidx FINAL_MX_HiC_50CHRs_rn.fa MX.abismalidx

This produces a 2.7G index file from my 2.3G genome.

Then I run:

abismal -i MX.abismalidx -o MX_gill.sam MX_gill_output.fastq

but I get this "core dumped" error, so I tried:

abismal -g FINAL_MX_HiC_50CHRs_rn.fa -o MX_gill.sam MX_gill.fastq

and I get the same error with this too.

I am using Nanopore data, so my FASTQ files are single-end and are all concatenated together.

guilhermesena1 commented 2 years ago

Hello,

Thank you so much for reporting this issue. Abismal was developed as a short read mapper for bisulfite sequencing reads, so it is not the appropriate tool for nanopore reads.

That said, the program should not segfault on a long read; it should fail gracefully, stating that the read is too long to be mapped. I tried mapping some large simulated reads on my end but couldn't reproduce the error. I suspect you got a segfault because you got a hit near the end of the genome and the alignment extended well past the genome end. This is easy to handle for short reads but less so for long reads.

I will be pushing a "fix" that requires a maximum read length and stops mapping if any read goes above this cut-off. Apologies for the inconvenience!
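For readers following along, the two guards described above (a maximum read length, and an alignment window that cannot run past the genome end) can be sketched roughly as follows. This is only an illustration in Python and not abismal's actual code: `MAX_READ_LENGTH`, `ReadTooLongError` and `alignment_window` are hypothetical names, and the cut-off value is a placeholder.

```python
MAX_READ_LENGTH = 1024  # hypothetical cut-off, not abismal's actual constant


class ReadTooLongError(ValueError):
    """Raised instead of crashing when a read exceeds the cut-off."""


def alignment_window(hit_pos, read_len, genome_len,
                     max_read_len=MAX_READ_LENGTH):
    """Return the half-open [start, end) reference window for a hit.

    Rejects over-long reads up front and clamps the window so it
    never extends past the end of the genome.
    """
    if read_len > max_read_len:
        raise ReadTooLongError(
            f"read of length {read_len} exceeds maximum of {max_read_len}"
        )
    if not 0 <= hit_pos < genome_len:
        raise ValueError(f"hit position {hit_pos} is outside the genome")
    # Clamp the end so the window stops at the last base of the genome.
    end = min(hit_pos + read_len, genome_len)
    return hit_pos, end
```

With these checks, a hit 10 bases before the end of a 1000-base genome yields a 10-base window rather than an out-of-bounds access, and an over-long read produces an error message instead of a crash.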

guilhermesena1 commented 2 years ago

Hello,

Just following up on this. We extended the maximum read length for abismal to ~1 million bases in commit 9c93354 (the current release can only safely map reads of up to 1024 bases). This requires rebuilding the index. If any read longer than 1 million bases is passed, the program will fail with a message stating that a read is too long.

This will possibly solve your issue if your read lengths are under 1 million bases. If you are still having problems, we'd appreciate it if you could provide a small test case with which we can reproduce the issue on our end. If you're not comfortable sharing data publicly, you can always reach out by e-mail (desenabr@usc.edu).
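To check whether a FASTQ file stays under the length limit before mapping, something like the following Python sketch may help. It is not part of abismal; the function names are hypothetical, and it assumes the common four-lines-per-record FASTQ layout (no line-wrapped sequences).

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fastq_records(handle):
    """Yield (header, sequence, quality) tuples.

    Assumes exactly four lines per record.
    """
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def summarize_lengths(handle, limit=1_000_000):
    """Return (number of reads, longest read, reads longer than limit)."""
    n = longest = too_long = 0
    for _, seq, _ in fastq_records(handle):
        n += 1
        longest = max(longest, len(seq))
        too_long += len(seq) > limit
    return n, longest, too_long
```

For example, `with open_text("MX_gill.fastq") as fh: print(summarize_lengths(fh))` reports how many reads exceed the limit. The same `fastq_records` generator can be used to write the first few offending records to a separate file as a small test case.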

guilhermesena1 commented 2 years ago

Closing this issue for now (the change was incorporated in 2.0.0), but feel free to reopen it if the problem resurfaces.