biointec / columba

Fast Approximate Pattern Matching using Search Schemes
GNU Affero General Public License v3.0

Empty fasta headers crash the program #7

Closed: jnalanko closed this issue 8 months ago

jnalanko commented 8 months ago

I have the following query file:

>
GAAATATTGGTTAAATATTCCATAAACAACAATATAAAACAGAACAATTCACCATGAAAATACAATTCACCTTTTAAAACATAATGTTATAAAAAAACACCATTCAAATTGATGCCAGTC
>
CCTTACCTGTCAATGTTGTTTACGGTTCCTTTCCCCCTTTTGTGCGAGTAAGCAACATGAATATACAAGTCTTAACGACCAATGCCTCTGTCAGCCCAAAATAAGTCGACATTCCTGTGC

This seems to break the FASTA parser and crash the program:

./build/columba -e 4 -ss pigeon UMN_data/all_noheader UMN_data/test2.fasta
Using 32-bits
Welcome to Columba!
Reading UMN_data/all_noheader.sa.bv.1...done 
Reading UMN_data/all_noheader.sa.1..done
Reading UMN_data/all_noheader.txt...done (size: 3040359461)
Reading UMN_data/all_noheader.cct...done
Reading UMN_data/all_noheader.bwt...done (size: 3040359461)
Reading UMN_data/all_noheader.brt...done
Reading UMN_data/all_noheader.rev.brt...done
Populating FM-range table with 10-mers...done.
Reading in reads from UMN_data/test2.fasta
Benchmarking with PIGEON HOLE strategy for max distance 4 with DYNAMIC partitioning and using (OPTIMIZED) EDIT distance 
Switching to in text verification at 5
Progress: 0/0
Results for PIGEON HOLE
Total duration: 0.00s
Average no. nodes: -nan
Total no. Nodes: 0
Average no. unique matches: -nan
Total no. unique matches: 0
Average no. reported matches -nan
Total no. reported matches: 0
Mapped reads: 0
Segmentation fault (core dumped)
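
The log shows "Progress: 0/0" and "-nan" averages, which suggests that no reads were parsed from the file and that the statistics were then divided by a read count of zero. The cause of the segmentation fault itself is not visible in the output. As an illustration only, here is a minimal sketch of a FASTA reader that accepts a bare ">" header by assigning a placeholder identifier, plus a guarded average. This is not Columba's actual parser: `FastaRecord`, `readFasta`, `safeAverage`, and the `read_<n>` naming scheme are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; not taken from Columba's source.
struct FastaRecord {
    std::string id;   // header text after '>', or a generated placeholder
    std::string seq;  // concatenated sequence lines
};

// Reads FASTA records from `in`. A header that consists only of '>' gets a
// placeholder id ("read_<n>") instead of being treated as a parse failure.
std::vector<FastaRecord> readFasta(std::istream& in) {
    std::vector<FastaRecord> records;
    std::string line;
    while (std::getline(in, line)) {
        // Strip a trailing carriage return from files with CRLF line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;  // skip blank lines

        if (line[0] == '>') {
            std::string id = line.substr(1);
            if (id.empty()) id = "read_" + std::to_string(records.size() + 1);
            records.push_back({id, ""});
        } else if (!records.empty()) {
            records.back().seq += line;  // a sequence may span several lines
        }
        // Sequence data that appears before any header is ignored here.
    }
    return records;
}

// Guarded average: returns 0 for an empty read set instead of NaN.
double safeAverage(double total, std::size_t count) {
    return count == 0 ? 0.0 : total / static_cast<double>(count);
}
```

Passing the query file above through `readFasta` (for example via a `std::ifstream`, or a `std::istringstream` in a test) would yield two records with ids `read_1` and `read_2`. Whether to generate placeholder ids or to reject such input with a clear error message is a design choice for the maintainers.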