GATB / bcalm

compacted de Bruijn graph construction in low memory
MIT License

Add better input parsing #63

Closed pashadag closed 1 year ago

pashadag commented 4 years ago

BCALM2 does not always fail gracefully when the input FASTA file is malformed. For example,

> cat blin.fa
adfad
> bcalm -in blin.fa -kmer-size 3
BCALM 2, git commit e9ba83c
setting storage type to hdf5
HDF5-DIAG: Error detected in HDF5 (1.8.18) thread 0:
  #000: /home/pzm11/research/software/bcalm/gatb-core/gatb-core/thirdparty/hdf5/src/H5F.c line 604 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
....

One could improve this by adding FASTA validation code to the bcalm distribution, for example based on https://github.com/linsalrob/fasta_validator. As a first step, it could automatically check the first 10000 lines of the input.
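
To make the "check the first 10000 lines" idea concrete, here is a minimal Python sketch of such a pre-check. It is not code from bcalm or from fasta_validator. The function name `check_fasta_prefix` and the two rules it applies (no sequence data before a `>` header, and sequence lines containing only letters) are assumptions chosen for illustration. It also assumes plain-text, uncompressed FASTA input.

```python
import io


def check_fasta_prefix(handle, max_lines=10000):
    """Hypothetical pre-check: scan at most max_lines lines of a text stream.

    Returns None if the scanned prefix looks like FASTA, otherwise a short
    error message. Rules (assumed for this sketch, not taken from bcalm):
      1. the first non-blank line must be a '>' header;
      2. sequence lines must contain only letters.
    """
    seen_header = False
    for lineno, raw in enumerate(handle, start=1):
        if lineno > max_lines:
            break  # only the prefix is checked, as suggested above
        line = raw.strip()
        if not line:
            continue  # blank lines are tolerated in this sketch
        if line.startswith(">"):
            seen_header = True
        elif not seen_header:
            return f"line {lineno}: sequence data before any '>' header"
        elif not line.isalpha():
            return f"line {lineno}: non-letter character in sequence"
    if not seen_header:
        return "no FASTA header found"
    return None


if __name__ == "__main__":
    # The blin.fa example from this issue: one line and no header.
    print(check_fasta_prefix(io.StringIO("adfad\n")))
```

With a check like this run before graph construction, the `blin.fa` input above could be rejected with a one-line message instead of an HDF5 error trace. An actual integration would presumably live in bcalm's C++ code and would need to account for whatever other input formats the tool accepts.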

rchikhi commented 1 year ago

Duplicate of #64.