cokelaer closed this issue 6 years ago
I've been using FASTA files since 2004 and have never heard of FASTA headers beginning with "@". I can't find any mention of this on the internet. It seems that SeqIO.parse silently skips records that have this kind of header.
My mistake; this explains my wrong results regarding Biopython's performance. The simulator is fixed and I believe we now have the same benchmark performance. I've updated the documentation with a nice benchmark for fastq2fasta.
@blaiseli
You mentioned:
"Regarding the md5 sum for the test fasta file, we shouldn't have a fasta file where headers are introduced by "@" instead of ">"."
For now, interestingly, biopython, seqtk and gatb handle this old format. It is not a standard format, but it may be produced by old sequencers. I would suggest keeping it for now and adding the same data set with "@" replaced by ">".
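
For the second data set, something like the following could do the replacement. This is only a minimal sketch: `normalize_headers` is a hypothetical helper name (not part of the project), and it assumes the input is a FASTA-like file whose sequence lines never start with "@". It must not be applied to real FASTQ, where quality lines may start with "@".

```python
import io


def normalize_headers(lines):
    """Yield lines, replacing a leading '@' header marker with '>'.

    Hypothetical helper; assumes FASTA-like input where only
    header lines can start with '@'.
    """
    for line in lines:
        if line.startswith("@"):
            yield ">" + line[1:]
        else:
            yield line


# Small in-memory example instead of a real test file.
sample = io.StringIO("@read1\nACGT\n@read2\nGGCC\n")
converted = "".join(normalize_headers(sample))
print(converted)
```

Lines that already start with ">" pass through unchanged, so running it on a standard FASTA file should be harmless.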