bonsai-team / matam

Mapping-Assisted Targeted-Assembly for Metagenomics
GNU Affero General Public License v3.0

MATAM/SGA expects Phred+33 encoded fastq files #89

Closed ppericard closed 4 years ago

ppericard commented 4 years ago

Apparently, SGA requires its input FASTQ files to be encoded in Phred+33 and crashes otherwise, and MATAM does not catch the error.

We need to state in the README/manual that FASTQ files must be encoded in Phred+33. We should probably also check at the start of a MATAM run that the input really uses that encoding. This can be done by sampling a few reads and checking their quality strings against the ASCII table for characters specific to either Phred+33 or Phred+64.
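A minimal sketch of such a check, not MATAM's actual code. The function name `guess_phred_offset`, the `max_reads` parameter, and both thresholds are assumptions made for illustration: quality characters below `;` (ASCII 59) should not occur in any +64 encoding, and characters above `J` (ASCII 74) are unusual in Illumina Phred+33 data. It also assumes plain four-line FASTQ records, optionally gzip-compressed.

```python
import gzip
from itertools import islice


def guess_phred_offset(fastq_path, max_reads=1000):
    """Guess the quality offset from the first `max_reads` reads.

    Returns 33, 64, or None when the sampled reads are ambiguous.
    Heuristic only: the thresholds below are assumptions, not a standard.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    min_code, max_code = 255, 0

    with opener(fastq_path, "rt") as handle:
        # Assumes 4-line records; the quality string is the 4th line of each.
        for i, line in enumerate(islice(handle, 4 * max_reads)):
            if i % 4 != 3:
                continue
            codes = line.rstrip("\r\n").encode("ascii")
            if codes:
                min_code = min(min_code, min(codes))
                max_code = max(max_code, max(codes))

    if min_code < 59:
        # Characters below ';' are not valid in +64 encodings.
        return 33
    if max_code > 74:
        # Characters above 'J' are unlikely in Illumina Phred+33 data.
        return 64
    # No discriminating character seen (or no reads at all).
    return None
```

MATAM could call something like this on each input file before launching SGA and stop with an explicit message when the result is not 33. An ambiguous result (`None`) might only warrant a warning, since a small sample of uniformly high-quality reads can fall entirely in the overlapping range.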