mitoNGS / MToolBox

A bioinformatics pipeline to analyze mtDNA from NGS data
http://sourceforge.net/projects/mtoolbox/?source=navbar
GNU General Public License v3.0

BAM vs FASTQ as input #89

Closed: sremuk closed this issue 4 years ago

sremuk commented 4 years ago

I was wondering what kind of difference in output I should expect when using a BAM versus a FASTQ file as input. I ran MToolBox on both the BAM and the FASTQ for the same sample, and the BAM run produced more annotation files than the FASTQ run. How can I know which one to use? Also, the BAM output has no single main annotation file that summarizes all the haplogroups. How can I extract the right data from this output? Any suggestions and help would be appreciated.

clody23 commented 4 years ago

Hi,

Does the BAM file contain aligned or unaligned reads? If it contains aligned reads, that could explain the different results.
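If it is not obvious whether a BAM holds aligned reads, one way to check is to look at the FLAG column of its records: in the SAM format, bit 0x4 is set when a read is unmapped. The snippet below is only a minimal sketch, not part of MToolBox. It assumes the BAM has already been converted to SAM text (for example with `samtools view`), and the function name `count_alignment_status` and the example records are made up for illustration.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit that marks a read as unmapped


def count_alignment_status(sam_lines):
    """Count mapped and unmapped records in an iterable of SAM text lines."""
    counts = {"mapped": 0, "unmapped": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & UNMAPPED_FLAG:
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
    return counts


# Hypothetical records, for illustration only: one header line,
# one mapped read (FLAG 0) and one unmapped read (FLAG 4).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchrM\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
print(count_alignment_status(example))
```

On a real file, the lines could come from the output of `samtools view sample.bam` (where `sample.bam` is a placeholder name). If every record is counted as unmapped, the BAM is effectively just a container for raw reads; if most are mapped, the file already reflects an earlier alignment.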

The number of annotation files corresponds to the number of best haplogroup predictions that MToolBox generated for that sample. More than one file usually appears when the sample is of poor quality or when only a few mitochondrial variants are available for prediction (due to low read depth or low coverage). If you have both FASTQ and BAM (aligned reads) for the same sample, I would recommend always starting from the raw data (FASTQ).

Best wishes, Claudia