brentp / bwa-meth

fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome
https://arxiv.org/abs/1401.1129
MIT License

Do bwa-meth aligned BAM files need duplicates marked for downstream MethylDackel to work properly? #54

crazyhottommy closed this issue 6 years ago

crazyhottommy commented 6 years ago

Hi Brent,

I am using the following to map single-end RRBS-seq data with bwa-meth:

python {config[bwmeth_path]} {params.custom} --reference {config[ref_fa]} {input} \
    --read-group '{params.rg}' 2> {log.bwa} \
    | samtools sort -m 2G -@ 5 -T {output[0]}.tmp -o {output[0]}

samtools index {output[0]}

For RRBS, one expects to see many duplicates at the same CpG, because reads start at the exact restriction enzyme cut site. MethylDackel has a --keepDupes option to retain reads marked as duplicates. Do I need to mark duplicates in the BAM files from bwa-meth before running MethylDackel?

Thanks! Tommy

brentp commented 6 years ago

aye. bwa-meth does not mark duplicates so you'll have to do that with another tool.
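
For readers who want to check whether a BAM file has any duplicates marked at all, here is a minimal sketch in Python. It assumes the alignments are available as SAM text (for example, the output of `samtools view`), and it relies only on the SAM specification's duplicate bit, 0x400, in the FLAG column. The names `DUPLICATE_FLAG`, `count_duplicates`, and `example` are hypothetical, chosen for illustration; they are not part of bwa-meth or MethylDackel.

```python
# SAM spec: bit 0x400 of the FLAG field means "PCR or optical duplicate".
DUPLICATE_FLAG = 0x400


def count_duplicates(sam_lines):
    """Return (total_alignments, flagged_duplicates) for SAM text lines.

    Header lines (starting with '@') and blank lines are skipped.
    """
    total = 0
    dups = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            dups += 1
    return total, dups


# Made-up records for illustration only (not real data).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t1024\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",  # 0x400: duplicate
    "read3\t1040\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",  # 0x400 + 0x10 (reverse strand)
]

if __name__ == "__main__":
    total, dups = count_duplicates(example)
    print(f"{dups} of {total} alignments are flagged as duplicates")
```

Run on the records of a BAM file straight from bwa-meth, a duplicate count of zero would be consistent with the answer above: nothing is flagged until a separate duplicate-marking tool has been run.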