tecangenomics / nudup

NuDup -- Marks/removes duplicate molecules based on the molecular tagging technology used in Tecan products.
http://www.tecangenomics.com
GNU Lesser General Public License v3.0

Which output file from nudup.py to use with bismark_methylation_extractor? #9

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi,

Not sure how dumb of a question this is, but here it goes. I have some RRBS data generated with the Ovation kit, which I am currently analyzing. I am using bismark for mapping and am now at the step of running nudup.py on my files.

I noticed that nudup.py generated two different .bam files for each sample, one dedup.bam and one markdup.bam (sample1.sorted.dedup.bam and sample1.sorted.markdup.bam, and so on).

I am trying to run bismark_methylation_extractor, but I am not sure which .bam file to give it as input. Should I pass both .bam files and run it separately for each sample? Or should I just pass *.bam and let it run on all the .bam files in the folder with all my samples? Not really sure how to proceed.

Thank you

shuelga commented 7 years ago

The markdup BAM file still contains the duplicates, but they are marked with a duplicate flag. The dedup BAM file has the duplicates completely removed. Most tools ignore reads that are flagged as duplicates, so in most cases the two files will be treated the same.

For more specific analysis pipeline questions, contact NuGEN Technologies Technical Support at techserv@nugen.com.
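To make the difference between the two files concrete, here is a minimal Python sketch of what "marked" versus "removed" means at the record level. It assumes the alignments are available as SAM text lines and relies only on the SAM specification's duplicate bit (0x400, decimal 1024) in the FLAG column. The helper names and the sample records are made up for illustration and are not part of nudup.py or bismark, and the sketch says nothing about how bismark_methylation_extractor itself treats flagged reads.

```python
# Bit 0x400 of the SAM FLAG field means "PCR or optical duplicate" (SAM specification).
DUPLICATE_FLAG = 0x400


def is_duplicate(sam_line: str) -> bool:
    """Return True if a SAM alignment line has the duplicate bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second column of a SAM record
    return bool(flag & DUPLICATE_FLAG)


def drop_duplicates(sam_lines):
    """Keep header lines and alignments that are not flagged as duplicates."""
    return [
        line for line in sam_lines
        if line.startswith("@") or not is_duplicate(line)
    ]


# Hypothetical records standing in for a "markdup"-style file (made-up data).
marked = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",  # 1024 = 0x400
    "read3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tTTGA\tIIII",
]

# Dropping the flagged reads gives the "dedup"-style view of the same data.
deduplicated = drop_duplicates(marked)
n_flagged = sum(is_duplicate(line) for line in marked if not line.startswith("@"))
print(f"{n_flagged} flagged duplicate(s); {len(deduplicated) - 1} alignments kept")
```

On a real file, `samtools view -c -f 1024 sample1.sorted.markdup.bam` should give the number of flagged reads, which is a quick way to check that the markdup file really carries the flag before deciding which file to pass downstream.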