I have a few NuGEN-processed paired-end human samples. I processed the BAM files from HiSeq using the following pipeline: FastQC > Trim > TopHat > Sort > NugenDedup.py > htseq-count. But I am getting 0 counts for a couple of my samples after running the NuGEN dedup script. Without dedup I have more counts. Is it possible to get 0 counts? Isn't dedup supposed to remove only the PCR duplicates?

tecangenomics / nudup

NuDup -- Marks/removes duplicate molecules based on the molecular tagging technology used in Tecan products.

http://www.tecangenomics.com

GNU Lesser General Public License v3.0


#15

Closed by pbpayal 6 years ago

shuelga commented 6 years ago

Hello @pbpayal --

Zero reads should not happen, and suggests something may be wrong with the deduplication process. There seems to be a compatibility issue with the way that TopHat outputs reads, and we have yet to solve it. If you can use the STAR aligner instead, you should be able to get around this issue.
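To confirm whether the reads are being lost in the dedup step rather than at counting, it may help to compare record counts before and after deduplication. The following is a minimal sketch, not part of NuDup: the function name `summarize_sam` and the script name are hypothetical, and it assumes the BAM has first been converted to SAM text (for example with `samtools view -h`). It relies only on the FLAG bits defined in the SAM specification.

```python
import sys

# FLAG bits as defined in the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800


def summarize_sam(lines):
    """Count alignment records in an iterable of SAM text lines.

    Hypothetical helper for illustration; header lines (starting with '@')
    and lines with fewer than the 11 mandatory SAM fields are skipped.
    """
    stats = {"total": 0, "mapped_primary": 0, "marked_duplicate": 0}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        stats["total"] += 1
        if flag & DUPLICATE:
            stats["marked_duplicate"] += 1
        # Mapped primary alignments: not unmapped, secondary, or supplementary
        if not flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            stats["mapped_primary"] += 1
    return stats


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python sam_summary.py sample.sam
    with open(sys.argv[1]) as handle:
        print(summarize_sam(handle))
```

Running this on the sorted alignment and again on the deduplicated output should show whether the total drops to zero after dedup, which would point at the dedup step rather than at the counting step.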

Thanks!

pbpayal commented 6 years ago

Thanks. Let me try to run the samples using STAR.