Open hrhotz opened 5 years ago
@heylf do you have time for this?
Hi, indeed, we don't need to remove duplicates before running Genrich. In fact, Genrich can do a lot of filtering by itself that we did not use in this tutorial, but if you want to use another peak caller like macs2, you will need to keep these filtering steps (including the MarkDuplicates step). I would like to modify this tutorial to propose both approaches:
That's great to hear, @lldelisle.
I am struggling to understand why "Filter Option: Remove PCR duplicates" in the Genrich tool is set to "Yes" after running the MarkDuplicates tool with "If true do not write duplicates to the output file instead of writing them with appropriate flags set" set to "Yes". With these settings, the list of PCR duplicates produced by Genrich is empty.
When I run Genrich with "Filter Option: Remove PCR duplicates" set to "No", I get the same result for the bedgraph pileup and the ENCODE peak files.
Also, when I skip the MarkDuplicates step and run Genrich with "Filter Option: Remove PCR duplicates" set to "Yes", I get the same result, this time with a list of 3549 PCR duplicates.
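For anyone who wants to repeat this comparison on downloaded outputs, here is a minimal sketch of how two peak files could be checked for identical intervals. It assumes tab-separated, BED-like files whose first three columns are chromosome, start, and end (as in the ENCODE narrowPeak layout). The function names `read_peaks` and `compare_peaks` and the file names in the usage note are hypothetical and not part of the tutorial or of Genrich.

```python
def read_peaks(path):
    """Return the set of (chrom, start, end) intervals in a BED-like peak file.

    Assumption: tab-separated columns, with chrom/start/end in the first three.
    """
    peaks = set()
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and common header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.rstrip("\n").split("\t")[:3]
            peaks.add((chrom, int(start), int(end)))
    return peaks


def compare_peaks(path_a, path_b):
    """Count intervals shared by both files and intervals unique to each."""
    a, b = read_peaks(path_a), read_peaks(path_b)
    return {
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
    }
```

Calling `compare_peaks("with_markduplicates.narrowPeak", "without_markduplicates.narrowPeak")` (hypothetical file names) should report zero for `only_a` and `only_b` when the two runs call the same intervals. Note that this only compares coordinates, not scores or the pileup values.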