jsh58 / DMRfinder

Identifying differentially methylated regions from MethylC-seq (bisulfite-sequencing) data
MIT License

Input data for methylation count extraction. #18

Open desmodus1984 opened 5 months ago

desmodus1984 commented 5 months ago

Hi,

I am interested in finding differentially methylated regions, and I have processed my paired-end reads with Bismark. I would like to use DMRfinder to identify DMRs, but I have a question before going further. Following the pipeline recommended by Bismark, I ran the alignment and then deduplicated the BAM files. I am not sure whether I should extract the methylation counts from the "raw" (aligned) BAM file or from the deduplicated BAM file.

Thanks,

jsh58 commented 4 months ago

This is a matter of judgment. If you think a lot of your reads are PCR duplicates, then you should remove them. But if you have high enough coverage that reads might falsely be identified as duplicates, then you shouldn't.

I will add, from the README:

Following alignment, some researchers choose to remove reads that may be PCR duplicates. We do not recommend using Bismark's deduplication script for this purpose; it simply keeps the first read at a given position in the alignment file and eliminates the rest, regardless of the reads' sequences or methylation information.
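
To make the README's concern concrete, here is a toy sketch of position-only deduplication as described above. It is not Bismark's code: the `Read` record, its field names, and both functions are hypothetical, and the reads are made-up data. The `meth_calls` strings loosely imitate a per-read methylation call string (`Z` for a methylated CpG, `z` for an unmethylated one). The second function is only a contrast, showing what changes if the methylation calls are part of the key; it is not a recommended method.

```python
from collections import namedtuple

# Hypothetical, simplified read record (assumption for illustration only);
# a real BAM record carries many more fields.
Read = namedtuple("Read", ["name", "chrom", "start", "strand", "meth_calls"])


def dedup_by_position(reads):
    """Keep the first read seen at each (chrom, start, strand); drop the rest.

    Sequence and methylation calls are ignored, mirroring the behavior
    the README describes.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def dedup_by_position_and_calls(reads):
    """Contrast case: treat reads as duplicates only if the position
    AND the methylation calls are identical."""
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand, read.meth_calls)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Made-up example: three reads start at the same position on the same strand.
reads = [
    Read("r1", "chr1", 100, "+", "Z..z"),
    Read("r2", "chr1", 100, "+", "z..z"),  # same position, different calls
    Read("r3", "chr1", 100, "+", "Z..z"),  # same position and calls as r1
    Read("r4", "chr1", 250, "-", "ZZ.."),
]

by_position = dedup_by_position(reads)
by_position_and_calls = dedup_by_position_and_calls(reads)

print([r.name for r in by_position])
print([r.name for r in by_position_and_calls])
```

In this toy input, the position-only version discards `r2` even though its methylation calls differ from those of `r1`, which is the kind of information loss the README warns about. With high coverage, more reads share a start position by chance, so more of them would be dropped this way.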