ycl6 closed this issue 8 years ago
The count reported by methcounts is not necessarily the total number of reads overlapping that site. If a read has a mismatch exactly at the cytosine position, that read is not counted. For example, suppose the genome has a C site in the context TTCAA, but the read shows TTAAA. The A in the read is neither the original C nor a bisulfite-converted T, so it is treated as a mismatch.
So the number reported by methcounts should always be less than or equal to the number you obtain from intersectBed. If that is not the case for you, please let me know.
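To make the counting rule concrete, here is a minimal Python sketch of the logic described above. It is not the actual methcounts implementation: the function name `count_informative_reads` and the example read bases are hypothetical, and the sketch only handles a forward-strand cytosine, ignoring strand handling and quality filtering.

```python
def count_informative_reads(ref_base, read_bases):
    """Classify the read bases observed at one reference cytosine.

    Returns (methylated, unmethylated, mismatches):
      C in the read -> unconverted, counted as methylated
      T in the read -> bisulfite converted, counted as unmethylated
      anything else -> mismatch, not counted toward coverage
    """
    if ref_base != "C":
        raise ValueError("this sketch only handles forward-strand cytosines")
    methylated = sum(1 for b in read_bases if b == "C")
    unmethylated = sum(1 for b in read_bases if b == "T")
    mismatches = len(read_bases) - methylated - unmethylated
    return methylated, unmethylated, mismatches


# Hypothetical bases from five reads overlapping the C in TTCAA.
read_bases = ["C", "T", "A", "C", "T"]
meth, unmeth, mism = count_informative_reads("C", read_bases)

counted = meth + unmeth          # analogous to the count in the .meth file
overlapping = len(read_bases)    # analogous to what intersectBed reports
print(counted, overlapping, mism)  # prints: 4 5 1
```

In this example, five reads overlap the site but only four are counted, because the read showing an A is treated as a mismatch. The counted number can never exceed the overlapping number.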
Best regards, Meng
On Mon, Nov 23, 2015 at 6:36 PM, I-Hsuan Lin notifications@github.com wrote:
Hi,
I used intersectBed to check the number of reads overlapping a particular site. The number of matched reads in the .mr.sorted file generated by duplicate-remover (used as input for methcounts) and the number of reads overlapping the site recorded in the .meth file generated by methcounts are different.
I thought the 2 numbers should match. Did I misinterpret the meaning of these 2 output files?
— Reply to this email directly or view it on GitHub https://github.com/smithlabcode/methpipe/issues/89.
Thanks @mengzhou, the numbers do indeed add up after taking mismatches into account.