caalo closed 6 years ago
Wow, this looks really weird, and more worryingly: it looks quite different from what R1 is doing. It is almost certain that combining Read 1 and Read 2 in this case will introduce a LOT of variability, almost independently of which kind of threshold you are setting. Which kind of kit was that? Maybe it would be worth getting back to the manufacturer and asking them what they think might be going on?
Hi Felix,
Upon a bit more digging, this phenomenon is specific to cell-free DNA (cfDNA): our gDNA samples do not exhibit this pattern. All of our cfDNA and gDNA samples were treated with the Zymo EZ Methylation lightning kit. I have also been looking at cfDNA WGBS samples from other publications that used the Qiagen Epitect kit, and they exhibit the same phenomenon. Ours is labeled "in-house", whereas the other publication is labeled "external":
And here is our gDNA:
This observation seems to be specific to cfDNA regardless of kit used. However, I don't see any reason why R1 and R2 should have noticeably different methylation patterns, especially if there is 50/50 strand balance. Would be interested to hear what you think.
I just talked to Simon about this and he had an interesting idea (I am not sure how the kits work in detail but I'll try to explain it anyway). Basically, since R1 doesn't show this phenomenon (or only a slight drop towards the 3' end), could there be a directional degradation from the 3' end in the cell-free extract only? Is there some step in the cfDNA method that reconstitutes partially degraded material at some stage (the red dotted line below)? If this fill-in were performed with unmethylated C, it would explain why you are seeing this selectively towards the 3' ends of R1, or the start of R2. Or maybe something along those lines...
In practical terms this might then look like the standard R2 fill-in bias at the start, just spread out over a longer stretch. I guess the options might then be clipping R2 by 25 bp (or ignoring these positions), or ignoring R2 altogether, as it will almost certainly add a lot of noise to your R1 data. I am not so sure about the 3' end methylation dip in R2, but looking at the total number of calls at these positions, they won't make much of a difference (the number of calls is much lower because of overlap and adapter removal).
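As a rough illustration of how one might pick the number of R2 positions to ignore from an M-bias profile, here is a minimal Python sketch. It is not part of Bismark: the function name `suggest_5prime_clip`, the `plateau_start` and `tolerance` parameters, and the example profile are all hypothetical, and the profile values are made up rather than taken from the plots in this thread. The idea is simply to treat the median methylation beyond some position as the plateau and to report the last early position that still deviates from it.

```python
from statistics import median


def suggest_5prime_clip(pct_meth, plateau_start=30, tolerance=2.0):
    """Suggest how many 5' positions of a read to ignore.

    pct_meth      : per-position % methylation (index 0 = read position 1),
                    e.g. copied by hand from an M-bias report.
    plateau_start : assumed index from which the profile is taken to be flat
                    (hypothetical default; adjust to your read length).
    tolerance     : allowed deviation from the plateau, in percentage points.
    """
    if len(pct_meth) <= plateau_start:
        raise ValueError("profile is too short to estimate a plateau")

    # The plateau level is estimated from the assumed-flat part of the read.
    plateau = median(pct_meth[plateau_start:])

    clip = 0
    for i, value in enumerate(pct_meth[:plateau_start]):
        # Remember the last early position that is still off the plateau.
        if abs(value - plateau) > tolerance:
            clip = i + 1
    return clip


# Synthetic example (made-up numbers): methylation recovers gradually
# over the first ~25 positions and is flat afterwards.
r2_profile = [40 + 1.6 * i for i in range(25)] + [80.0] * 75
print(suggest_5prime_clip(r2_profile))
```

The resulting number could then be passed to the `--ignore_r2` option of `bismark_methylation_extractor`, after checking it against the M-bias plot by eye; the threshold choice here is a judgment call, not a recommendation.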
Hi Felix,
Another question for you -- I'm looking at M-bias for my samples (8-10x WGBS) and I'm noticing that there is definitely a dropoff of methylation at the start of the read, but it goes back up continuously rather than a sharp increase, over 20-25bp. This looks somewhat different than your post illustrating M-bias. Have you seen anything like this before and do you have any recommendation on read clipping? I'm looking at read 2 here, but also included the read 1 plot.
Thanks, Chris