FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

mbias plot methylation rate fluctuations #664

Closed ertiaM closed 3 months ago

ertiaM commented 3 months ago

[Images: Bismark M-bias plot, Read 1; Bismark M-bias plot, Read 2]

My data were quality-checked with FastQC and trimmed with Trimmomatic, then processed with bismark, deduplicate_bismark and bismark_methylation_extractor as described in the manual. As the M-bias plots show, the methylation level in Read 1 appears stable, but the methylation level in Read 2 is very volatile. May I ask what could cause these fluctuations in the Read 2 plot?

FelixKrueger commented 3 months ago

The steep drop at the beginning of Read 2 is almost certainly a consequence of a bias introduced by the end-repair of fragments, as discussed in this QCFail article. Ignoring the first 2-4 bp will alleviate this. I don't really know why the levels meander around the 80% mark in Read 2 and only reach the Read 1 levels ~100 bp into the read; there clearly seems to be some technical reason. Maybe it has to do with the sequencing run as such?

ertiaM commented 3 months ago

Thanks for your explanation! Low-quality bases were detected at the end of Read 2, and we trimmed them off with Trimmomatic. Maybe that is why the M-bias plot of Read 2 is unsteady. Do you have any suggestions? And given this situation, can I still use these bismark_methylation_extractor results?

FelixKrueger commented 3 months ago

I don't think this has anything to do with the 3' trimming of low-quality bases, especially since the more variable part is at the 5' end. I personally would re-run the methylation extraction using --ignore_r2 4 or so (mouse over the curve to see when the levels are back to ~70-80%), and then go ahead and use the results.

You will also notice that the total number of calls in Read 2 goes down with increasing read position as a result of the overlap detection and removal, so all in all Read 1 will get some more weight anyway.
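
For readers who would rather not read the recovery point off the plot by eye, the "see when the levels are back" step could be sketched roughly as below. This is only an illustration and not part of Bismark: the function name `suggest_ignore`, its `tolerance` and `tail_fraction` parameters, the example percentages and the file name `sample.bam` are all assumptions made for the sketch. It takes per-position % methylation values for Read 2 (for example, copied from the M-bias report), estimates the plateau from the later part of the read, and returns how many 5' positions come before the curve first reaches that plateau.

```python
from statistics import median


def suggest_ignore(pct_by_position, tolerance=2.0, tail_fraction=0.5):
    """Count the leading positions that come before the curve reaches its plateau.

    pct_by_position: % methylation per read position (index 0 = position 1).
    tolerance: allowed distance from the plateau, in percentage points (assumed).
    tail_fraction: share of the read, taken from the 3' side, used to estimate
        the plateau (assumed heuristic, not a Bismark setting).
    """
    if not pct_by_position:
        raise ValueError("need at least one position")

    # Estimate the plateau from the later part of the read, away from the 5' bias.
    tail_start = int(len(pct_by_position) * (1 - tail_fraction))
    plateau = median(pct_by_position[tail_start:])

    # The first position within tolerance of the plateau marks the end of the bias.
    for index, pct in enumerate(pct_by_position):
        if abs(pct - plateau) <= tolerance:
            return index
    return len(pct_by_position)


# Made-up Read 2 values for illustration only; they are not taken from this issue.
read2_pct = [45.0, 60.0, 72.0, 77.0, 80.0, 79.5, 80.5, 80.0, 79.0, 80.0]
n_ignore = suggest_ignore(read2_pct)

# sample.bam is a placeholder for the deduplicated paired-end Bismark alignment file.
print(f"bismark_methylation_extractor --paired-end --ignore_r2 {n_ignore} sample.bam")
```

A simple threshold like this will not cope well with a curve that keeps meandering, as Read 2 does here, so the suggested value is best treated as a starting point and checked against the plot.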

ertiaM commented 3 months ago

Thanks for your generous advice, I will try it.