Open neolithlee opened 2 weeks ago
Such spikes in M-bias plots (or sometimes also in GC content plots etc.) are typically caused by individual sequences that are highly overrepresented and have a certain methylation state. You could try to identify the particular sequence via various means, the easiest probably being to look for isolated loci with a very high number of mapping reads. You could also try to see how many calls there are at this position (not sure you can do this in the MultiQC report, but you could look at the equivalent Bismark_report.html). In all likelihood such minor blips won't affect your downstream analysis overall, and are probably just some very localised effects (just my gut feeling at this point).
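To look for isolated loci with a very high number of mapping reads, something along these lines might be a starting point. It is only a sketch: `top_loci`, the bin size and the toy data are made up for illustration, and it assumes you have already pulled (chromosome, start position) pairs out of your alignments, e.g. from the RNAME and POS columns of the SAM records.

```python
from collections import Counter

def top_loci(positions, bin_size=1000, top_n=5):
    """Count read starts per fixed-size genomic bin and return the most crowded bins.

    positions: iterable of (chromosome, start_position) pairs (hypothetical input).
    """
    counts = Counter()
    for chrom, pos in positions:
        # Integer division assigns each read start to the bin that contains it.
        bin_start = (pos // bin_size) * bin_size
        counts[(chrom, bin_start)] += 1
    return counts.most_common(top_n)

# Toy data: one locus is heavily overrepresented.
reads = [("chr1", 10_050)] * 500 + [("chr1", 250_000), ("chr2", 4_000), ("chr2", 4_900)]

for (chrom, start), n in top_loci(reads, top_n=3):
    print(chrom, start, n)
```

A bin whose count is far above all the others would be a candidate for the overrepresented sequence; the bin size is arbitrary and would need adjusting to your read length and coverage.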
Thanks for your reply.
As you said, some of the spikes seem to be related to the number of reads. In the case of the largest spike, the average Qscore appears to be lower than in other areas, so I will check whether there is an experimental problem.
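For anyone who wants to line up the spikes with the number of calls per position, a rough sketch is below. It assumes the per-position values (position, % methylation, number of calls) have been copied by hand from the M-bias table of the Bismark report into a list; `flag_spikes`, the window size and the 5-point threshold are illustrative choices, not anything Bismark provides.

```python
from statistics import median

def flag_spikes(rows, window=5, min_delta=5.0):
    """Flag positions whose methylation deviates from the local median.

    rows: list of (position, percent_methylation, call_count) tuples (hypothetical input).
    Returns the rows that differ from the median of their neighbours by more than min_delta.
    """
    flagged = []
    for i, (pos, pct, calls) in enumerate(rows):
        lo, hi = max(0, i - window), min(len(rows), i + window + 1)
        # Methylation of the surrounding positions, excluding the position itself.
        neighbours = [r[1] for j, r in enumerate(rows[lo:hi], start=lo) if j != i]
        if neighbours and abs(pct - median(neighbours)) > min_delta:
            flagged.append((pos, pct, calls))
    return flagged

# Toy data: flat 75% methylation with a single spike at position 12.
rows = [(p, 75.0, 1000) for p in range(1, 21)]
rows[11] = (12, 92.0, 1400)

print(flag_spikes(rows))
```

Comparing the call count of a flagged position with that of its neighbours gives a quick hint as to whether the spike coincides with a change in the number of calls.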
These things sometimes present themselves as problematic in more than one of the FastQC modules. There could, for example, have been a technical issue with the flowcell (which you might see in the per-tile plot), such as an air bubble, or a higher call rate of N at the position, or a very high number of a specific call (e.g. G when the signal from the dyes wasn't high enough), or indeed there is a very high prevalence of a certain base because of an overrepresentation of a certain (repetitive?) sequence that will in turn down-adjust quality scores and the like. But given that it manifests itself in the M-bias plot, it has to come from a sequence that is mappable, which already narrows it down substantially. Happy sleuthing!
The data I used was processed with FastQC and Trim Galore, and then with bismark, deduplicate_bismark and bismark_methylation_extractor as specified in the manual.
As can be seen in the M-bias plot (from MultiQC), the methylation level of read 1 appears to be stable. However, the methylation of read 2 shows some peaks. May I ask why this variation appears in the Read 2 plot?