zwdzwd / sesame

🍪 SEnsible Step-wise Analysis of DNA MEthylation BeadChips

Confusion regarding the recommended masks for EPICv2 #108

Open JiaweiDai-create opened 1 year ago

JiaweiDai-create commented 1 year ago

Hi,

Thank you for developing SeSAMe. SeSAMe utilizes the recommended masks as follows:

  1. M_1baseSwitchSNPcommon_5pt: Mapped Infinium-I probes with SNP (Minor allele frequency (MAF)>=5%) hitting the extension base and changing the color channel.
  2. M_2extBase_SNPcommon_5pt: Mapped Infinium-II probes with SNP (MAF>=5%) hitting the extension base.
  3. M_SNPcommon_5pt: Mapped probes having at least one common SNP (MAF>=5%) within 5 bp of the 3'-extension.
  4. M_mapping: Unmapped probes, probes with too low a mapping quality (alignment score under 35 for either probe of an Infinium-I pair), or Infinium-I probes whose alleles A and B map to different locations.
  5. M_nonuniq: Mapped probes with a mapping quality below 10 (for either probe of an Infinium-I pair).
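
To make the five criteria concrete, here is a minimal Python sketch of how I read them. It is not SeSAMe code: the `Probe` record, its field names, and the `masks_for` helper are all hypothetical, and it assumes that "alignment score" and "mapping quality" are stored as two separate values per probe, which the descriptions suggest but do not state outright.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

COMMON_MAF = 0.05         # "MAF>=5%" in the mask descriptions
MIN_ALIGNMENT_SCORE = 35  # M_mapping threshold
MIN_MAPQ = 10             # M_nonuniq threshold


@dataclass
class Probe:
    """Hypothetical per-probe record; field names are illustrative only."""
    design: str                              # "I" or "II"
    mapped: bool
    alignment_scores: Tuple[int, ...] = ()   # one per probe (A, B) for Infinium-I
    mapqs: Tuple[int, ...] = ()              # assumed separate from alignment score
    alleles_same_locus: bool = True          # Infinium-I: A and B map together
    ext_base_snp_maf: Optional[float] = None # SNP hitting the extension base
    ext_snp_switches_color: bool = False     # that SNP changes the color channel
    near_3p_snp_maf: Optional[float] = None  # SNP within 5 bp of the 3'-extension


def is_common(maf: Optional[float]) -> bool:
    return maf is not None and maf >= COMMON_MAF


def masks_for(p: Probe) -> Set[str]:
    """Return the names of the listed masks that this probe would fall into."""
    hits: Set[str] = set()
    if (not p.mapped
            or any(s < MIN_ALIGNMENT_SCORE for s in p.alignment_scores)
            or (p.design == "I" and not p.alleles_same_locus)):
        hits.add("M_mapping")
    if not p.mapped:
        return hits  # the remaining masks are defined for mapped probes only
    if any(q < MIN_MAPQ for q in p.mapqs):
        hits.add("M_nonuniq")
    if is_common(p.ext_base_snp_maf):
        if p.design == "I" and p.ext_snp_switches_color:
            hits.add("M_1baseSwitchSNPcommon_5pt")
        elif p.design == "II":
            hits.add("M_2extBase_SNPcommon_5pt")
    if is_common(p.near_3p_snp_maf):
        hits.add("M_SNPcommon_5pt")
    return hits


# Made-up probes, purely for illustration.
high_score_low_mapq = Probe("II", True, (60,), (3,))
low_score_high_mapq = Probe("I", True, (30, 60), (40, 40))
unmapped = Probe("II", False)

for probe in (high_score_low_mapq, low_score_high_mapq, unmapped):
    print(sorted(masks_for(probe)))
```

Under that assumption, a probe with a good alignment score but a low mapping quality lands in M_nonuniq only, so the two masks would not be nested. Whether the real annotation behaves this way is what my second question below is about.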

I have two questions:

  1. Why are deletions and hyperpolymorphic regions, as identified in the literature (SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions), not included in the masks?
  2. What is the difference between M_mapping and M_nonuniq? Aren't probes with a mapping quality smaller than 10 a subset of probes with too low mapping quality (alignment score under 35)?

Thank you!