TransBioInfoLab / coMethDMR

Detect Regions of Concurrent Differential Methylation
https://transbioinfolab.github.io/coMethDMR/

OrderCpGsByLocation Fails for "bad" probe IDs #4

Open gabrielodom opened 3 years ago

gabrielodom commented 3 years ago

When re-creating the data sets in inst/extdata/, I ran into the following bug. There are 19 probes included in IlluminaHumanMethylationEPICanno.ilm10b2.hg19::Other that are not included in the corresponding sesameData manifest GRanges. When this happens, the OrderCpGsByLocation() function fails with the error "Error: subscript contains invalid names". I am adding a check that removes probe IDs not contained in the manifest and throws an error if none of the supplied probes are contained in that manifest.
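A minimal sketch of the kind of check described above, in base R only. The helper name `FilterProbesToManifest` and the toy `manifest_ids` vector are hypothetical; in the package, the manifest IDs would come from the names of the sesameData manifest GRanges, and the actual fix may differ.

```r
# Hypothetical helper, not the coMethDMR implementation.
# `manifest_ids` stands in for the probe names of the manifest GRanges.
FilterProbesToManifest <- function(probe_ids, manifest_ids) {
  keep <- probe_ids %in% manifest_ids
  # Fail loudly when nothing can be looked up in the manifest
  if (!any(keep)) {
    stop("None of the supplied probe IDs are in the manifest.", call. = FALSE)
  }
  # Otherwise silently drop the IDs that the manifest does not know about
  probe_ids[keep]
}

# Toy manifest for illustration only
manifest_ids <- c("cg00000029", "cg00000108", "cg00000109")
probe_ids <- c("cg00000108", "cg01420942", "cg00000029")

kept <- FilterProbesToManifest(probe_ids, manifest_ids)
print(kept)
```

Filtering before subsetting means the manifest is only ever indexed by names it contains, which is the condition the "invalid names" error complains about.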

gabrielodom commented 3 years ago

For documentation, the bad probes are:

c("cg01420942", "cg04099150", "cg04726360", "cg08777477", "cg10784100", 
"cg12667768", "cg15353533", "cg15719326", "cg18918478", "cg19273712", 
"cg21479731", "cg21701692", "cg21732824", "cg22105103", "cg22802935", 
"cg25007799", "cg25258605", "cg26657539", "cg26993204")

Also, this function is missing a match.arg() call for the output argument. That's a quick fix though.
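For reference, a small sketch of the match.arg() pattern. `OrderSketch` is a hypothetical stand-in, not OrderCpGsByLocation() itself, and the choices "vector" and "dataframe" are assumed here for illustration.

```r
# Hypothetical stand-in for a function with an `output` argument.
OrderSketch <- function(cpgs, output = c("vector", "dataframe")) {
  # Resolves the default to its first choice and rejects unknown values;
  # without this call, `output` would stay a length-two character vector.
  output <- match.arg(output)
  ordered <- sort(cpgs)
  if (output == "vector") {
    ordered
  } else {
    data.frame(cpg = ordered, stringsAsFactors = FALSE)
  }
}

res_vec <- OrderSketch(c("cg2", "cg1"))
res_df <- OrderSketch(c("cg2", "cg1"), output = "dataframe")
```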

gabrielodom commented 3 years ago

@tiagochst and @lxw391, are you ok with these changes? Should I add a warning if any of the probes are removed?

lxw391 commented 3 years ago

@gabrielodom it's ok to remove those CpGs; they are probably the ones masked by the sesame R package. To understand more about the masking process, you can read "How/Why Probes Are Masked?" at https://bioconductor.org/packages/devel/bioc/vignettes/sesame/inst/doc/sesame.html and this paper: https://academic.oup.com/nar/article/45/4/e22/2290930 . Check sesameDataGet('HM450.probeInfo')$mask to see whether the listed probes are included.

gabrielodom commented 3 years ago

Thank you Lily! @fveitz, can you take a look at the masked probes and see if they match the bad probes we found?

gabrielodom commented 2 years ago

@fveitz, is this fixed?

gabrielodom commented 2 years ago

It's not a masking issue. Of the bad probes, only "cg15353533", "cg21732824", and "cg22105103" were in the masked set of probes.
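The comparison can be sketched in base R as below. `masked_ids` is a made-up stand-in for the masked probe set (in practice it would be retrieved from sesameData); it is constructed only so that the sketch reproduces the overlap reported above.

```r
# The 19 probes listed earlier in this thread
bad_probes <- c(
  "cg01420942", "cg04099150", "cg04726360", "cg08777477", "cg10784100",
  "cg12667768", "cg15353533", "cg15719326", "cg18918478", "cg19273712",
  "cg21479731", "cg21701692", "cg21732824", "cg22105103", "cg22802935",
  "cg25007799", "cg25258605", "cg26657539", "cg26993204"
)

# Illustrative stand-in for the masked probe IDs (assumption, not real data)
masked_ids <- c("cg15353533", "cg21732824", "cg22105103", "cg00000029")

overlap <- intersect(bad_probes, masked_ids)   # bad probes that are masked
unexplained <- setdiff(bad_probes, masked_ids) # bad probes masking does not explain

print(overlap)
print(length(unexplained))
```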

tiagochst commented 2 years ago

@gabrielodom is it not an annotation version problem?

tiagochst commented 2 years ago

Maybe changing IlluminaHumanMethylationEPICanno.ilm10b2.hg19 to IlluminaHumanMethylationEPICanno.ilm10b4.hg19 would fix the problem?

gabrielodom commented 2 years ago

Great idea! We have IlluminaHumanMethylationEPICanno.ilm10b2.hg19 in our Suggests field right now. Is there an updated analogue for 450k as well?

tiagochst commented 2 years ago

I think for HM450 there is only one version.