reese3928 / methylGSA

A Bioconductor package and shiny app for DNA methylation data length bias adjustment in gene set testing
https://bioconductor.org/packages/release/bioc/html/methylGSA.html

cpg.pval - which p-values to use? #5

Open jakalssj3 opened 2 years ago

jakalssj3 commented 2 years ago

When running the analyses, e.g. with the methylglm function, should the cpg.pval vector contain the raw p-values from calling significant DMPs, or the adjusted p-values?

BTW, can the results table contain the SYMBOLs of the genes that represent the given GO or KEGG ID?

reese3928 commented 2 years ago

Hi,

The cpg.pval vector should be raw p-values. The function uses raw p-values to adjust for length bias.
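A minimal sketch of preparing that input, assuming the DMP results sit in a data frame with CpG IDs as row names and a raw p-value column named `P.Value` (as in a limma `topTable` output). The data frame `dmp` and its values are hypothetical, and the `methylglm` call at the end is left commented out so the snippet runs without methylGSA installed.

```r
# Hypothetical DMP results: row names are CpG IDs, P.Value holds raw p-values,
# adj.P.Val holds the multiple-testing-adjusted p-values (not used as input).
dmp <- data.frame(
  P.Value   = c(0.0004, 0.03, 0.20, 0.75),
  adj.P.Val = c(0.0016, 0.06, 0.27, 0.75),
  row.names = c("cg00000029", "cg00000108", "cg00000109", "cg00000165")
)

# cpg.pval is a named numeric vector: names are CpG IDs, values are raw p-values.
cpg.pval <- setNames(dmp$P.Value, rownames(dmp))

# Basic sanity checks before passing the vector on.
stopifnot(is.numeric(cpg.pval), !is.null(names(cpg.pval)),
          all(cpg.pval >= 0 & cpg.pval <= 1))

# With methylGSA installed, the vector would then be used along these lines
# (array.type and GS.type values chosen for illustration):
# library(methylGSA)
# res <- methylglm(cpg.pval = cpg.pval, array.type = "EPIC", GS.type = "GO")
```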

To obtain the SYMBOLs of the genes that belong to a particular GO or KEGG ID, we can use the org.Hs.eg.db package (https://bioconductor.org/packages/release/data/annotation/html/org.Hs.eg.db.html). For example, to obtain the SYMBOLs in "GO:0035267", we can use the following code.

library(org.Hs.eg.db)
select(org.Hs.eg.db, "GO:0035267", "SYMBOL", keytype = "GOALL")

If we would like to obtain the SYMBOLs in KEGG ID 04080, we can use the following code.

select(org.Hs.eg.db, "04080", "SYMBOL", keytype = "PATH")
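To get the symbols into the results table itself, one option is to collapse the select() output to one string per ID and merge it in. The sketch below uses a small hand-made stand-in for the select() output, reduced to the two columns of interest, so that it runs without annotation packages. The IDs, gene symbols, the `res` table and its columns are placeholders for illustration; the real methylGSA output may have different columns.

```r
# Stand-in for the output of select(org.Hs.eg.db, ids, "SYMBOL", keytype = "GOALL"):
# one row per (ID, gene) pair, possibly with duplicates. Placeholder values only.
id2symbol <- data.frame(
  GOALL  = c("GO:0000001", "GO:0000001", "GO:0000002", "GO:0000001"),
  SYMBOL = c("GENE_A", "GENE_B", "GENE_C", "GENE_A"),
  stringsAsFactors = FALSE
)

# Hypothetical results table with an ID column, as a gene set test might return.
res <- data.frame(
  ID     = c("GO:0000001", "GO:0000002", "GO:0000003"),
  pvalue = c(0.001, 0.04, 0.6),
  stringsAsFactors = FALSE
)

# Drop duplicate pairs, then collapse the symbols of each ID into one string.
pairs <- unique(id2symbol[, c("GOALL", "SYMBOL")])
symbols <- aggregate(SYMBOL ~ GOALL, data = pairs,
                     FUN = function(s) paste(sort(s), collapse = ";"))

# Left join so gene sets without any mapped symbol are kept (SYMBOL becomes NA).
res_annot <- merge(res, symbols, by.x = "ID", by.y = "GOALL", all.x = TRUE)
```

For large gene sets the collapsed string can get long, so it may be more practical to keep the ID-to-symbol table separate and look genes up only for the sets of interest.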

jakalssj3 commented 2 years ago

Thanks! I have one more question. What about CpGs that overlap more than one gene (e.g. as in the Illumina manifest for the EPIC array)? How are these treated, e.g. when preparing the annotation with the prepareAnnot function?

reese3928 commented 2 years ago

We are using IlluminaHumanMethylationEPICanno.ilm10b4.hg19 https://bioconductor.org/packages/release/data/annotation/html/IlluminaHumanMethylationEPICanno.ilm10b4.hg19.html to obtain the mapping between CpGs and genes. When a CpG is mapped to more than one gene, the first gene listed in IlluminaHumanMethylationEPICanno.ilm10b4.hg19 is selected.
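The "first gene" rule can be illustrated roughly as follows. In the Illumina annotation, a CpG that overlaps several genes typically lists them in one semicolon-separated string (the UCSC_RefGene_Name column). The sketch assumes that format and uses made-up probe IDs and gene names; it illustrates the rule, and is not the code methylGSA actually runs.

```r
# Hypothetical excerpt of a CpG-to-gene annotation; the gene column mimics the
# semicolon-separated format of the Illumina manifest.
annot <- data.frame(
  CpG  = c("cg_example_1", "cg_example_2", "cg_example_3"),
  gene = c("GENE_A;GENE_A;GENE_B", "GENE_C", ""),
  stringsAsFactors = FALSE
)

# Keep only the first listed gene: remove everything from the first ";" onward.
annot$first_gene <- sub(";.*$", "", annot$gene)

# Assumption for this sketch: CpGs with no annotated gene are dropped,
# since they cannot be assigned to any gene set.
cpg2gene <- annot[annot$first_gene != "", c("CpG", "first_gene")]
```

A consequence of this rule is that each CpG contributes to exactly one gene, so genes listed second or later for a probe do not receive that probe's p-value.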

jakalssj3 commented 2 years ago

Another question on CpG p-values: what kind of input should I use when I want to narrow down my analyses to only the hypomethylated probes (and their corresponding genes)?