Open jakalssj3 opened 2 years ago
Hi,
The cpg.pval vector should be raw p-values. The function uses raw p-values to adjust for length bias.
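As a minimal sketch of preparing that input, assuming your DMP results sit in a data frame with one row per CpG: the data frame `dmp` and its column names below are hypothetical, so adapt them to whatever your DMP-calling step returns. The point is only that the raw p-value column, not the adjusted one, goes into the vector, with CpG IDs as names.

```r
# Hypothetical DMP results table; IDs, values and column names are made up.
dmp <- data.frame(
  cpg       = c("cg00000001", "cg00000002", "cg00000003"),
  P.Value   = c(0.0004, 0.03, 0.6),    # raw p-values
  adj.P.Val = c(0.0012, 0.045, 0.6)    # adjusted p-values (not used here)
)

# Named numeric vector: values are raw p-values, names are CpG IDs.
cpg.pval <- setNames(dmp$P.Value, as.character(dmp$cpg))
```

The resulting `cpg.pval` can then be passed as the `cpg.pval` argument.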
To obtain the SYMBOLs of the genes annotated to a particular GO or KEGG ID, we can use the org.Hs.eg.db package (https://bioconductor.org/packages/release/data/annotation/html/org.Hs.eg.db.html).
For example, to obtain the SYMBOLs in "GO:0035267", we can use the following code:
library(org.Hs.eg.db)
select(org.Hs.eg.db, "GO:0035267", "SYMBOL", keytype = "GOALL")
To obtain the SYMBOLs in KEGG ID 04080, we can use the following code:
select(org.Hs.eg.db, "04080", "SYMBOL", keytype = "PATH")
Thanks! I have one more question. What about CpGs that overlap more than one gene (e.g. as in the Illumina manifest for the EPIC array)? How are these treated, e.g. when preparing the annotation with the prepareAnnot function?
We are using IlluminaHumanMethylationEPICanno.ilm10b4.hg19 (https://bioconductor.org/packages/release/data/annotation/html/IlluminaHumanMethylationEPICanno.ilm10b4.hg19.html) to obtain the map between CpGs and genes. When a CpG is mapped to more than one gene, the first gene listed in IlluminaHumanMethylationEPICanno.ilm10b4.hg19 is selected.
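To illustrate what "first gene" means, here is a rough sketch in base R. It assumes the gene annotation for each CpG is a semicolon-separated string of gene names (as in the UCSC_RefGene_Name column of the Illumina annotation); the CpG IDs and gene names are made up, and this is not necessarily how prepareAnnot implements the step internally.

```r
# Toy stand-in for the annotation: CpG IDs mapped to semicolon-separated
# gene names. IDs and gene names are invented for illustration.
ucsc_refgene_name <- c(
  cg00000001 = "GENE_A;GENE_A;GENE_B",
  cg00000002 = "GENE_C",
  cg00000003 = ""
)

# Keep only the first gene listed for each CpG
# (everything from the first semicolon onwards is dropped).
first_gene <- sub(";.*$", "", ucsc_refgene_name)

# Drop CpGs that have no gene annotation at all.
first_gene <- first_gene[first_gene != ""]
```

Under this rule a CpG annotated to both GENE_A and GENE_B contributes only to GENE_A.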
Another question on CpG p-values: what kind of input should I use when I want to narrow down my analysis to only the hypomethylated probes (and their corresponding genes)?
When running the analysis, e.g. with the methylglm function, should the cpg.pval vector contain the raw p-values from calling significant DMPs, or the adjusted p-values?
BTW, can the results table contain the SYMBOLs of the genes that represent the given GO or KEGG ID?