Open qingjian1991 opened 2 years ago
For a given imputable marker gene, the procedure for determining which samples express it and which do not is explained clearly in the Genome Biology paper. This is basic 2-state mixture modelling. For the two databases, SCM2 and RMAP, these analyses are precomputed for each imputable gene, as there is no point in recalculating them every time you want to use EpiSCORE; i.e., these analyses are completely separate from the actual imputation of the DNAm reference matrix.
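For readers following the thread, here is a minimal sketch of what a 2-state mixture fit can look like, written in Python. It is not the EpiSCORE code and is not taken from the paper: it assumes gamma components (as mentioned in the issue text), strictly positive expression values, and an EM-style loop whose M-step uses weighted moment matching rather than exact maximum likelihood. The names `fit_two_gamma_mixture` and `posterior_expressed` are hypothetical.

```python
import math
import random


def gamma_logpdf(x, shape, scale):
    """Log-density of a gamma distribution (shape/scale parameterisation), x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def posterior_expressed(v, params):
    """Posterior probability that value v belongs to the high ("expressed") state."""
    (w0, a0, s0), (w1, a1, s1) = params
    l0 = math.log(w0) + gamma_logpdf(v, a0, s0)
    l1 = math.log(w1) + gamma_logpdf(v, a1, s1)
    m = max(l0, l1)  # subtract the max before exponentiating, for stability
    e0, e1 = math.exp(l0 - m), math.exp(l1 - m)
    return e1 / (e0 + e1)


def fit_two_gamma_mixture(x, n_iter=100):
    """Fit a 2-component gamma mixture; returns (params, posteriors).

    params is [(weight, shape, scale) for the low state, same for the high state].
    Assumption: the M-step matches weighted moments (shape = mean^2 / var,
    scale = var / mean), which approximates the maximum-likelihood update.
    """
    n = len(x)
    cut = sorted(x)[n // 2]
    # Initialise with a hard split at the median: values above it start as "high".
    post = [1.0 if v > cut else 0.0 for v in x]
    params = None
    for _ in range(n_iter):
        params = []
        for k in (0, 1):
            r = [p if k == 1 else 1.0 - p for p in post]
            total = sum(r)
            mean = sum(ri * xi for ri, xi in zip(r, x)) / total
            var = sum(ri * (xi - mean) ** 2 for ri, xi in zip(r, x)) / total
            var = max(var, 1e-8)  # guard against a degenerate component
            params.append((total / n, mean * mean / var, var / mean))
        # E-step: recompute posteriors under the updated parameters.
        post = [posterior_expressed(v, params) for v in x]
    return params, post


# Toy data only (simulated, not from SCM2 or RMAP): 300 low and 200 high samples.
rng = random.Random(0)
toy = ([rng.gammavariate(2.0, 0.5) for _ in range(300)]
       + [rng.gammavariate(20.0, 0.5) for _ in range(200)])
toy_params, toy_post = fit_two_gamma_mixture(toy)
print("estimated (weight, shape, scale) per state:", toy_params)
print("samples called expressed at posterior > 0.5:", sum(p > 0.5 for p in toy_post))
```

The median split is only a convenient starting point; for real data with many zeros, an offset or a different initialisation would be needed, and that choice is not specified in this thread.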
Thanks. I just want to learn how to do the Bayesian inference for the two-component distribution. I have learned how to do that from this site: https://www.cs.toronto.edu/~rgrosse/csc321/mixture_models.pdf. By the way, my code gives the right solution in R.
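For reference, the posterior in a two-component mixture follows directly from Bayes' rule. The notation here is an assumption for illustration, not taken from the paper: $\pi_k$ is the mixing weight of state $k$, and $f$ is a gamma density with shape $a_k$ and scale $s_k$. For a sample with expression value $x_i$, the probability of the expressed state ($z_i = 1$) is

$$
P(z_i = 1 \mid x_i) = \frac{\pi_1\, f(x_i; a_1, s_1)}{\pi_0\, f(x_i; a_0, s_0) + \pi_1\, f(x_i; a_1, s_1)},
\qquad
f(x; a, s) = \frac{x^{a-1} e^{-x/s}}{\Gamma(a)\, s^{a}} .
$$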
Hi @aet21 :
In the third step of your GB paper, "Development and application of probabilistic imputation model", the DNAm levels of unexpressed genes are estimated by a 2-state mixture of gamma distributions of all expressed genes. Users may want to learn how to build this model from the expression matrices of SCM2 and RMAP. I found that the DNAm levels, the gene expression levels, and the final posterior probabilities are stored in `data("dbSCM2")` and `data("dbRMAP")`. The final posterior probabilities are inferred from the gene expression matrix based on the 2-state mixture of gamma distributions. Could you add some example code showing how to infer the posterior probabilities? Open code would be a valuable learning resource for beginners in bioinformatics. Below is some code I have tried.