The latent variables learned by PLIER can capture variability related to biological signals or technical noise. We've framed the former as latent variables that are significantly associated with pathways that we supplied during model training. The remaining latent variables may or may not be representative of a coherent biological process.
We didn't supply any of the models with the MSigDB oncogenic pathways (they ship with PLIER as data("oncogenicPathways")), so they've been held out and we can essentially think of them as "novel to the model." We can check whether the LV loadings align with these pathways.
Here I add:
A new custom function, CalculateHoldoutAUC, in util/plier_util.R that is based on PLIER:::crossVal; it is not specific to oncogenic pathways and can be used with any gene sets that are in the correct prior information format.
A notebook examining the "pathway coverage" for the oncogenic pathways in the recount2 model (27-oncogenic_pathway_recount2_model.Rmd).
Results: ~76% of the oncogenic pathways are significantly associated with at least one LV (using the same FDR < 0.05 cutoff as for the training pathways), but I'll need to repeat this with a variety of models (see #39).
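For reviewers who want the idea without opening the notebook, here is a minimal base-R sketch of a held-out AUC and "pathway coverage" calculation on toy data. It is not the code of CalculateHoldoutAUC or PLIER:::crossVal. The helper name auc_for_gene_set, the toy matrices, the one-sided Wilcoxon test, and the choice to test every LV against every pathway are all assumptions made for illustration.

```r
# Hypothetical helper (not part of PLIER or util/plier_util.R): AUC and
# one-sided Wilcoxon p-value for one LV's loadings against one gene set.
auc_for_gene_set <- function(loadings, in_set) {
  pos <- loadings[in_set]
  neg <- loadings[!in_set]
  test <- wilcox.test(pos, neg, alternative = "greater", exact = FALSE)
  # The Mann-Whitney statistic divided by the number of pairs is the AUC
  auc <- unname(test$statistic) / (length(pos) * length(neg))
  c(auc = auc, p_value = test$p.value)
}

# Toy data (assumed): 200 genes, 3 LVs, 2 held-out gene sets
set.seed(1)
n_genes <- 200
genes <- paste0("gene", seq_len(n_genes))

# Loadings matrix, genes x LVs; LV1 gets a boost for the first 20 genes
z <- matrix(rnorm(n_genes * 3), nrow = n_genes,
            dimnames = list(genes, paste0("LV", 1:3)))
z[1:20, "LV1"] <- z[1:20, "LV1"] + 3

# Binary genes x pathways matrix, mimicking the prior information format
gene_sets <- matrix(0, nrow = n_genes, ncol = 2,
                    dimnames = list(genes, c("PATHWAY_A", "PATHWAY_B")))
gene_sets[1:20, "PATHWAY_A"] <- 1
gene_sets[sample(n_genes, 20), "PATHWAY_B"] <- 1

# Evaluate every LV-pathway pair
results <- expand.grid(pathway = colnames(gene_sets), lv = colnames(z),
                       stringsAsFactors = FALSE)
stats <- vapply(seq_len(nrow(results)), function(i) {
  auc_for_gene_set(z[, results$lv[i]], gene_sets[, results$pathway[i]] == 1)
}, FUN.VALUE = c(auc = 0, p_value = 0))
results$auc <- stats["auc", ]
results$p_value <- stats["p_value", ]

# Benjamini-Hochberg correction, then the fraction of pathways with >= 1 hit
results$fdr <- p.adjust(results$p_value, method = "BH")
associated <- unique(results$pathway[results$fdr < 0.05])
coverage <- length(associated) / ncol(gene_sets)
print(results)
print(coverage)
```

In the toy data, PATHWAY_A is constructed to align with LV1 and PATHWAY_B is random, so the sketch only shows the mechanics; the ~76% figure above comes from the notebook, not from this example.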
Related issue: #38
HTML notebook for easy viewing: 27-oncogenic_pathway_recount2_model.nb.zip