AESPCA Non-parametric p-Values Calculation Method

gabrielodom commented 5 years ago

From @lxw391 in Issue #50: "Please update function using anova(true_mod, test = "LRT"). Also, please change AIC to anova function instead to perform LRT test."

Currently, this is the process for calculating non-parametric p-values:

1. Estimate the true model (CoxPH, GLM, or LM, depending on the response).
2. Calculate the AIC of the true model.
3. For the specified number of replicates:
   i. sample from the response vector / matrix (see SampleSurv(), SampleCateg(), and SampleReg());
   ii. estimate the model based on the false response;
   iii. calculate the AIC of the false model.
4. Return the proportion of replicates in which the false AIC was smaller than the true AIC, that is, in which the false model fit better than the true one. This is the non-parametric p-value.
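
For reference, here is a minimal sketch of the procedure for a continuous response. It is not the package code: it assumes simulated data, uses plain sample() as a stand-in for SampleReg(), and the names pc1, response, and n_reps are made up for illustration. It counts the permuted fits whose AIC falls below the true AIC, so a small value means the true model is hard to beat by chance.

```r
# Minimal sketch of the permutation p-value for a continuous response.
# Assumption: plain sample() stands in for SampleReg(); data are simulated.
set.seed(57)

n <- 100
pc1 <- rnorm(n)                    # hypothetical pathway principal component
response <- 0.5 * pc1 + rnorm(n)   # simulated response with a real signal

# Fit the true model and record its AIC
true_mod <- lm(response ~ pc1)
true_AIC <- AIC(true_mod)

# Refit on permuted ("false") responses and record each AIC
n_reps <- 1000
perm_AIC <- replicate(n_reps, {
  perm_response <- sample(response)
  AIC(lm(perm_response ~ pc1))
})

# Proportion of permuted fits that beat the true fit (smaller AIC)
p_value <- mean(perm_AIC < true_AIC)
p_value
```

The same skeleton should carry over to glm() and coxph() by swapping the model call and the sampling function, since AIC() is defined for all three.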

This was the method we agreed to in the Spring. Our reasoning was that the AIC statistic has a defined function (AIC()) for all three models, so we chose it for code simplicity.

Based on Lily's request, how should we proceed? Is there a similar function to extract the LRT from all three models? That's what I'm looking into now.

gabrielodom commented 5 years ago

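The AIC is defined as 2p - 2 * log-likelihood, where p is the number of model parameters. For these scenarios, p is fixed, so the AIC is a decreasing linear function of the log-likelihood: ranking models by AIC or by log-likelihood gives the same ordering, just reversed. I still don't understand why we need to change the code.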
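
Written out, with $\ell$ denoting the log-likelihood and the subscripts "true" and "perm" (my notation, not the package's) marking the true and permuted models, the argument is roughly:

$$
\mathrm{AIC} = 2p - 2\ell
\quad\Longrightarrow\quad
\mathrm{AIC}_{\text{perm}} - \mathrm{AIC}_{\text{true}} = -2\left(\ell_{\text{perm}} - \ell_{\text{true}}\right),
$$

$$
\mathrm{AIC}_{\text{perm}} < \mathrm{AIC}_{\text{true}}
\iff
\ell_{\text{perm}} > \ell_{\text{true}}.
$$

The $2p$ term cancels because both models have the same number of parameters, so counting comparisons on either scale should give the same p-value.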

Nevertheless, here are my notes on calculating the LRT. For a model output from the coxph() function, the code to return the log-likelihood is logLik(mod)[1]. This returns the log-likelihood value for glm() and lm() as well. We could therefore replace all calls to AIC() with calls to function(x) logLik(x)[1], provided we also flip the direction of the comparison, since a larger log-likelihood (rather than a smaller AIC) indicates the better fit.
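
A quick sketch to check this on lm() and glm() fits, assuming simulated data (the helper names get_loglik and check_aic are made up here; coxph() is left out only to keep the example in base R). It confirms that AIC() can be rebuilt from logLik(), and that the permutation proportion is the same on both scales once the inequality is reversed.

```r
# Sketch: logLik(x)[1] as a replacement for AIC(), on simulated data.
set.seed(57)
n <- 100
x <- rnorm(n)
y_cont <- 0.5 * x + rnorm(n)
y_bin <- rbinom(n, size = 1, prob = plogis(0.5 * x))

lm_mod <- lm(y_cont ~ x)
glm_mod <- glm(y_bin ~ x, family = binomial)

# Hypothetical helper: returns the log-likelihood as a plain numeric value
get_loglik <- function(mod) logLik(mod)[1]

# AIC = 2 * df - 2 * logLik, where df is stored as an attribute of logLik()
check_aic <- function(mod) {
  ll <- logLik(mod)
  all.equal(AIC(mod), 2 * attr(ll, "df") - 2 * ll[1])
}
check_aic(lm_mod)
check_aic(glm_mod)

# Same permuted fits scored both ways; note the reversed inequality
perm_mods <- replicate(200, lm(sample(y_cont) ~ x), simplify = FALSE)
perm_AIC <- vapply(perm_mods, AIC, numeric(1))
perm_ll <- vapply(perm_mods, get_loglik, numeric(1))

p_aic <- mean(perm_AIC < AIC(lm_mod))
p_ll <- mean(perm_ll > get_loglik(lm_mod))
c(p_aic = p_aic, p_ll = p_ll)
```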

lxw391 commented 5 years ago

I think changing to logLik(mod) would be good, to avoid potential confusion later on. We can discuss more at the next meeting.

gabrielodom commented 5 years ago

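Per our discussion, we will leave the AIC calculation in place, but add a code comment explaining that, because the number of parameters is constant, the AIC is a decreasing linear function of the log-likelihood, so both statistics rank the models identically.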

gabrielodom / pathwayPCA

AESPCA Non-parametric p-Values Calculation Method #57