NikNakk / forestmodel

41 stars 13 forks source link

Overall P-Value for Variables with More than Two Levels #31

Closed tristanhayes closed 3 years ago

tristanhayes commented 3 years ago

Authors,

First off, thank you for this amazing package. I wanted to ask if it would be possible for variables with more than two levels to show the overall wald p-value. Before building multivariate models, we normally show in the report a list of univariate associations. Your package works great in cases where the variable is continuous, or only has two levels. However, if we take something like race, let's imagine White, Black, Other- this package shows the association for each individual level, but does not show the overall p-value for this association. Possibly this P-value could be listed next to "reference"?

Apologies if this is a distraction and not worth your time. I found your package extremely helpful.

roman2023 commented 3 years ago

I am wondering if you @tristanhayes found a solution for this. I am having the same difficulty. The package is no doubt amazing @NikNakk. I have certain variables with more than 10 factors where I need to show the Wald test/likelihod ratio test. I looked though the documentation but could not find a work around.

ShixiangWang commented 3 years ago

When you are running a multiple variable regression, the model cannot report overall p value for a specific variable. It only works for the whole model.

> library("survival")
> library("dplyr")
> pretty_lung <- lung %>%
+   transmute(time,
+             status,
+             Age = age,
+             Sex = factor(sex, labels = c("Male", "Female")),
+             ECOG = factor(lung$ph.ecog),
+             `Meal Cal` = meal.cal
+   )
> table(lung$ph.ecog)

  0   1   2   3 
 63 113  50   1 
> coxph(Surv(time, status) ~ ., pretty_lung)
Call:
coxph(formula = Surv(time, status) ~ ., data = pretty_lung)

                 coef  exp(coef)   se(coef)      z       p
Age         7.848e-03  1.008e+00  1.080e-02  0.727 0.46727
SexFemale  -5.050e-01  6.035e-01  1.913e-01 -2.640 0.00829
ECOG1       2.756e-01  1.317e+00  2.221e-01  1.241 0.21462
ECOG2       7.852e-01  2.193e+00  2.543e-01  3.087 0.00202
ECOG3       1.792e+00  6.004e+00  1.036e+00  1.731 0.08348
`Meal Cal` -6.031e-05  9.999e-01  2.300e-04 -0.262 0.79313

Likelihood ratio test=21.56  on 6 df, p=0.001453
n= 180, number of events= 133 
   (48 observations deleted due to missingness)
roman2023 commented 3 years ago

Hi @ShixiangWang

We are referring to a list of multiple "univariate" models. See my code below:

example using transplant data from survival package

make new event-variable: death or no death

to have dichot outcome

transplant$death <- transplant$event=="death"

making formulas

univ_formulas <- sapply(c("age","sex","abo"),function(x)as.formula(paste('Surv(futime,death)~',x)) )

making a list of models

univ_models <- lapply(univ_formulas, function(x){coxph(x,data=transplant)})

pass the model list to forestmodel

univ_results <- forest_model (model_list=univ_models, merge_models=T)

What we get here is a forest plot showing p-values for multiple "univariate analysis" for a list of covariates ("age", "sex", "abo").

What we would want is another column that displays "global p-value from either Wald/Likelihood ratio/ score test" for variables with more than two levels. Please see below the output from the above code with annotations on what we would like to see.

Rplot

The global p-value that I am talking about comes from the summary(cox model) Example below: univ_abo<- Surv(time=transplant$futime, transplant$event) surv_abo <- survfit(univ_abo ~ transplant$abo)
summary(surv_abo) surv_ab <- (coxph(Surv(time=transplant$futime, event=transplant$death)~transplant$abo, data=transplant)) summary(surv_ab)

summary coxph

Thanks.

tristanhayes commented 3 years ago

Exactly!

On Sat, Aug 7, 2021, 10:43 roman2023 @.***> wrote:

Hi @ShixiangWang https://github.com/ShixiangWang

We are referring to a list of multiple "univariate" models. See my code below:

example using transplant data from survival package

make new event-variable: death or no death

to have dichot outcome

transplant$death <- transplant$event=="death"

making formulas

univ_formulas <- sapply(c("age","sex","abo"),function(x)as.formula(paste('Surv(futime,death)~',x)) )

making a list of models

univ_models <- lapply(univ_formulas, function(x){coxph(x,data=transplant)})

pass the model list to forestmodel

univ_results <- forest_model (model_list=univ_models, merge_models=T)

What we get here is a forest plot showing p-values for multiple "univariate analysis" for a list of covariates ("age", "sex", "abo").

What we would want is another column that displays "global p-value from either Wald/Likelihood ratio/ score test" for variables with more than two levels. Please see below the output from the above code with annotations on what we would like to see.

[image: Rplot] https://user-images.githubusercontent.com/62126316/128605158-941d1620-377a-445f-ba9f-4141cbe9db90.png

The global p-value that I am talking about comes from the summary(cox model) Example below: univ_abo<- Surv(time=transplant$futime, transplant$event) surv_abo <- survfit(univ_abo ~ transplant$abo) summary(surv_abo) surv_ab <- (coxph(Surv(time=transplant$futime, event=transplant$death)~transplant$abo, data=transplant)) summary(surv_ab)

[image: summary coxph] https://user-images.githubusercontent.com/62126316/128605574-d06c451b-58b8-471d-b6a5-ef37a62c84c4.png

Thanks.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/NikNakk/forestmodel/issues/31#issuecomment-894670449, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABAWS23R2Q6FN7CZB3AX3L3T3VIAZANCNFSM43AYRCPA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email .

ShixiangWang commented 3 years ago

Hi guys, the global p value featuer has been implemented for Cox model. You can install the latest version from GitHub with remotes package, then try the following examples.

# install
remotes::install_github("ShixiangWang/forestmodel")

# example
library("survival")
library("dplyr")
library("forestmodel")

pretty_lung <- lung %>%
  transmute(time,
            status,
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(lung$ph.ecog),
            `Meal Cal` = meal.cal
  )

print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "bottom"))
print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "aside"))

print(forest_model(coxph(Surv(time, status) ~ ECOG, pretty_lung), show_global_p = "aside"))
roman2023 commented 3 years ago

Thanks a ton!

On Sun, Aug 8, 2021 at 2:53 AM Shixiang Wang @.***> wrote:

Hi guys, the global p value featuer has been implemented for Cox model. You can install the latest version from GitHub with remotes package, then try the following examples.

install

remotes::install_github("ShixiangWang/forestmodel")

example

library("survival") library("dplyr") library("forestmodel")

pretty_lung <- lung %>% transmute(time, status, Age = age, Sex = factor(sex, labels = c("Male", "Female")), ECOG = factor(lung$ph.ecog), Meal Cal = meal.cal )

print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "bottom")) print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "aside"))

print(forest_model(coxph(Surv(time, status) ~ ECOG, pretty_lung), show_global_p = "aside"))

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NikNakk/forestmodel/issues/31#issuecomment-894765227, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOZ7R3EKRSIO3MH5NCG7LUDT3ZAYDANCNFSM43AYRCPA .

--

Dr. Aniket Bankar MBBS, MD, DM (Hematology) Hematologist and Bone Marrow Transplant Physician