Closed talgatomarov closed 4 months ago
Would you have a reproducible example?
Did you check that you are using the latest versions of the ggstats and broom.helpers packages?
I am using ggstats=0.6.0, survival=3.5-7, and broom.helpers=1.15.0.
I've noticed that the issue occurs only when I am using a lot of predictor variables. Below, I generated some dummy data.
library(survival)
library(ggstats)

set.seed(15)

factor_values = c("0", "1", "2-3", "4-5", ">5")

data = as.data.frame(
  list(
    event=sample(c(0,1), replace=TRUE, size=100),
    time=sample(c(5, 10, 12, 50), replace=TRUE, size=100),
    factor_var1=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var2=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var3=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var4=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var5=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var6=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var7=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var8=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var9=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values),
    factor_var10=factor(sample(factor_values, replace=TRUE, size=100), levels=factor_values)
  )
)

model = survival::coxph(
  formula=Surv(time, event) ~ .,
  data=data
)

options(repr.plot.width=12, repr.plot.height=12)

ggcoef_table(
  model,
  exponentiate=TRUE
)
This is my output when I include 10 factor variables.
This is my output when I include 9 factor variables (code not shown).
OK. I better understand your issue. The problem is coming from the fact that several variables share the same levels.
To better understand your initial issue, you could call ggcoef_model() with return_data = TRUE. You will see the dataset used by ggcoef_model() and ggcoef_table() to generate the plot. By default, the variable mapped to the y axis is "label", and "var_label" is used to facet the plot by variable. "label" is transformed into a factor with forcats::fct_inorder(). When some terms are used by one variable but not by another, the order is sometimes not preserved.
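A minimal sketch of the mechanism described above. The first part uses only forcats and a hypothetical set of labels, where the level "1" is assumed to be missing from the first variable, to show how a single factor built by order of first appearance can misplace a level for a later variable. The second part inspects the plotting data with return_data = TRUE on a small made-up model; the object names (labels_var1, plot_data, and so on) are illustrative and not part of ggstats.

```r
library(forcats)

# Hypothetical labels, in row order, for two variables that share levels.
# Assumption for illustration: level "1" does not occur in the first variable.
labels_var1 <- c("0", "2-3", "4-5", ">5")
labels_var2 <- c("0", "1", "2-3", "4-5", ">5")

# One factor is built across all rows, ordered by first appearance.
all_labels <- fct_inorder(c(labels_var1, labels_var2))
shared_levels <- levels(all_labels)
print(shared_levels)
# "1" is first seen after ">5", so it is ranked last for every variable.

# Optional: look at the data that ggcoef_model() / ggcoef_table() plot.
if (requireNamespace("ggstats", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  set.seed(15)
  lv <- c("0", "1", "2-3", "4-5", ">5")
  d <- data.frame(
    event = sample(c(0, 1), 100, replace = TRUE),
    time = sample(c(5, 10, 12, 50), 100, replace = TRUE),
    factor_var1 = factor(sample(lv, 100, replace = TRUE), levels = lv),
    factor_var2 = factor(sample(lv, 100, replace = TRUE), levels = lv)
  )
  model <- coxph(Surv(time, event) ~ factor_var1 + factor_var2, data = d)
  plot_data <- ggstats::ggcoef_model(model, exponentiate = TRUE, return_data = TRUE)
  # "label" is mapped to the y axis; "var_label" is used for faceting.
  print(plot_data[, c("var_label", "label")])
}
```

Comparing the row order of "label" within each "var_label" against the order shown in the plot should make it clear whether the shared factor levels are responsible.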
Thank you for your efforts on this package.
I am using ggcoef_table() to visualize coefficients from a survival::coxph model. It works great. However, I've noticed that categorical terms are sorted based on their string values. For example, when I specify a factor variable with levels=c("0", "1", ">=2"), the terms are displayed in this order: "1", "0", ">=2". Is there a way to enforce the same order as in the factor levels?
I found a temporary workaround. Specifying categorical_terms_pattern="level={level_rank}; {level}" puts them in the correct order. However, it does not seem like a clean solution.
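For reference, the workaround can be sketched as follows. The data and the variable name x are made up for illustration; only the categorical_terms_pattern value comes from the description above. Prefixing each label with its level rank makes every label unique to its position, which is presumably why the order then follows the factor levels.

```r
library(survival)
library(ggstats)

set.seed(15)

# Made-up data with one factor whose levels are not in alphabetical order.
lv <- c("0", "1", ">=2")
d <- data.frame(
  event = sample(c(0, 1), 100, replace = TRUE),
  time = sample(c(5, 10, 12, 50), 100, replace = TRUE),
  x = factor(sample(lv, 100, replace = TRUE), levels = lv)
)

model <- coxph(Surv(time, event) ~ x, data = d)

# Workaround: include the level rank in each term label.
p <- ggcoef_table(
  model,
  exponentiate = TRUE,
  categorical_terms_pattern = "level={level_rank}; {level}"
)
p
```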
Thank you.