Status: closed by biometrician 1 year ago
As a suggestion, I would ask Gregor to program this issue.
Should it become one type.plot in the plot.abe function?
Yes, he can easily do this. I would prefer this to be a new type.plot in plot.abe. Be careful with "Wallisch2021", since there the results need to be based on resampling. This should work automatically when the plot is based on the summary function.
Gregor, here is my code for the stability paths. The plots are not optimized regarding how they look.
library(abe)      # abe.resampling()
library(tidyr)    # gather()
library(ggplot2)  # qplot(), theme_bw(), theme(), ...

# global_model and bodyfat are assumed to be defined already
# for one tau
set.seed(4624512)
alphas <- c(0.05, 0.1, 0.157, 0.2, 0.25, 0.5)
stability_boot_abe_path <-
  abe.resampling(global_model,
                 data = bodyfat,
                 include = c("abdomen", "height"),
                 criterion = "alpha", alpha = alphas,
                 tau = 0.05, exp.beta = TRUE,
                 type.resampling = "bootstrap",
                 num.resamples = 1000)
var_rel_freqABE <- data.frame(summary(stability_boot_abe_path)$var.rel.frequencies)
var_rel_freqABE[, -1] * 100

# stability path for VIF
# wide to long: one row per (alpha, variable) pair; the first column is dropped
data_longABE <- gather(var_rel_freqABE[, -1], variable, rel.freq, factor_key = TRUE)
data_longABE$x <- rep(alphas, ncol(var_rel_freqABE) - 1)
#data_long$rel.freq <- data_long$rel.freq * 100
qplot(x, rel.freq, data = data_longABE, geom = "path",
      ylab = "Inclusion frequency", xlab = "alpha", colour = variable) +
  theme_bw() +
  theme(legend.text = element_text(size = 14),
        legend.title = element_text(size = 14, face = "bold"),
        axis.text.x = element_text(size = 12),
        axis.title.x = element_text(face = "bold", size = 12),
        axis.text.y = element_text(size = 12),
        axis.title.y = element_text(face = "bold", size = 12)) +
  ylim(0, 1) +
  geom_abline(intercept = 0, slope = 1)  # diagonal: inclusion frequency equal to alpha

# for several taus (alphas as defined above)
set.seed(462451)
taus <- c(0.025, 0.05, 0.1, 0.15, 0.25, 0.5)
stability_boot_abe_path_tau <-
  abe.resampling(global_model,
                 data = bodyfat,
                 include = c("abdomen", "height"),
                 criterion = "alpha", alpha = alphas,
                 tau = taus, exp.beta = TRUE,
                 type.resampling = "bootstrap",
                 num.resamples = 1000)
var_rel_freqABE_tau <- data.frame(summary(stability_boot_abe_path_tau)$var.rel.frequencies)
var_rel_freqABE_tau[, -1] * 100

# stability path for VIF
# for all alphas
data_longABE_tau <- gather(var_rel_freqABE_tau[, -1], variable, rel.freq, factor_key = TRUE)
# the vector is shorter than the data, so `$<-` recycles it over the alpha blocks
# (this assumes tau varies fastest within each alpha block)
data_longABE_tau$x <- rep(taus, ncol(var_rel_freqABE_tau) - 1)
#data_long$rel.freq <- data_long$rel.freq * 100
qplot(x, rel.freq, data = data_longABE_tau, geom = "path",
      ylab = "Inclusion frequency", xlab = "tau", colour = variable) +
  theme_bw() +
  theme(legend.text = element_text(size = 14),
        legend.title = element_text(size = 14, face = "bold"),
        axis.text.x = element_text(size = 12),
        axis.title.x = element_text(face = "bold", size = 12),
        axis.text.y = element_text(size = 12),
        axis.title.y = element_text(face = "bold", size = 12)) +
  ylim(0, 1)
# geom_abline(intercept = 0, slope = 1)
# what do you think? alpha legend is missing.
# could show only for specific variables. select = ....

# for alpha = 0.157
# rows 13:18 are the third alpha (0.157) with the six taus,
# assuming the rows are grouped by alpha
data_longABE_tau2 <- gather(var_rel_freqABE_tau[13:18, -1], variable, rel.freq, factor_key = TRUE)
data_longABE_tau2$x <- rep(taus, ncol(var_rel_freqABE_tau[13:18, ]) - 1)
qplot(x, rel.freq, data = data_longABE_tau2, geom = "path",
      ylab = "Inclusion frequency", xlab = "tau", colour = variable) +
  theme_bw() +
  theme(legend.text = element_text(size = 14),
        legend.title = element_text(size = 14, face = "bold"),
        axis.text.x = element_text(size = 12),
        axis.title.x = element_text(face = "bold", size = 12),
        axis.text.y = element_text(size = 12),
        axis.title.y = element_text(face = "bold", size = 12)) +
  ylim(0, 1)
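The reshaping step in the code above relies on the long table being stacked variable by variable, so that the vector of alphas (or taus) can simply be repeated once per variable. A minimal base R sketch of that idea follows. The frequencies are made-up toy numbers, not results from the bodyfat data, and the names freq_wide and freq_long are only illustrative.

```r
# Toy inclusion frequencies (made-up numbers, one row per alpha).
alphas <- c(0.05, 0.2, 0.5)
freq_wide <- data.frame(
  weight = c(0.30, 0.55, 0.80),
  neck   = c(0.10, 0.25, 0.60)
)

# stack() puts all values in one column and the source column name in `ind`.
freq_long <- stack(freq_wide)
names(freq_long) <- c("rel.freq", "variable")

# Rows are stacked column by column, so alphas repeats once per variable.
freq_long$alpha <- rep(alphas, times = ncol(freq_wide))

print(freq_long)
```

If the rows of the wide table were ordered differently, the repeated vector would no longer line up with the frequencies, so it may be safer to carry alpha and tau along as explicit columns before reshaping.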
I implemented the stability plots. There is now the option type.plot = "stability". I added a parameter type.stability which can be set to "alpha" (default) or "tau". This controls whether the inclusion frequency is plotted as a function of alpha or tau. The handling of tau = Inf is not yet ideal, I will continue working on this next week.
Gregor, I would just like to thank you for all the work you are doing regarding this project. I really appreciate all your help!
Best,
Rok
(In reply to Gregor Steiner's comment of Thursday, December 15, 2022, 11:09 AM, on biometrician/abe issue #19: plot function for stability paths.)
Thank you! And no problem, I'm glad I can help :)
Hi, also great job. Thanks.
Can you please change the x-axis name to a Greek alpha or tau? And the y-axis should be called "Inclusion frequencies".
Please reverse the x-axis for tau, so that it goes from largest to smallest. Then the plots for alpha and tau have the same interpretation.
Just a question: my call included a large number of different taus. The facet with Tau = 100 comes before the facet with Tau = 2. All other taus are ordered correctly.
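One plausible cause of the facet order, stated here as an assumption because the plotting code is not shown in this thread, is that the facet labels are sorted as text, where "100" comes before "2". A minimal sketch of the usual remedy is to sort the values numerically and fix that order as the factor levels. The label format follows the comment above; tau_facet is a hypothetical name.

```r
taus <- c(0.05, 2, 100, 0.5, Inf)

# Sorted as text: "Tau = 100" comes before "Tau = 2".
labels_chr <- paste("Tau =", taus)
print(sort(labels_chr))

# Sort numerically first, then fix that order as the factor levels.
tau_sorted <- sort(unique(taus))
tau_facet <- factor(paste("Tau =", taus),
                    levels = paste("Tau =", tau_sorted))
print(levels(tau_facet))
```

With the levels fixed this way, a faceting function that respects factor order should place Tau = 2 before Tau = 100, with Tau = Inf last.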
A stability plot for the combination criterion = "AIC" and various taus should be possible. Currently, I get a warning. The same is true for "BIC" and various taus.
Currently the function does not work if I have many taus but a single alpha unless I specify type.stability = "tau". Is this really necessary? Can't the function figure out automatically that I would like a plot by tau? I would raise an error if both alpha and tau are scalars. If only one of them is a vector, I would plot a line according to that parameter. Only if both are vectors would I default to alpha, with the option to change this so that I can also see tau. However, why wouldn't I see what is happening for both in one plot?
Yes, it would definitely be better if the function could automatically figure out whether to plot by tau or by alpha. If one is a scalar and the other is a vector, that is straightforward. However, the case where both alpha and tau are vectors is a bit tricky. That's why I added the additional parameter.
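The rule discussed in the two comments above can be sketched as a small helper. This is only an illustration under stated assumptions, not the code in abe: choose_stability_axis is a hypothetical name, and treating a NULL alpha (for example with an information criterion) like a scalar is an assumption.

```r
# Hypothetical helper, not part of abe: decide which parameter goes on the x-axis.
choose_stability_axis <- function(alpha, tau, type.stability = NULL) {
  # A NULL argument counts as "does not vary" (assumption).
  n_alpha <- length(unique(alpha))
  n_tau <- length(unique(tau))

  if (n_alpha <= 1 && n_tau <= 1) {
    stop("A stability path needs more than one value of alpha or tau.")
  }
  # Only one parameter varies: there is nothing to choose.
  if (n_tau <= 1) return("alpha")
  if (n_alpha <= 1) return("tau")

  # Both vary: honour an explicit request, otherwise default to alpha.
  if (is.null(type.stability)) return("alpha")
  match.arg(type.stability, c("alpha", "tau"))
}

print(choose_stability_axis(alpha = 0.157, tau = c(0.05, 0.1, 0.25)))
print(choose_stability_axis(alpha = c(0.05, 0.2), tau = c(0.05, 0.1),
                            type.stability = "tau"))
```

With this rule the user only has to set type.stability when both alpha and tau are vectors, which is the one case where the choice is ambiguous.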
I think in the discussion last week we came to the conclusion that having multiple lines for different tau/alpha values in the same plot is not ideal. This looks pretty messy if the number of variables and/or tau/alpha values is large.
But I'm happy to change this. @biometrician what do you think?
"However, why wouldn't I see what is happening for both in one plot?"
I played around with this idea. For a small number of variables, it might work. But as a default option, it is quite likely that one gets a plot with indistinguishable lines all over the place. So one would need to think carefully about a generalizable version, e.g. with facets or in the form of a loop plot. In the seminar, we came to the conclusion that we try to implement basic versions of all plots now. In the future, we can definitely think about extending these visualizations.
I agree with all that was said. My main point, however, was that I get an error if I have a single alpha and many taus, unless I change the argument type.stability, which I think could easily be fixed.
Yes, you're right. I'll try to fix this tomorrow.
Gregor, can you check this: when I run this code, I get a warning regarding unknown parameters. Thanks a lot!
set.seed(462456)
alphas <- c(0.01, 0.05, 0.1, 0.157, 0.2, 0.25, 0.5)
stability_boot_be_path <-
  abe.resampling(global_model,
                 data = bodyfat,
                 include = c("abdomen", "height"),
                 criterion = "alpha", alpha = alphas,
                 tau = Inf,
                 type.resampling = "bootstrap",
                 num.resamples = 100)
plot(stability_boot_be_path, type.plot = "stability")
Warning: Ignoring unknown parameters: linewidth
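A likely explanation for this warning, offered as an assumption because the installed package versions are not shown: the linewidth argument for lines was, as far as I know, introduced in ggplot2 3.4.0, and older versions report it as an unknown parameter. Updating ggplot2 would be the simplest remedy. Alternatively, plotting code could pick the argument name by version, as in this sketch, where line_width_arg is a hypothetical helper.

```r
# Hypothetical helper: name of the line-thickness argument for a ggplot2 version.
# The default is only evaluated when no version is supplied.
line_width_arg <- function(version = utils::packageVersion("ggplot2")) {
  if (package_version(as.character(version)) >= "3.4.0") "linewidth" else "size"
}

print(line_width_arg("3.3.6"))
print(line_width_arg("3.4.0"))
```

The returned name could then be used to build the argument list for the line layer, so that the same code runs without the warning on both older and newer installations.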
Concerning the R code you sent me with the analysis of the breast cancer data set, where the stability paths are plotted: it would be really nice if this could be put in a plot function.
The y-axis should show the inclusion proportion instead of the percentage, so that it is on the same scale as alpha.
When alpha is varied, adding a diagonal would be nice, to distinguish between random and non-random selection.