Closed IndrajeetPatil closed 1 year ago
Same with rstanarm on line 128 of this file:
https://github.com/easystats/report/blob/main/tests/testthat/test-report_performance.R
For now you can define survival::Surv
outside the formula, but it's not a clean fix:
set.seed(123)
Surv <- survival::Surv
mod_survreg <- survival::survreg(
formula = Surv(futime, fustat) ~ ecog.ps + rx,
data = survival::ovarian,
dist = "logistic"
)
report::report(mod_survreg)
#> Can't calculate log-loss.
#> Can't calculate proper scoring rules for models without integer response values.
#> `performance_pcp()` only works for models with binary response values.
#> Can't calculate log-loss.
#> Can't calculate proper scoring rules for models without integer response values.
#> `performance_pcp()` only works for models with binary response values.
#> We fitted a logistic model to predict Surv(futime, fustat) with ecog.ps and rx
#> (formula: Surv(futime, fustat) ~ ecog.ps + rx). The model's explanatory power
#> is weak (Nagelkerke's R2 = 0.07). The model's intercept, corresponding to
#> ecog.ps = 0 and rx = 0, is at 667.43 (95% CI [-415.59, 1750.45], p = 0.227).
#> Within this model:
#>
#> - The effect of ecog ps is statistically non-significant and negative (beta =
#> -210.59, 95% CI [-726.18, 305.01], p = 0.423; Std. beta = -107.06, 95% CI
#> [-369.19, 155.07])
#> - The effect of rx is statistically non-significant and positive (beta =
#> 320.10, 95% CI [-194.11, 834.32], p = 0.222; Std. beta = 163.22, 95% CI
#> [-98.98, 425.42])
#> - The effect of Log(scale) is statistically significant and positive (beta =
#> 5.82, 95% CI [5.35, 6.29], p < .001; Std. beta = 5.82, 95% CI [5.35, 6.29])
#>
#> Standardized parameters were obtained by fitting the model on a standardized
#> version of the dataset. 95% Confidence Intervals (CIs) and p-values were
#> computed using a Wald z-distribution approximation.
Created on 2023-04-03 with reprex v2.0.2
The problem actually comes from insight
because find_response()
doesn't remove namespaces:
mod_ns <- survival::survreg(
formula = survival::Surv(futime, fustat) ~ ecog.ps + rx,
data = survival::ovarian,
dist = "logistic"
)
suppressPackageStartupMessages(library(survival))
mod_no_ns <- survreg(
formula = Surv(futime, fustat) ~ ecog.ps + rx,
data = ovarian,
dist = "logistic"
)
### Same output (esp. same colnames)
head(insight::get_data(mod_ns, source = "mf"))
#> futime fustat Surv(futime, fustat) ecog.ps rx
#> 1 59 1 59 1 1
#> 2 115 1 115 1 1
#> 3 156 1 156 2 1
#> 4 421 0 421+ 1 2
#> 5 431 1 431 1 1
#> 6 448 0 448+ 2 1
head(insight::get_data(mod_no_ns, source = "mf"))
#> futime fustat Surv(futime, fustat) ecog.ps rx
#> 1 59 1 59 1 1
#> 2 115 1 115 1 1
#> 3 156 1 156 2 1
#> 4 421 0 421+ 1 2
#> 5 431 1 431 1 1
#> 6 448 0 448+ 2 1
### Different output
insight::find_response(mod_ns)
#> [1] "survival::Surv(futime, fustat)"
insight::find_response(mod_no_ns)
#> [1] "Surv(futime, fustat)"
This means that it keeps the response in the variables to standardize instead of removing it:
We could insight::clean_names(insight::find_response(mod_ns))
@strengejacke or maybe insight::find_response()
should always remove the <pkg>::
part?
That might be a problem if the result is to be used later and the namespace isn't available..
@rempsyc This should now work correctly:
stan_glmer <- rstanarm::stan_glmer
x <- rstanarm::stan_lmer(Sepal.Length ~ Petal.Length + (1 | Species),
data = iris, refresh = 0, iter = 1000, seed = 333
)
report::report(x)
#> We fitted a Bayesian linear mixed model (estimated using MCMC sampling with 4
#> chains of 1000 iterations and a warmup of 500) to predict Sepal.Length with
#> Petal.Length (formula: Sepal.Length ~ Petal.Length). The model included Species
#> as random effect (formula: ~1 | Species). Priors over parameters were set as
#> normal (mean = 0.00, SD = 1.17) distributions. The model's explanatory power is
#> substantial (R2 = 0.83, 95% CI [0.79, 0.87], adj. R2 = 0.83) and the part
#> related to the fixed effects alone (marginal R2) is of 0.95 (95% CI [0.93,
#> 0.97]). The model's intercept, corresponding to Petal.Length = 0, is at 2.57
#> (95% CI [1.44, 3.56]). Within this model:
#>
#> - The effect of Petal Length (Median = 0.87, 95% CI [0.74, 1.00]) has a 100.00%
#> probability of being positive (> 0), 100.00% of being significant (> 0.04), and
#> 100.00% of being large (> 0.25). The estimation successfully converged (Rhat =
#> 1.000) and the indices are reliable (ESS = 1416)
#>
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.04| and |0.25|. Convergence and stability of the Bayesian sampling
#> has been assessed using R-hat, which should be below 1.01 (Vehtari et al.,
#> 2019), and Effective Sample Size (ESS), which should be greater than 1000
#> (Burkner, 2017).
set.seed(123)
mod_survreg <- survival::survreg(
formula = survival::Surv(futime, fustat) ~ ecog.ps + rx,
data = survival::ovarian,
dist = "logistic"
)
report::report(mod_survreg)
#> Can't calculate log-loss.
#> Can't calculate proper scoring rules for models without integer response values.
#> `performance_pcp()` only works for models with binary response values.
#> Can't calculate log-loss.
#> Can't calculate proper scoring rules for models without integer response values.
#> `performance_pcp()` only works for models with binary response values.
#> We fitted a logistic model to predict survival::Surv(futime, fustat) with
#> ecog.ps and rx (formula: survival::Surv(futime, fustat) ~ ecog.ps + rx). The
#> model's explanatory power is weak (Nagelkerke's R2 = 0.07). The model's
#> intercept, corresponding to ecog.ps = 0 and rx = 0, is at 667.43 (95% CI
#> [-415.59, 1750.45], p = 0.227). Within this model:
#>
#> - The effect of ecog ps is statistically non-significant and negative (beta =
#> -210.59, 95% CI [-726.18, 305.01], p = 0.423; Std. beta = -107.06, 95% CI
#> [-369.19, 155.07])
#> - The effect of rx is statistically non-significant and positive (beta =
#> 320.10, 95% CI [-194.11, 834.32], p = 0.222; Std. beta = 163.22, 95% CI
#> [-98.98, 425.42])
#> - The effect of Log(scale) is statistically significant and positive (beta =
#> 5.82, 95% CI [5.35, 6.29], p < .001; Std. beta = 5.82, 95% CI [5.35, 6.29])
#>
#> Standardized parameters were obtained by fitting the model on a standardized
#> version of the dataset. 95% Confidence Intervals (CIs) and p-values were
#> computed using a Wald z-distribution approximation.
Created on 2023-04-13 with reprex v2.0.2
Nice!! Thanks @etiennebacher ! :D
As noted by @rempsyc in https://github.com/easystats/report/pull/361
Created on 2023-04-03 with reprex v2.0.2
Created on 2023-04-03 with reprex v2.0.2