Closed IndrajeetPatil closed 3 years ago
Same issue for BBreg
:
# setup
set.seed(18)
library(PROreg)
# we simulate a covariate, fix the paramters of the beta-binomial
# distribution and simulate a response variable.
# then we apply the model, and try to get the same values.
k <- 1000
m <- 10
x <- rnorm(k, 5, 3)
beta <- c(-10, 2)
p <- 1 / (1 + exp(-1 * (beta[1] + beta[2] * x)))
phi <- 1.2
y <- PROreg::rBB(k, m, p, phi)
# model
mod_BBreg <- PROreg::BBreg(y ~ x, m)
library(parameters)
dof(mod_BBreg)
#> [1] 998 998
parameters(mod_BBreg)
#> # Fixed Effects
#>
#> Parameter | Log-Odds | SE | 95% CI | t | p
#> ---------------------------------------------------------------
#> Intercept | -10.13 | 0.48 | [-11.08, -9.19] | -21.00 | < .001
#> x | 2.03 | 0.09 | [ 1.85, 2.22] | 21.51 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
Ditto for multinom
:
# setup
set.seed(123)
library(nnet)
library(MASS)
utils::example(topic = birthwt, echo = FALSE)
# model
bwt.mu <-
nnet::multinom(
formula = low ~ .,
data = bwt,
trace = FALSE
)
library(parameters)
dof(bwt.mu)
#> [1] 178 178 178 178 178 178 178 178 178 178 178
parameters(bwt.mu)
#> Parameter | Log-Odds | SE | 95% CI | t | p
#> -------------------------------------------------------------------
#> (Intercept) | 0.82 | 1.24 | [-1.62, 3.26] | 0.66 | 0.508
#> age | -0.04 | 0.04 | [-0.11, 0.04] | -0.96 | 0.336
#> lwt | -0.02 | 7.08e-03 | [-0.03, 0.00] | -2.21 | 0.027
#> race [black] | 1.19 | 0.54 | [ 0.14, 2.24] | 2.22 | 0.026
#> race [other] | 0.74 | 0.46 | [-0.16, 1.65] | 1.60 | 0.109
#> smokeTRUE | 0.76 | 0.43 | [-0.08, 1.59] | 1.78 | 0.075
#> ptd [TRUE] | 1.34 | 0.48 | [ 0.40, 2.29] | 2.80 | 0.005
#> htTRUE | 1.91 | 0.72 | [ 0.50, 3.33] | 2.65 | 0.008
#> uiTRUE | 0.68 | 0.46 | [-0.23, 1.59] | 1.46 | 0.143
#> ftv [1] | -0.44 | 0.48 | [-1.38, 0.50] | -0.91 | 0.363
#> ftv [2+] | 0.18 | 0.46 | [-0.72, 1.07] | 0.39 | 0.695
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
And mhurdle
:
# setup
data("Interview", package = "mhurdle")
library(mhurdle)
#> Loading required package: Formula
#> Loading required package: truncreg
#> Loading required package: maxLik
#> Loading required package: miscTools
#>
#> Please cite the 'maxLik' package as:
#> Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.
#>
#> If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
#> https://r-forge.r-project.org/projects/maxlik/
#> Loading required package: survival
#> Loading required package: texreg
#> Version: 1.37.5
#> Date: 2020-06-17
#> Author: Philip Leifeld (University of Essex)
#>
#> Consider submitting praise using the praise or praise_interactive functions.
#> Please cite the JSS article in your publications -- see citation("texreg").
# independent double hurdle model
idhm <- mhurdle::mhurdle(
formula = vacations ~ car + size | linc + linc2 | 0,
data = Interview,
dist = "ln",
h2 = TRUE,
method = "bfgs"
)
library(parameters)
dof(idhm)
#> [1] 992 992 992 992 992 992 992 992
parameters(idhm)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -----------------------------------------------------------------
#> (Intercept) | -0.20 | 0.22 | [-0.64, 0.24] | -0.91 | 0.364
#> linc | 1.34 | 0.18 | [ 0.99, 1.68] | 7.63 | < .001
#> linc2 | 0.13 | 0.14 | [-0.13, 0.40] | 0.98 | 0.327
#>
#> # Zero-Inflated
#>
#> Parameter | Log-Odds | SE | 95% CI | t | p
#> ---------------------------------------------------------------
#> (Intercept) | -1.05 | 0.17 | [-1.39, -0.72] | -6.12 | < .001
#> car | 0.04 | 0.07 | [-0.10, 0.17] | 0.54 | 0.586
#> size | 0.19 | 0.05 | [ 0.09, 0.29] | 3.59 | < .001
#>
#> # Auxiliary
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -------------------------------------------------------------
#> sd sd | 1.13 | 0.13 | [0.87, 1.39] | 8.60 | < .001
#> pos | 0.59 | 0.18 | [0.24, 0.93] | 3.33 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
And truncreg
:
set.seed(123)
library(truncreg)
#> Loading required package: maxLik
#> Loading required package: miscTools
#>
#> Please cite the 'maxLik' package as:
#> Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.
#>
#> If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
#> https://r-forge.r-project.org/projects/maxlik/
library(survival)
# data
data("tobin", package = "survival")
# model
cragg_trunc <-
truncreg::truncreg(
formula = durable ~ age + quant,
data = tobin,
subset = durable > 0
)
library(parameters)
dof(cragg_trunc)
#> [1] 3 3 3 3
parameters(cragg_trunc)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> ------------------------------------------------------------------
#> (Intercept) | 12.60 | 9.22 | [-5.47, 30.67] | 1.37 | 0.172
#> age | 0.43 | 0.23 | [-0.01, 0.88] | 1.90 | 0.058
#> quant | -0.12 | 0.03 | [-0.18, -0.05] | -3.64 | < .001
#> sigma | 1.65 | 0.57 | [ 0.54, 2.77] | 2.91 | 0.004
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
And gls
:
set.seed(123)
library(nlme)
# model
mod_gls <-
nlme::gls(
model = follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
data = Ovary,
correlation = corAR1(form = ~ 1 | Mare)
)
library(parameters)
dof(mod_gls)
#> [1] 305 305 305
parameters(mod_gls)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -------------------------------------------------------------------------
#> (Intercept) | 12.22 | 0.66 | [10.91, 13.52] | 18.38 | < .001
#> sin(2 * pi * Time) | -2.77 | 0.65 | [-4.04, -1.51] | -4.30 | < .001
#> cos(2 * pi * Time) | -0.90 | 0.70 | [-2.27, 0.47] | -1.29 | 0.198
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
And rq
:
set.seed(123)
library(quantreg)
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
# data
data(stackloss)
# model
mod_rq <-
quantreg::rq(
formula = stack.loss ~ .,
data = stackloss,
method = "br"
)
library(parameters)
dof(mod_rq)
#> [1] 17 17 17 17
parameters(mod_rq)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> --------------------------------------------------------------------
#> (Intercept) | -39.69 | 7.14 | [-53.69, -25.69] | -5.56 | < .001
#> Air Flow | 0.83 | 0.13 | [ 0.58, 1.08] | 6.55 | < .001
#> Water Temp | 0.57 | 0.34 | [ -0.10, 1.24] | 1.68 | 0.111
#> Acid Conc | -0.06 | 0.06 | [ -0.18, 0.06] | -1.01 | 0.328
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
nlrq
:
# setup
set.seed(123)
library(quantreg)
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
# preparing data
Dat <- NULL
Dat$x <- rep(1:25, 20)
Dat$y <- stats::SSlogis(Dat$x, 10, 12, 2) * rnorm(500, 1, 0.1)
# then fit the median using nlrq
Dat.nlrq <-
quantreg::nlrq(
formula = y ~ SSlogis(x, Asym, mid, scal),
data = Dat,
tau = 0.5,
trace = FALSE
)
library(parameters)
dof(Dat.nlrq)
#> [1] 497 497 497
parameters(Dat.nlrq)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -----------------------------------------------------------------
#> Asym | 9.82 | 0.11 | [ 9.58, 10.05] | 84.98 | < .001
#> mid | 11.84 | 0.07 | [11.70, 11.98] | 173.72 | < .001
#> scal | 1.96 | 0.02 | [ 1.91, 2.01] | 81.93 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
ivprobit
:
# setup
set.seed(123)
library(ivprobit)
data(eco)
# model
pro <-
ivprobit::ivprobit(
formula = d2 ~ ltass + roe + div | eqrat + bonus | ltass + roe + div + gap + cfa,
data = eco
)
library(parameters)
dof(pro)
#> [1] 788 788 788 788 788 788
parameters(pro)
#> # Fixed Effects
#>
#> Parameter | Log-Odds | SE | 95% CI | t | p
#> ------------------------------------------------------------------
#> Intercep | -16.86 | 9.73 | [-35.94, 2.21] | -1.73 | 0.084
#> ltass | 0.95 | 0.62 | [ -0.27, 2.16] | 1.53 | 0.127
#> roe | 0.07 | 0.11 | [ -0.16, 0.29] | 0.59 | 0.557
#> div | 2.04e-06 | 4.27e-06 | [ -0.00, 0.00] | 0.48 | 0.634
#> eqrat | 9.44 | 12.94 | [-15.92, 34.79] | 0.73 | 0.466
#> bonus | -1.42e-06 | 2.82e-06 | [ -0.00, 0.00] | -0.50 | 0.615
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
garch
:
set.seed(123)
library(tseries)
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
data(EuStockMarkets)
# model
dax <- diff(log(EuStockMarkets))[, "DAX"]
dax.garch <- tseries::garch(x = dax, trace = FALSE)
library(parameters)
dof(dax.garch)
#> [1] 1856 1856 1856
parameters(dax.garch)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> ------------------------------------------------------------------
#> a0 | 4.64e-06 | 1.96e-06 | [0.00, 0.00] | 6.14 | < .001
#> a1 | 0.07 | 0.01 | [0.05, 0.09] | 6.07 | < .001
#> b1 | 0.89 | 0.02 | [0.86, 0.92] | 53.82 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using
#> Wald t-distribution approximation.
Created on 2021-10-16 by the reprex package (v2.0.1)
I think that's all for now. Sorry, Daniel 😢
Need to boost that coverage...
Need to boost that coverage...
We support so many models, that it's kind of impossible to check all of them 😢 , nor should we try to...
Another one: bigglm
set.seed(123)
library(biglm)
#> Loading required package: DBI
data(trees)
# model
mod_bigglm <-
biglm::bigglm(
formula = log(Volume) ~ log(Girth) + log(Height),
data = trees,
chunksize = 10,
sandwich = TRUE
)
library(parameters)
dof(mod_bigglm)
#> [1] 28 28 28
parameters(mod_bigglm)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -------------------------------------------------------------------
#> (Intercept) | -6.63 | 0.73 | [-8.06, -5.21] | -9.11 | < .001
#> Girth [log] | 1.98 | 0.06 | [ 1.87, 2.09] | 35.62 | < .001
#> Height [log] | 1.12 | 0.19 | [ 0.74, 1.49] | 5.82 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using a
#> Wald t-distribution approximation.
Created on 2021-10-17 by the reprex package (v2.0.1)
Also complmrob
:
# setup
set.seed(123)
library(complmrob)
# data
crimes <- data.frame(
lifeExp = state.x77[, "Life Exp"],
USArrests[, c("Murder", "Assault", "Rape")]
)
# model
mUSArr <- complmrob::complmrob(formula = lifeExp ~ ., data = crimes)
library(parameters)
dof(mUSArr)
#> [1] 46 46 46 46
parameters(mUSArr)
#> # Fixed Effects
#>
#> Parameter | Coefficient | SE | 95% CI | t | p
#> -------------------------------------------------------------------
#> (Intercept) | 70.65 | 0.43 | [69.82, 71.49] | 165.81 | < .001
#> Murder | -2.23 | 0.43 | [-3.07, -1.40] | -5.24 | < .001
#> Assault | -1.05 | 0.93 | [-2.86, 0.77] | -1.13 | 0.264
#> Rape | 3.28 | 0.61 | [ 2.09, 4.47] | 5.39 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed using a
#> Wald t-distribution approximation.
Created on 2021-10-17 by the reprex package (v2.0.1)
We support so many models, that it's kind of impossible to check all of them 😢 , nor should we try to...
I guess we don't need to if we have you!
BBreg/BBmm only report p-values based on normal distribution assumptions, even if we have t-stat and residual df. How should we deal with that?
@IndrajeetPatil would be great if you can add tests for these models here: https://github.com/easystats/parameters/blob/main/tests/testthat/test-model_parameters_df.R
I started adding first tests, you may just copy that pattern - we need to check whether df, CI (low) and p are correct for default (which should be residual for models in this thread) and normal.
Would be great if p-values are not rounded to 0
, so we see differences between normal and default (residual).
Just run model_parameters(model)
, and then create the code for the test file using something like:
dput(round(params$CI_low, 5))
I bet we will discover some inconsistencies, in particular where model classes have "fixed" p-values. We then have to make sure that p_value()
for these classes both allows to return the default and the "normally distributed" p-values (like I changed now for ivprobit
).
actually, we only need to make degrees_of_freedom()
work properly for each class, then we could remove almost all p_value()
and ci()
methods, and only use p_value.default()
and ci.default()
now.
let's see if I'm right...
Is this a printing issue or is the df column missing entirely? Would it be easier to just always print the df column instead of the clever but inconsistent header formatting?
Is this a printing issue
It's a printing issue, because Inf
is erroneously returned now, not the residual df. It's due to the "API" changes, so no issue of the print-method itself, but rather a (new introduced) bug due to the recent changes from #570. But I think I fixed all of them now, and added a bunch of tests. We may add more tests in the future, to cover as many models as possible. We can add all tests to this file:
https://github.com/easystats/parameters/blob/main/tests/testthat/test-model_parameters_df.R
This is only run on GitHub, so no timing issues on CRAN.
Ok, I think it's working now, and it's implemented for a lot of models. There are still some classes where we only return the "default" p and CI. That is something we can revise step-by-step in future updates, as it's not that urgent. The most common model classes should be supported and work correct now.
It's a printing issue
It was not a printing issue though. The degrees of freedom column was missing from the dataframe itself.
Thanks for fixing this quickly!
Created on 2021-10-16 by the reprex package (v2.0.1)