Closed: scrryl closed this issue 3 months ago
Can you give a reproducible example?
@bwiernik
# Set seed for reproducibility
set.seed(250419)
# Generate random x values
x <- rnorm(n = 500,
mean = 5,
sd = 2)
# Generate y values y = 5x + e
y <- 5*x + rnorm(n = 500,
mean = 5,
sd = 2)
# Generate z as offset
z <- runif(500, min = 0, max = 6719)
mock_data <- data.frame(x, y, z) |>
dplyr::mutate(y = round(y), z = round(z)) |> # both should be whole numbers since they're counts
dplyr::filter(!x < 0, !y < 0)
# Run model
model1 <- stats::glm(y ~ x + offset(log(z)),family = "quasipoisson", data = mock_data)
performance::check_model(model1)
That code produces a qq plot for me. What are you seeing?
@bwiernik
I see this:
Is anyone able to reproduce this? @strengejacke @IndrajeetPatil @mattansb @DominiqueMakowski @rempsyc
@scrryl Are you getting any errors or warnings? What happens if you make the plot window/pane larger?
nope! no errors or warnings
when I make pane larger:
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.4.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] Matrix_1.6-1.1 gtable_0.3.4 jsonlite_1.8.8 dplyr_1.1.4 compiler_4.3.2
[6] tidyselect_1.2.1 Rcpp_1.0.12 VGAM_1.1-9 see_0.8.3.5 textshaping_0.3.6
[11] systemfonts_1.0.5 splines_4.3.2 scales_1.3.0 marginaleffects_0.14.0 readxl_1.4.3
[16] lattice_0.21-9 ggplot2_3.5.0 R6_2.5.1 labeling_0.4.3 patchwork_1.2.0
[21] generics_0.1.3 ggrepel_0.9.3 tibble_3.2.1 insight_0.19.10 munsell_0.5.0
[26] shadowtext_0.1.2 pillar_1.9.0 rlang_1.1.3 easystats_0.7.1.1 utf8_1.2.4
[31] performance_0.11.0.3 cli_3.6.2 mgcv_1.9-0 withr_3.0.0 magrittr_2.0.3
[36] grid_4.3.2 rstudioapi_0.15.0 nlme_3.1-163 lifecycle_1.0.4 vctrs_0.6.5
[41] glue_1.7.0 data.table_1.14.10 farver_2.1.1 cellranger_1.1.0 sessioninfo_1.2.2
[46] ragg_1.2.5 stats4_4.3.2 fansi_1.0.6 colorspace_2.1-0 purrr_1.0.2
[51] tools_4.3.2 pkgconfig_2.0.3
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 13.6.3
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gt_0.8.0 modelsummary_1.3.0 corrplot_0.92 Hmisc_4.7-0
[5] Formula_1.2-4 survival_3.2-13 lattice_0.20-45 scales_1.3.0
[9] lme4_1.1-29 Matrix_1.4-0 jtools_2.2.0 forcats_0.5.1
[13] stringr_1.5.1 dplyr_1.1.2 purrr_1.0.1 readr_2.1.2
[17] tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.4 tidyverse_1.3.2
loaded via a namespace (and not attached):
[1] readxl_1.4.0 backports_1.4.1 systemfonts_1.0.4 sp_1.4-7
[5] splines_4.1.3 crosstalk_1.2.0 listenv_0.8.0 leaflet_2.1.1
[9] digest_0.6.29 htmltools_0.5.2 fansi_1.0.3 DHARMa_0.4.6
[13] magrittr_2.0.3 checkmate_2.1.0 googlesheets4_1.0.0 cluster_2.1.2
[17] see_0.8.0 tzdb_0.3.0 globals_0.15.0 modelr_0.1.8
[21] svglite_2.1.0 jpeg_0.1-9 colorspace_2.1-0 ggrepel_0.9.1
[25] rvest_1.0.4 haven_2.5.0.9000 xfun_0.30 leafem_0.2.0
[29] crayon_1.5.1 jsonlite_1.8.0 glue_1.6.2 kableExtra_1.3.4.9000
[33] gtable_0.3.0 gargle_1.2.0 webshot_0.5.5 car_3.1-0
[37] abind_1.4-5 DBI_1.1.2 rstatix_0.7.0 Rcpp_1.0.8.3
[41] performance_0.11.0 viridisLite_0.4.2 htmlTable_2.4.0 units_0.8-0
[45] foreign_0.8-82 proxy_0.4-26 stats4_4.1.3 datawizard_0.10.0
[49] htmlwidgets_1.5.4 httr_1.4.7 RColorBrewer_1.1-3 ellipsis_0.3.2
[53] pkgconfig_2.0.3 farver_2.1.1 nnet_7.3-17 dbplyr_2.2.0
[57] utf8_1.2.2 tidyselect_1.2.0 labeling_0.4.3 rlang_1.1.1
[61] munsell_0.5.0 cellranger_1.1.0 tools_4.1.3 cli_3.6.1
[65] generics_0.1.2 broom_0.8.0 evaluate_0.23 fastmap_1.1.1
[69] yaml_2.3.5 tables_0.9.10 knitr_1.39 fs_1.5.2
[73] pander_0.6.5 satellite_1.0.4 future_1.26.1 nlme_3.1-155
[77] xml2_1.3.3 compiler_4.1.3 rstudioapi_0.15.0 png_0.1-7
[81] e1071_1.7-9 ggsignif_0.6.4 reprex_2.0.1 stringi_1.7.6
[85] classInt_0.4-3 nloptr_2.0.1 ggsci_2.9 vctrs_0.6.2
[89] pillar_1.9.0 lifecycle_1.0.4 furrr_0.3.1 insight_0.19.10
[93] data.table_1.14.2 cowplot_1.1.1 raster_3.5-15 mapview_2.11.0
[97] patchwork_1.1.1 R6_2.5.1 latticeExtra_0.6-29 KernSmooth_2.23-20
[101] gridExtra_2.3 parallelly_1.32.0 codetools_0.2-18 boot_1.3-28
[105] MASS_7.3-55 gtools_3.9.2.2 assertthat_0.2.1 withr_2.5.0
[109] broom.mixed_0.2.9.4 mgcv_1.8-39 parallel_4.1.3 hms_1.1.1
[113] terra_1.5-21 grid_4.1.3 rpart_4.1.16 class_7.3-20
[117] minqa_1.2.4 rmarkdown_2.14 carData_3.0-5 googledrive_2.0.0
[121] ggpubr_0.4.0 sf_1.0-7 lubridate_1.8.0 base64enc_0.1-3
Just for clarity, can you run your code above from a fresh R session and post the session info? Please run the code from your post above (as I edited it so it would run).
I'm also getting only 3 plots:
# Set seed for reproducibility
set.seed(250419)
# Generate random x values
x <- rnorm(n = 500,
mean = 5,
sd = 2)
# Generate y values y = 5x + e
y <- 5*x + rnorm(n = 500,
mean = 5,
sd = 2)
# Generate z as offset
z <- runif(500, min = 0, max = 6719)
mock_data <- data.frame(x, y, z) |>
dplyr::mutate(y = round(y), z = round(z)) |> # both should be whole numbers since they're counts
dplyr::filter(!x < 0, !y < 0)
# Run model
model1 <- stats::glm(y ~ x + offset(log(z)),family = "quasipoisson", data = mock_data)
performance::check_model(model1)
Created on 2024-04-03 with reprex v2.1.0
Could be related to this message? (But the plot still works)
(c_norm <- performance::check_normality(model1))
#> There's no formal statistical test for normality for generalized linear model.
#> Instead, please use `simulate_residuals()` and `check_residuals()` to check for uniformity of residuals.
plot(c_norm)
Honestly that seems reasonable --- quasipoisson residuals shouldn't be normal. Do you know where the different behavior is coming from @strengejacke ?
quasipoisson is not supported by DHARMa, which is why it fails. Until we have fixed this, you have to explicitly set residual_type = "normal":
set.seed(250419)
# Generate random x values
x <- rnorm(n = 500,
mean = 5,
sd = 2)
# Generate y values y = 5x + e
y <- 5*x + rnorm(n = 500,
mean = 5,
sd = 2)
# Generate z as offset
z <- runif(500, min = 0, max = 6719)
mock_data <- data.frame(x, y, z) |>
dplyr::mutate(y = round(y), z = round(z)) |> # both should be whole numbers since they're counts
dplyr::filter(!x < 0, !y < 0)
# Run model
model1 <- stats::glm(y ~ x + offset(log(z)),family = "quasipoisson", data = mock_data)
performance::check_model(model1, residual_type = "normal")
Created on 2024-04-03 with reprex v2.1.0
This is the error:
set.seed(250419)
# Generate random x values
x <- rnorm(
n = 500,
mean = 5,
sd = 2
)
# Generate y values y = 5x + e
y <- 5 * x + rnorm(
n = 500,
mean = 5,
sd = 2
)
# Generate z as offset
z <- runif(500, min = 0, max = 6719)
mock_data <- data.frame(x, y, z) |>
# both should be whole numbers since they're counts
datawizard::data_modify(y = round(y), z = round(z)) |>
datawizard::data_filter(!x < 0, !y < 0)
# Run model
model1 <- glm(y ~ x + offset(log(z)), family = "quasipoisson", data = mock_data)
DHARMa::simulateResiduals(model1)
#> Error in simulate.lm(object, nsim = nsim, ...): family 'quasipoisson' not implemented
Created on 2024-04-03 with reprex v2.1.0
However, at least in simulateResiduals(), there's a check for that family:
if (is.null(integerResponse)) {
if (family$family %in% c("binomial", "poisson", "quasibinomial",
"quasipoisson", "Negative Binom", "nbinom2", "nbinom1",
"genpois", "compois", "truncated_poisson", "truncated_nbinom2",
"truncated_nbinom1", "betabinomial", "Poisson", "Tpoisson",
"COMPoisson", "negbin", "Tnegbin") | grepl("Negative Binomial",
family$family))
integerResponse = TRUE
else integerResponse = FALSE
}
So the package stops at a later point. @florianhartig is it correct that quasi-families are not yet supported, or is this not intended? I could open an issue at the DHARMa repo.
Ah, I see. It's stats::simulate() where the error comes from.
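A minimal sketch of where the failure originates, using only base R and a small simulated data set (the data and object names are made up for illustration and are not the data from the report above). It assumes that stats::simulate() relies on a `simulate` function stored in the family object, which quasipoisson() does not provide, so the call errors before DHARMa can use the result:

```r
# Hypothetical toy data, unrelated to mock_data above
set.seed(1)
toy <- data.frame(x = rnorm(100))
toy$y <- rpois(100, lambda = exp(0.3 + 0.2 * toy$x))

m_pois  <- glm(y ~ x, family = poisson, data = toy)
m_quasi <- glm(y ~ x, family = quasipoisson, data = toy)

# The poisson family object carries a simulate function, the quasi-family does not
has_sim_pois  <- is.function(poisson()$simulate)
has_sim_quasi <- is.function(quasipoisson()$simulate)

# Simulation works for the poisson fit ...
sim_ok <- simulate(m_pois, nsim = 2)

# ... but fails for the quasipoisson fit; capture the message instead of stopping
err_msg <- tryCatch(
  simulate(m_quasi, nsim = 2),
  error = function(e) conditionMessage(e)
)
print(err_msg)
```

The captured message should correspond to the "family 'quasipoisson' not implemented" error shown in the reprex above.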
residual_type = "normal" was the fix. I'll keep a lookout for updates.
Thank you all so much!
Note that quasipoisson model residuals should not be normally distributed, so this plot isn't really meaningful.
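A rough illustration of that point, as a sketch on hypothetical toy data (not the data from this thread) with base R only. It assumes small expected counts, where the effect is strongest: Pearson residuals from a count model are then discrete and right-skewed, so a normal Q-Q plot would show systematic departures even for a correctly specified model. With large expected counts the departure becomes much milder.

```r
# Hypothetical toy data with expected counts near 1
set.seed(3)
toy <- data.frame(x = rnorm(500))
toy$y <- rpois(500, lambda = exp(0.1 * toy$x))

m_quasi <- glm(y ~ x, family = quasipoisson, data = toy)
r <- residuals(m_quasi, type = "pearson")

# Sample skewness; a normal distribution would give a value near 0
skewness <- mean((r - mean(r))^3) / sd(r)^3
print(skewness)
```

To see the same thing visually, qqnorm(r) followed by qqline(r) draws the residuals against normal quantiles.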
Make sure you have the latest easystats package installed:
install.packages("easystats")
then run:
easystats::install_latest()
and you should be fine.
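To confirm which versions ended up installed, something like the following sketch could be run in a fresh session. The package list is an assumption based on the packages that appear in the session info above; adjust it as needed.

```r
# Packages assumed to be relevant here (taken from the session info in this thread)
pkgs <- c("performance", "see", "insight", "DHARMa")

# Report the installed version, or NA if the package is not installed
installed <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(installed)
```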
I just updated performance to the newest version and my code no longer works. check_model() produces only 3 of the 4 plots, and the Q-Q plot is not one of them. Interestingly, I also cannot request the plot directly using the check = "qq" argument.
Any thoughts?