easystats / correlation

Methods for Correlation Analysis
https://easystats.github.io/correlation/

Bayesian correlations not working with the latest version of `{parameters}` #269

Closed IndrajeetPatil closed 1 year ago

IndrajeetPatil commented 1 year ago

{ggstatsplot} is broken because of this issue in {correlation}.

I can't track down which change in the ecosystem altered the behaviour of the CRAN version of correlation; the following used to work:

library(correlation)
library(dplyr, warn.conflicts = FALSE)
df <- ggplot2::msleep

correlation(filter(df, vore == "carni"), bayesian = TRUE)
#> Warning in genhypergeo_series_pos(U = c((n - 1)/2, (n - 1)/2), L = ((n + :
#> Series not converged.
#> Error in rbind(deparse.level, ...): numbers of columns of arguments do not match

Created on 2022-10-07 with reprex v2.0.2

DominiqueMakowski commented 1 year ago

Warning in genhypergeo_series_pos

I've started seeing this warning fairly often too for Bayesian correlations (but no errors in correlation). Did something change in BayesFactor to cause that?

IndrajeetPatil commented 1 year ago

I doubt it, since BayesFactor was last updated in July, and we would have noticed such a change way earlier instead of now.

I have a hunch that this has to do with a change in either parameters or bayestestR.

strengejacke commented 1 year ago

Here's a reprex:

library(parameters)
library(dplyr, warn.conflicts = FALSE)
df <- as.data.frame(ggplot2::msleep)

x <- filter(df, vore == "carni")
rez <- BayesFactor::correlationBF(x$sleep_total, x$awake, rscale = "medium")
#> Warning in genhypergeo_series_pos(U = c((n - 1)/2, (n - 1)/2), L = ((n + :
#> Series not converged.
rez
#> Bayes factor analysis
#> --------------
#> [1] Alt., r=0.333 : NA ±0%
#> 
#> Against denominator:
#>   Null, rho = 0 
#> ---
#> Bayes factor type: BFcorrelation, Jeffreys-beta*

params <- parameters::model_parameters(rez)
params
#> Bayesian correlation analysis
#> 
#> Parameter | Median |         95% CI |   pd |         Prior
#> ----------------------------------------------------------
#> rho       |  -1.00 | [-1.00, -1.00] | 100% | Beta (3 +- 3)
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a MCMC distribution approximation.

Created on 2022-10-07 with reprex v2.0.2

DominiqueMakowski commented 1 year ago

Could it be related to the removal of ROPE?

strengejacke commented 1 year ago

parameters uses datawizard::remove_empty(), which removes columns that are entirely NA (a new behaviour, I think). As a result, the BF column is missing for this particular combination of variables, which causes the error in correlation.
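The failure mode described above can be reproduced in base R. This is a minimal sketch under stated assumptions: `drop_all_na_cols()` is a hypothetical stand-in for the column-dropping step (it is not the datawizard implementation), and the two data frames are made-up result rows, not actual output from parameters.

```r
# Hypothetical stand-in for dropping all-NA columns; not the datawizard code.
drop_all_na_cols <- function(x) {
  keep <- vapply(x, function(col) !all(is.na(col)), logical(1))
  x[, keep, drop = FALSE]
}

# Two made-up result rows: one with a valid BF, one where the BF is NA.
ok  <- data.frame(Parameter = "rho", Median = 0.25, BF = 3.1)
bad <- data.frame(Parameter = "rho", Median = -1, BF = NA_real_)

ok_clean  <- drop_all_na_cols(ok)   # keeps all three columns
bad_clean <- drop_all_na_cols(bad)  # the BF column is dropped

names(bad_clean)

# Stacking the two results now fails because the column counts differ;
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  rbind(ok_clean, bad_clean),
  error = function(e) conditionMessage(e)
)
res
```

Once one result has lost its BF column, it can no longer be row-bound to results that kept it.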

A fix would be to check here whether the BF column exists and, if not, add BF <- NA:

https://github.com/easystats/correlation/blob/e31942ec7baeeae88ad9bbcf3dab9db5c04bccad/R/cor_test_bayes.R#L83

We must also ensure the columns are in the right order (so we might need to sort), so that params can be rbind()-ed to the final results (see cor_test()).
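A rough sketch of that check, in base R. The helper `align_columns()` and the column names in `expected` are hypothetical and chosen for illustration only; the real column layout in cor_test_bayes.R may differ.

```r
# Hypothetical helper: give `params` a fixed column layout before rbind().
align_columns <- function(params, expected_cols) {
  missing_cols <- setdiff(expected_cols, names(params))
  for (col in missing_cols) {
    # Re-add dropped columns (e.g. BF) filled with NA.
    params[[col]] <- rep(NA_real_, nrow(params))
  }
  # Enforce one consistent column order.
  params[, expected_cols, drop = FALSE]
}

expected <- c("Parameter", "Median", "BF")  # illustrative column names

a <- data.frame(Parameter = "rho", Median = 0.25, BF = 3.1)
b <- data.frame(Median = -1, Parameter = "rho")  # BF missing, columns reordered

out <- rbind(align_columns(a, expected), align_columns(b, expected))
out
```

Filling the missing column with NA rather than dropping the row keeps the affected variable pair visible in the final results, with its Bayes factor marked as unavailable.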

strengejacke commented 1 year ago

We should add @IndrajeetPatil's example as a test once the issue is resolved.

DominiqueMakowski commented 1 year ago

Shouldn't we add that in parameters? The output should include the BF column even if it is NA.

strengejacke commented 1 year ago

Yes, I agree. But maybe for now it's better to have this fix in correlation, too? I just submitted parameters two days ago. 😬

DominiqueMakowski commented 1 year ago

haha okay makes sense then

strengejacke commented 1 year ago

There must have been a reason why we changed the behaviour in model_parameters.BFBayesFactor(). We added

  # ==== remove rows and columns with complete `NA`s
  out <- datawizard::remove_empty(out)

I think it's related to the new API of effectsize, which might return empty columns or rows that we didn't want to have? But we did not consider that the BF could be NA.

strengejacke commented 1 year ago

Good thing I copied the clean branch after the CRAN submission of parameters (https://github.com/easystats/parameters/tree/release_0_19_0_branch). The main branch now includes several breaking changes that are not properly tested and might break other packages, so if necessary we can apply the patch to release_0_19_0_branch and submit a fix to CRAN. :-)

IndrajeetPatil commented 1 year ago

I think it's related to the new API of effectsize, which might return empty columns or rows that we didn't want to have? But we did not consider that the BF could be NA.

Yes, the change in parameters is good. We just need to make a few exceptions, where necessary.

I have added tests and done some other preparations necessary for a new CRAN release of correlation.

Are there any other high-priority issues that need to be resolved before this happens?

cc @bwiernik