Effect size; JASP vs jamovi

NikitaKhromov-Borisov commented 4 years ago

JASP version: 0.12.2
OS name and version: Win10
Analysis: T tests
Bug description: Cohen's d based on the Student test (d=11.8) and on the Welch test (d=1.8) differ drastically, whereas jamovi reports the same value for both (d=11.8). JASP_jamovi.zip
Expected behaviour: I don't know

EJWagenmakers commented 4 years ago

@JohnnyDoorn can you take a look please?

JohnnyDoorn commented 4 years ago

Hi @NikitaKhromov-Borisov

Cohen's d is the standardized mean difference between the two groups. In the case of the Student's t-test (where equal variances are assumed), this means dividing the mean difference by the pooled standard deviation.

For the Welch t-test, there is no assumption of equal variances, so instead of dividing by the pooled standard deviation, we divide by the averaged standard deviation, which better accounts for large differences in variances.

It seems that in your data, there is a large difference between the variances and group sizes, so it would also make sense that taking this difference into account (i.e., using Welch instead of Student's t-test) leads to a difference in effect size.

And here is some R-code, using the rstatix R-package, for reproducing JASP's results:

library(rstatix)
dat <- read.csv("~/Downloads/welchEffectSize.csv")

cohens_d(data = dat, formula = Gal ~ Group, var.equal = FALSE) # cohen's d = -1.18
cohens_d(data = dat, formula = Gal ~ Group, var.equal = TRUE) # cohen's d = -11.8
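To make the two denominators concrete, here is a minimal base-R sketch that computes both versions by hand. The data are simulated and purely hypothetical (they are not the attached file), and the "averaged" denominator is taken to be the square root of the mean of the two group variances, which is how the rstatix documentation describes the `var.equal = FALSE` case as far as I can tell.

```r
# Hypothetical data: a large group with small spread, a small group with large spread.
set.seed(1)
x <- rnorm(40, mean = 10, sd = 0.5)
y <- rnorm(8,  mean = 20, sd = 6)

n1 <- length(x); n2 <- length(y)
v1 <- var(x);    v2 <- var(y)

# Student version: pooled SD, each variance weighted by its degrees of freedom.
sd_pooled <- sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
d_student <- (mean(x) - mean(y)) / sd_pooled

# Welch version (assumed form): square root of the unweighted mean of the variances.
sd_avg  <- sqrt((v1 + v2) / 2)
d_welch <- (mean(x) - mean(y)) / sd_avg

print(c(student = d_student, welch = d_welch))
```

Because the pooled SD is dominated by the larger group, a large low-variance group paired with a small high-variance group gives a small pooled SD and therefore a larger |d| for the Student version than for the Welch version.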

I'm closing this issue now, but please reopen/respond if something is still unclear =)

Cheers, Johnny

NikitaKhromov-Borisov commented 4 years ago

Dear Johnny, thank you.

I feel that it would be reasonable to estimate the SES (standardized effect size) with the bootstrap. Moreover, in the non-parametric case it seems better to use the known standardization of the Mann-Whitney U statistic. See the attachment. An SES based on the rank-biserial correlation is expressed on a scale that is difficult to compare with other SES measures, and it is not as intuitive.
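As an illustration of the bootstrap idea, the following is a minimal sketch of a percentile bootstrap interval for Cohen's d in base R. The data are simulated, the helper `cohens_d_avg` is a hypothetical name, and the averaged-variance denominator is an assumption; this is not how JASP computes anything.

```r
set.seed(42)
x <- rnorm(30, mean = 0,   sd = 1)   # hypothetical group 1
y <- rnorm(30, mean = 0.8, sd = 2)   # hypothetical group 2

# Cohen's d with the averaged-variance denominator (hypothetical helper).
cohens_d_avg <- function(a, b) {
  (mean(a) - mean(b)) / sqrt((var(a) + var(b)) / 2)
}

# Resample each group independently, with replacement, B times.
B <- 2000
boot_d <- replicate(B, cohens_d_avg(sample(x, replace = TRUE),
                                    sample(y, replace = TRUE)))

d_hat <- cohens_d_avg(x, y)                            # point estimate
ci    <- quantile(boot_d, probs = c(0.025, 0.975))     # 95% percentile interval

print(d_hat)
print(ci)
```

The percentile interval is the simplest choice; bias-corrected variants exist but need more machinery than fits in a short sketch.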

I have been waiting for several years for you to correct the calculation of Pearson's phi contingency coefficient for contingency tables. It is still not calculated (the result is NA). Moreover, it is highly desirable to calculate CIs (confidence intervals) for every index and parameter: for phi, for SES in ANOVA, etc. It is very strange that CIs are not presented in the Descriptives module. It seems illogical to apply the Bayesian approach but not to calculate credible intervals.

I have observed several times that bootstrap estimates in post hoc comparisons contradict the obtained p-value. See the attached file, where the contradiction is highlighted in bold red. I have also attached templates suggested for presenting the results of statistical comparisons.

Please delete the space in exponential notation such as 7.4e -4; it should be 7.4e-4, otherwise the user is forced to fix this by hand. I have asked you several times to avoid reporting p<0.001. All p-values should be reported exactly. Strictly speaking, p<0.001 means only that the p-value lies somewhere between 0 and 0.001 (!?). Exact p-values are required at least when they are adjusted for multiple comparisons.
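For reference, the phi coefficient of a 2x2 table can be checked by hand. The sketch below assumes that "phi" means sqrt(chi-squared / n) with the uncorrected Pearson chi-squared statistic; the table is made up for illustration.

```r
# Hypothetical 2x2 contingency table.
tab <- matrix(c(30, 10,
                15, 25), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"), outcome = c("yes", "no")))

# Pearson chi-squared without Yates' continuity correction.
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
n    <- sum(tab)
phi  <- sqrt(chi2 / n)

# Signed form for a 2x2 table: (ad - bc) / sqrt(product of the four margins).
phi_signed <- (tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

print(c(phi = phi, phi_signed = phi_signed))
```

For this table both forms agree in magnitude, which is a quick sanity check on any software output.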

Best regards, Nikita

On Fri, May 22, 2020 at 13:50, JohnnyDoorn notifications@github.com wrote:

Hi @NikitaKhromov-Borisov https://github.com/NikitaKhromov-Borisov

Cohen's d is the standardized mean difference between the two groups. In the case of the Student's t-test (where equal variances are assumed), this means dividing the mean difference by the pooled standard deviation.

For the Welch t-test, there is no assumption of equal variances, so instead of dividing by the pooled standard deviation, we divide by the averaged standard deviation, which better accounts for large differences in variances.

It seems that in your data, there is a large difference between the variances and group sizes, so it would also make sense that taking this difference into account (i.e., using Welch instead of Student's t-test) leads to a difference in effect size.

You can read more about this here: https://www.datanovia.com/en/lessons/t-test-effect-size-using-cohens-d-measure/#cohens-d-for-welch-test

And here is some R-code, using the rstatix R-package, for reproducing JASP's results:

library(rstatix)
dat <- read.csv("~/Downloads/welchEffectSize.csv")

t.test(Gal ~ Group, data = dat, var.equal = TRUE)
cohens_d(data = dat, formula = Gal ~ Group, var.equal = FALSE) # cohen's d = -1.18
cohens_d(data = dat, formula = Gal ~ Group, var.equal = TRUE) # cohen's d = -11.8

I'm closing this issue now, but please reopen/respond if something is still unclear =)

Cheers, Johnny


EJWagenmakers commented 4 years ago

Hi Nikita,

Thanks for your comments -- I can deal with a few real quick. In the future, it would be best for us if we have one issue per post, otherwise it is difficult to assign these issues and keep track of them. Issues I can deal with here:

If you want real p-values you can set this in Preferences -> Results -> Display exact p-values.
I am not sure why we have the space in the exponential notation, but it has been discussed before -- we'll discuss it again (@vandenman @AlexanderLyNL)
CIs are not presented in the Descriptives module because CIs depend on a model -- they are an inference. We do present SE in Descriptives, and some in the team already felt this was crossing a line :-)
We do provide credible intervals for our Bayesian analyses, although perhaps not by default. We're going to revamp the Bayesian analyses anyway so this will be clearer in the future.
We'll take a look at phi for the contingency table -- @vandenman ?
I'm a fan of the bootstrap but not if there are analytic expressions available.

Cheers, E.J.

jasp-stats / jasp-issues

Effect size; JASP vs jamovi #756
