jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

Request for Letter-Based Representation of All-Pairwise Comparisons in post-hoc testing #342

Closed CJAB93 closed 6 months ago

CJAB93 commented 5 years ago

I was wondering whether post-hoc tests could be reported not only as pairwise-comparison p-values, but also as letter groups that summarize which levels differ significantly (like multcompView::multcompLetters does in R).

So instead of this:

    F     P
    2-1   0.0001
    3-1   0.7998
    3-2   0.0050

Also this:

    F   Level
    1   a
    2   b
    3   a

TimKDJ commented 5 years ago

@JohnnyDoorn: I think this is meant for ANOVAs?

CJAB93 commented 5 years ago

Yes, it's about the post-hoc comparison after performing an ANOVA. It would greatly improve the convenience for me (and I can only assume a lot of others) if there could be a letter-based grouping in addition to just the P-value for each comparison. Especially with a lot of comparisons (i.e. with >5 treatments) it is quite tough to manually calculate the letter-based grouping (necessary for graphs and tables), while it would/could be quite simple automatically. Thanks for responding/looking at the request!

CJAB93 commented 5 years ago

Any chance this could be implemented? @JohnnyDoorn

Puzzling out those significance levels manually is quite a pain...

JohnnyDoorn commented 5 years ago

Hi @CJAB93,

Thanks for your suggestion. Just to clarify, would you like to just be able to sort the table by significance? And how would this relate to letters?

Cheers, Johnny

CJAB93 commented 5 years ago

Hi @JohnnyDoorn ,

Thanks for your response.

What I would like is the following. If I analyse my data with JASP (ANOVA followed by a Tukey post-hoc test), I get the following: [image]

If I perform the same analysis using R, I get two things that I like: asterisks showing significance level, and letters showing which treatments differ from each other:

[image] [image]

This way, I can see within 5 seconds that treatments 5 and 7 differ from 1 (the control), but that 2, 3, 4 and 6 do not. Additionally, I can see at once that 5 and 7 do not differ from each other, but 7 differs from 1, 2, 3, 4 and 6, while 5 does not differ from 4 and 6.

These letters are very useful for reporting significance data in a graph:

[image]

I would like JASP to report those letters, so that I can use them in Excel-generated graphs and tables, for example. If you have many treatments, as in this example, it is quite confusing and time-consuming (virtually impossible, in fact) to 'calculate' these letters by hand. As a result, the people I recommended JASP to and I switch to other software to perform the ANOVA and Tukey test, since we use those letters quite often. For me R is an option, but we all really like the clean and straightforward interface of JASP, so we would like to use it for all our analyses.

I hope these examples clarify what I would like.

JohnnyDoorn commented 5 years ago

Thanks for explaining! One question still: It seems that those groups are automatically created - what package is it that outputs those groups and what is the grouping based on?

CJAB93 commented 5 years ago

@JohnnyDoorn Sorry forgot to respond! The R-package that creates those letters is multcompView. (multcompView::multcompLetters). The grouping is based on the (non-significant) p-values...I do not know how to explain it better than in the above post(s). I suppose researching the package would help you better than my explanations (https://cran.r-project.org/web/packages/multcompView/multcompView.pdf).

Example 1: If treatment 1 differs significantly from tr3, and tr2 differs from tr3, but tr1 and tr2 don't differ from each other, tr1 and tr2 get an "A" while tr3 gets a "B". If there is a tr4 which differs from tr1 AND from tr3, but NOT from tr2, tr4 gets a "C" and tr2 gets a "C" as well, so tr2 will have "AC"...

Example 2: If you look at the $comparison table in the above post, you'll see that tr 1, 2, 3, 4 and 6 do not differ significantly from each other in any of their pairwise comparisons. So they get the same letter ("a"). However, 1, 2 and 3 DO differ from 5, while 4 and 6 do not differ from 5. So 5, 4 and 6 share a letter as well ("b"). Etc.

So the grouping is based on the p-values, each group consisting of treatments that all have no significant differences between themselves. This group gets the same letter. But members can be in different groups at the same time, gaining multiple letters.
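
For readers following along, the grouping rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the algorithm multcompView itself uses: it treats every maximal set of mutually non-significant treatments as one letter group. The function name letter_groups, the 0.05 threshold and the p-values for Example 1 are hypothetical; the three-level p-values are those from the first post. Letter labels are arbitrary, so the "A / AC / B / C" of Example 1 comes out as "a / ab / c / b".

```r
# Assign grouping letters from pairwise p-values named "level-level".
# Sketch only: level names must not contain "-", and at most 26 groups are labelled.
letter_groups <- function(p, alpha = 0.05) {
  pairs <- strsplit(names(p), "-", fixed = TRUE)
  lv    <- sort(unique(unlist(pairs)))
  n     <- length(lv)

  # same[i, j] is TRUE when levels i and j are NOT significantly different
  same <- matrix(TRUE, n, n, dimnames = list(lv, lv))
  for (k in seq_along(p)) {
    a <- pairs[[k]][1]
    b <- pairs[[k]][2]
    same[a, b] <- same[b, a] <- p[[k]] >= alpha
  }

  # Collect all maximal sets of mutually non-different levels (Bron-Kerbosch)
  groups <- list()
  expand <- function(current, candidates, excluded) {
    if (length(candidates) == 0 && length(excluded) == 0) {
      groups[[length(groups) + 1]] <<- sort(current)
      return(invisible(NULL))
    }
    for (v in candidates) {
      nb <- setdiff(which(same[v, ]), v)
      expand(c(current, v), intersect(candidates, nb), intersect(excluded, nb))
      candidates <- setdiff(candidates, v)
      excluded   <- c(excluded, v)
    }
  }
  expand(integer(0), seq_len(n), integer(0))

  # Letter the groups in order of their first member
  first  <- vapply(groups, function(g) g[1], integer(1))
  groups <- groups[order(first)]
  out <- setNames(character(n), lv)
  for (i in seq_along(groups)) {
    out[groups[[i]]] <- paste0(out[groups[[i]]], letters[i])
  }
  out
}

# Example 1 (hypothetical p-values): only 1-2 and 2-4 are non-significant
p_example1 <- c("1-2" = 0.60, "1-3" = 0.001, "1-4" = 0.01,
                "2-3" = 0.002, "2-4" = 0.30, "3-4" = 0.004)
print(letter_groups(p_example1))   # 1 "a", 2 "ab", 3 "c", 4 "b"

# p-values from the first post
p_first_post <- c("2-1" = 0.0001, "3-1" = 0.7998, "3-2" = 0.0050)
print(letter_groups(p_first_post)) # 1 "a", 2 "b", 3 "a"
```

Because the letters depend only on which p-values fall below the threshold, two treatments sharing a letter says nothing about how close their means are.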

CJAB93 commented 5 years ago

@JohnnyDoorn Any progress/insight? :)

JohnnyDoorn commented 5 years ago

Hi @CJAB93,

Thanks for the reminder! I just finished rewriting the R code of the ANOVA analyses, and your suggestion was on my list of things to add. However, when implementing it, I noticed that the whole grouping method is based solely on the p-value. After discussing it with some team members, we are not sure whether we want to implement this method, as it can very quickly become misleading. I would still like to add some functionality that gives a better overview of the results, so I implemented a checkbox that marks significant results (at 0.05, 0.01 and 0.001), so that it is still very easy to see where the significant differences are. Another option could be some sort of grid plot that pits each group against the other groups and highlights where there are large or small, negative or positive effect sizes. Do you think something like these options would suffice?
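
As an illustration of the significance-marking idea (a sketch, not JASP's actual implementation), flags at the three thresholds mentioned above could be derived in base R as follows. The p-values are those from the first post; the function name sig_flag, the star symbols and the use of strict inequalities are assumptions.

```r
# Pairwise p-values from the first post
p <- c("2-1" = 0.0001, "3-1" = 0.7998, "3-2" = 0.0050)

# Map each p-value to a flag: *** (< .001), ** (< .01), * (< .05), else empty
sig_flag <- function(p) {
  ifelse(p < 0.001, "***",
  ifelse(p < 0.01,  "**",
  ifelse(p < 0.05,  "*", "")))
}

flags <- sig_flag(p)
print(data.frame(comparison = names(p), p = unname(p), flag = unname(flags)))
```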

Kind regards, Johnny

idhussain commented 4 years ago

@JohnnyDoorn Please include this. It saves time; one can better understand ranking of means. Please see below two presentations (as displayed above) and decide which one is easy to understand:

  1. https://user-images.githubusercontent.com/49509367/60804586-27f79900-a17e-11e9-8e75-0ea08f376d7a.png
  2. https://user-images.githubusercontent.com/49509367/60804687-55444700-a17e-11e9-81fa-796af32e2e41.png

These two presentations tell which pairs are significantly different. But #2 is very easy to understand (although #1 has some good information). In some cases, however, only #2 is required.

I use statistix. It is very easy there. Do you need output of statistix?

Include this feature. If you add this, hundreds of students at my university may switch from statistix to JASP!

EJWagenmakers commented 4 years ago

I am not sure #2 is easy to understand. I had to reread the explanation above. I do agree with you that some more structure would be nice -- for instance, I like Johnny's checkbox, but maybe more could be done. I particularly like the idea of creating some sort of figure to show this information. For instance, a matrix-style plot would be nice, with p-values on the panels above the diagonal and the jittered boxplot of the two relevant conditions on the corresponding lower panels.

idhussain commented 4 years ago

@EJWagenmakers @JohnnyDoorn

I have two points to discuss.

  1. It is a lot easier to understand #2 (https://user-images.githubusercontent.com/49509367/60804687-55444700-a17e-11e9-81fa-796af32e2e41.png). Let me explain why. Suppose the 7 treatments in the example are 7 different fertilizers that are being compared for crop yield (the response means). Which fertilizer(s) can we suggest based on statistical comparison? By looking at #2, one can easily see that the treatment means corresponding to fertilizers 3, 2, 1, 4 and 6 are statistically similar to each other (each is marked by the letter "a") but significantly higher than that of fertilizer 7 (marked by the letter "c"). The mean of fertilizer 5 is not significantly different from the means of fertilizers 4 and 6 on the upper end, and fertilizer 7 on the lower end. Hence, one can choose from fertilizers 3, 2, 1, 4 and 6.

Yes, #1 can also answer the fertilizer ranking question mentioned above (but it will take more time).

  2. This presentation style (#2) is widely followed in scientific articles in the field of plant sciences. For example, please see the figures in this recent paper: https://www.frontiersin.org/articles/10.3389/fpls.2020.00529/full.

Maybe, I am unable to write what I mean. Please ask me if the above text is hard to understand or you have any other question.

CJAB93 commented 4 years ago

"Coincidentally" I am also active in the field of plant science. In plant science this way of summarizing the results of statistical tests is extremely prevalent (by which I mean almost omnipresent). So prevalent that I took it for granted that every scientist is familiar with this way of reporting statistical differences. A few months back I was already quite surprised that I had to explain it to Johnny, but now that EJWagenmakers also indicates needing an explanation, I am convinced it is simply a 'plant science vs mathematical/statistical (?) science' thing, where each field has its own conventions.

That is why the lack of it makes JASP a lot less interesting for scientists active in the plant science field since it takes quite some time to generate the significance 'levels' (with a, b, etc) by hand.

Figures 3 and 4 in the paper mentioned by idhussain are nice examples of how these letters are used to report significant differences of a multiple comparison at a single glance (if you are aware of the conventions regarding these letters, of course).

JohnnyDoorn commented 4 years ago

Hi @CJAB93 and @idhussain ,

Back at the time, I already wrote some code to do this, so I will take a look at it again and see if this can be in the next major release!

Cheers, Johnny

idhussain commented 4 years ago

Hi @JohnnyDoorn Thank you for this. The letters are based on the "critical mean difference for comparison" (StatistiX calculates it). In example #2 (https://user-images.githubusercontent.com/49509367/60804687-55444700-a17e-11e9-81fa-796af32e2e41.png), the critical mean difference for comparison seems to be around 5.5 (although the exact value would be calculated by Tukey's HSD test). If the difference between any pair of means is more than 5.5, the two means are considered statistically different and are assigned different letters (say "a" and "b"); otherwise they share a letter. I hope this makes it clear.

idhussain commented 4 years ago

@EJWagenmakers @JohnnyDoorn @TimKDJ

Any luck with the addition of this feature?

Regards, Shahid

idhussain commented 4 years ago

@EJWagenmakers @JohnnyDoorn @TimKDJ

I request you to please add this feature. The following shows how it is done in R (in post-hoc tests, we need a letter-based ranking of means).

    library(agricolae)
    data(sweetpotato)
    model <- aov(yield ~ virus, data = sweetpotato)
    out <- LSD.test(model, "virus", p.adj = "bonferroni")
    out

    $statistics
        Mean      CV  MSerror     LSD
      27.625 17.1666 22.48917 13.4704

    $parameters
      Df ntr bonferroni alpha       test name.t
       8   4   3.478879  0.05 bonferroni  virus

    $means
          yield      std r       LCL      UCL  Min  Max
    cc 24.40000 3.609709 3 18.086268 30.71373 21.7 28.5
    fc 12.86667 2.159475 3  6.552935 19.18040 10.6 14.9
    ff 36.33333 7.333030 3 30.019601 42.64707 28.0 41.8
    oo 36.90000 4.300000 3 30.586268 43.21373 32.1 40.4

    $comparison
    NULL

    $groups
      trt    means  M
    1  oo 36.90000  a
    2  ff 36.33333  a
    3  cc 24.40000 ab
    4  fc 12.86667  b
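
The critical-difference rule described earlier in the thread can be checked against this output with base R alone. This is a sketch that assumes equal replication (r = 3) and takes MSerror, Df, ntr and the means from the output above; the variable names are illustrative. A Tukey HSD value is computed alongside for comparison only, since the output above uses a Bonferroni-adjusted LSD rather than Tukey's test.

```r
mse   <- 22.48917   # MSerror from $statistics
df    <- 8          # error degrees of freedom (Df) from $parameters
k     <- 4          # number of treatments (ntr)
r     <- 3          # replicates per treatment (column r in $means)
alpha <- 0.05

# Bonferroni-adjusted least significant difference over all pairs
m        <- choose(k, 2)
lsd_bonf <- qt(1 - alpha / (2 * m), df) * sqrt(2 * mse / r)

# Tukey's honestly significant difference, for comparison
hsd <- qtukey(1 - alpha, nmeans = k, df = df) * sqrt(mse / r)

print(round(c(LSD_bonferroni = lsd_bonf, Tukey_HSD = hsd), 4))

# Two means differ when their absolute difference exceeds the critical value
means  <- c(oo = 36.90000, ff = 36.33333, cc = 24.40000, fc = 12.86667)
differ <- abs(outer(means, means, "-")) > lsd_bonf
print(differ)
```

Only oo vs fc and ff vs fc exceed the critical difference, which is why cc shares a letter with both ends and is labelled "ab".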

TarandeepKang commented 8 months ago

Hi @JohnnyDoorn I must agree with the others that some additional representations of multiple comparisons (especially when there are very many) could be very useful! Whether CLDs (compact letter displays, as I think they're called) are the optimal solution remains open to debate, as you suggest above. Several seemingly more flexible implementations are available in emmeans, and the maintainers suggest some improvements over the default at the bottom of the linked page. I would be glad to hear what you think! See below:

https://rdrr.io/cran/emmeans/man/CLD.emmGrid.html

tomtomme commented 6 months ago

@idhussain @CJAB93 @TarandeepKang This is now in progress here: https://github.com/jasp-stats/jaspAnova/pull/321

idhussain commented 5 months ago

Is this function now available in the latest version available for download?

On Wed, Apr 24, 2024 at 6:06 PM Johnny van Doorn @.***> wrote:

Closed #342 https://github.com/jasp-stats/jasp-issues/issues/342 as completed via jasp-stats/jaspAnova#321 https://github.com/jasp-stats/jaspAnova/pull/321.


tomtomme commented 5 months ago

This is in 0.19beta, which should be available in a few weeks as 0.19final.

JohnnyDoorn commented 5 months ago

@idhussain if you want to check out the beta version, you can download the latest version here (depending on your OS): https://static.jasp-stats.org/Nightlies/

idhussain commented 5 months ago

Got it. I will try to use it. Thank you.

On Tue, May 21, 2024 at 2:55 PM Johnny van Doorn @.***> wrote:

@idhussain https://github.com/idhussain if you want to check out the beta version you can download the latest version here https://static.jasp-stats.org/Nightlies/ (depending on your OS)


idhussain commented 4 months ago

Thank you for this. I am using it now. Once the regular version is released, I will share that with my students.
