jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Feature Request]: Mean and CI in RainCloud plots #1698

Closed Trebicky closed 5 months ago

Trebicky commented 2 years ago

Description

Let the user choose whether to show the boxplot or the mean and its confidence interval.

Purpose

When comparing means (and their CIs), I want to visualise them in a corresponding raincloud plot.

Use-case

Running any mean comparison test.

Is your feature request related to a problem?

When plotting data for a given analysis, the statistics being analysed should be visible in the plot. Thus, when comparing means, the median and percentiles don't help.

Describe the solution you would like

Let user choose whether to show boxplot or mean+CI (maybe in the graph settings dialogue?)

Describe alternatives that you have considered

Showing both the boxplot and mean+CI as a default in Raincloud plots

Additional context

I like raincloud plots, and I see the advantage in showing raw data points, their density and some central tendency with associated dispersion.

But frequently, the mean does not overlap with the median, and IQR does not represent the confidence interval.

When I run parametric tests for mean differences (having normally distributed data and fulfilling all assumptions...) I want to visualise the associated statistics: means and CIs. When I run non-parametric tests, I would prefer to show the median and IQR.

Let the user pick which central tendency and dispersion to show in rainclouds.
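
To make the point above concrete, here is a minimal Python sketch (JASP itself is built on R; the sample values are made up for illustration and are not data from this thread). It computes the mean, median, IQR and a normal-approximation 95% CI of the mean for a right-skewed sample, showing that the two summaries describe different things.

```python
import math
import statistics

# Hypothetical right-skewed sample (e.g. reaction times in ms); illustrative only.
sample = [310, 325, 330, 342, 350, 361, 375, 390, 420, 480, 610, 950]

mean = statistics.mean(sample)
median = statistics.median(sample)

# Quartiles; "inclusive" treats the sample min and max as the 0th and 100th percentiles.
q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1

# Normal-approximation 95% CI of the mean: mean +/- z * SD / sqrt(n).
z = statistics.NormalDist().inv_cdf(0.975)
se = statistics.stdev(sample) / math.sqrt(len(sample))
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean:.1f}  median={median:.1f}")
print(f"IQR=[{q1:.1f}, {q3:.1f}]  95% CI of mean=[{ci_low:.1f}, {ci_high:.1f}]")
```

In a skewed sample like this one the mean sits above the median, and the IQR (spread of the data) and the CI (uncertainty about the mean) have different widths.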

juliuspfadt commented 2 years ago

Hi @Trebicky, thanks for the request. Let's see if Don can help.

EJWagenmakers commented 2 years ago

We have people helping out with raincloud plots in the near future, so let's leave this one for them. EJ

EJWagenmakers commented 9 months ago

OK, so the issue I have is that rainclouds are meant as descriptive tools, not inferential tools. So adding the mean is OK (although unclear how much that adds over the median, which is already displayed), but the CI is an inference, and it will change based on the underlying model (in case of more groups). But maybe we could have an inferential version of the raincloud, where mean and SE or CI is shown. Hmm I am not against it, but it should not be the default option.

Trebicky commented 9 months ago

Browsing the other listed threads, I see that other users would like to see Raincloud plots listed in Descriptives (+1 from me). As said, it is supposed to be descriptive only; let's move it under Descriptives.

However, as it is now in the inferential section, having the option to show inferential information would be well received (as mentioned in multiple threads).

vincentott commented 9 months ago

I am currently working on a raincloud plots module and will soon look into also implementing means. I find it very interesting to hear that there is demand for means and CIs in raincloud plots, because I myself have also been wondering to what extent it would make sense to implement them.

tomtomme commented 9 months ago

To add another perspective to the discussion - I do not see how any statistical measure or graph can be seen as "only" descriptive or "only" inferential. For example:

a) Descriptive table with means, SDs and n's: if meanDiff and n are big and SD is small, I can already infer a big signal-to-noise ratio, aka big t, small p.

b) Same for scatterplots or rainclouds, even without CIs: I can use them to graphically infer deviations from normality or variance homogeneity. And I can also see the signal-to-noise ratio roughly. At least for simple models. For multivariate stuff, Flexplot does a nice job of enabling inference from graphs with added variable plots.

And so it does not really matter what the inventors thought, how their creations were supposed to be used. Once the tool is out there, it will be used and misused in all ways thinkable. We can try to guide the user, but in this case I do not think it is harmful to use graphs as inferential tools, quite the opposite. They add rich information to all the t, z, F, BF, p, VSMPQ measures and enable a more robust inference.

tomtomme commented 9 months ago

To get an overview and to close very related issues - while the following is already partly available via our flexplot module - we might want via rainclouds in an extra descriptives module:

linking other open issues related to rainclouds:

vincentott commented 9 months ago

Do you think it would be a good design choice - and that users would be fine with it - if users can choose EITHER between box plots OR means with standard deviations?

I think that would make everything a lot tidier. Unless you think there is a pressing need to show both (though EJ already pointed out above that simultaneous Median and Mean would not add much), I won't ponder about the implementation for now.

At the moment I am keeping the boxplots as they are defined: with medians, the middle 50%, and whiskers according to the 1.5 * interquartile distance. Because I assume that this is what most people will understand when they see a boxplot. And anything else might be misleading, even with a corresponding note?
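
For reference, the boxplot definition described above (median, middle 50%, whiskers at 1.5 times the interquartile distance) can be sketched in Python as follows. This is only an illustration, not the module's code; the function name `boxplot_stats` and the data are hypothetical, and the quartile method is an assumption (plotting libraries differ slightly in how they compute quartiles).

```python
import statistics

def boxplot_stats(values):
    """Median, quartiles and 1.5 * IQR whisker ends for a standard boxplot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical data with one extreme value.
stats = boxplot_stats([2, 3, 4, 5, 5, 6, 7, 8, 9, 30])
print(stats)
```

Here the value 30 falls beyond the upper fence, so it is drawn as an outlier and the upper whisker stops at 9.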

The means, standard deviations (and potentially CIs) are visualized differently with a ggplot diamond shape (square rotated by 90 degrees) and outgoing whiskers.

Could it be misleading to offer standard deviations? Because even with a note they will look like CIs at first glance? Thus EITHER boxplots OR Means (+ CIs?) but without SDs?

Also, are there ways to make the rainclouds more appealing to Bayesians? What measure of central tendency do they usually use (Means or Median)?

Instead of implementing various statistical modeling under the hood in the raincloud plots module, maybe we can have the user simply specify the CREDIBLE or CONFIDENCE intervals themselves? By default there could be boxplots and if users want to visualize their statistical model, then they can do so with the appropriate input?

tomtomme commented 9 months ago

EJ said, mean and CI should not be default. I see it the same way. CIs could be an option for the user to choose. Flexplot (from the visual modeling module) handles this nicely via a dropdown that shows: image

This shows

In the raincloud case the options would be a bit different (according to the issues sent in by JASP users). The following order from top to bottom in the dropdown might be:

raincloud with a

But I do not think that the order is really important. The important thing is that the old raincloud with standard boxplot should be the default, and that the user has options to change that. Also, a way to change the 95% to SE or 99% or something else would be sensible and in line with JASP's options in other modules, but this may at the same time be too much for a module under Descriptives. And it is debatable whether we even need to include all 4 options. I think it would be cool, because choice is good and I know my way around. Beginners might be overwhelmed. And who am I to decide. You are implementing the module. I am just communicating the issues I found here.

And there are also lots of other improvements you could copy over from flexplot, like the sliders for dot jitter, dot transparency, or even dot thickness, to also touch on the other parts of the raincloud. And the same goes for the density plot: there could be options to show a histogram or even a bar plot too, depending on the type of data, and then you might want to change bar width and color and all. So there is a myriad of options out there to tackle the raincloud, and I bet you will choose wisely which to implement, since this is what JASP devs have proven in the past. All over JASP I see a good balance between "keep it simple" and "give the user options".

Trebicky commented 9 months ago

The option to choose Boxplot or Mean + [a measure of dispersion] would be great, solving most requests.

I see that the general inclination is to keep the plot as a descriptive tool only. However, as the rain clouds are inside an inferential test module, people will want to use them to visualize the inference they are making within the ANOVA (or whatever means comparisons test).

Providing SD is nice, but not an inferential error measure... 95% CI is the current expectation.

If it is supposed to stay descriptive, move it into the Descriptives module...

Trebicky commented 9 months ago

+1 for Tom's post above.

EJWagenmakers commented 9 months ago

  1. We do have descriptive plots to accompany the inferential tests, and for good reason. For instance, when you test a correlation (inference) the user also sees a scatterplot (description), because the scatterplot provides a visual check on whether the inference makes sense. I would not want to relegate all descriptive plots to Descriptives, because it makes it easier for people to forget to do some fundamental data inspection. See Anscombe's quartet for my favorite demonstration.
  2. One immediate issue is that if you want a box-plot style inferential plot, it would ideally be sensitive to the underlying model. I can see it being useful, but I have not encountered it before. I wonder whether there is a good reason for this. Suppose I show a boxplot or the individual data points, and then put a mean and some interval in there as well. It is natural to think of the interval as a property of the spread of the dots. However, when the interval is an SE or 95% CI, it is actually a property of the mean of the dots. So the individual dots can be widely varying, but if there are a sufficient number of them the CI will nonetheless be very narrow. This may cause confusion. And we already allow plots of the 95% CIs, just not with the individual data added on to it. So I can see @vincentott 's point. I also think we can delay this decision. Let's first finish the regular raincloud plot, and keep discussing the value of the additional options. Choice is good, but confusion is bad. The case would be more compelling if the "raincloud/boxplot + inferential measures about the mean"-plot was already in common use -- but I have not seen it.

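
The second point can be illustrated with a small simulation. This is a rough Python sketch under assumed values (normal data with mean 100 and SD 15, a normal-approximation interval); the helper name `spread_and_ci` is hypothetical. It shows the SD of the dots staying roughly constant while the 95% CI of the mean shrinks as n grows.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible
z = statistics.NormalDist().inv_cdf(0.975)

def spread_and_ci(n):
    """Return (SD, 95% CI half-width of the mean) for n simulated observations."""
    sample = [random.gauss(100, 15) for _ in range(n)]
    sd = statistics.stdev(sample)
    return sd, z * sd / math.sqrt(n)

results = {n: spread_and_ci(n) for n in (20, 200, 2000)}
for n, (sd, half_width) in results.items():
    print(f"n={n:5d}  SD={sd:5.1f}  CI half-width={half_width:4.1f}")
```

The SD describes the spread of the individual dots, while the CI half-width scales with 1/sqrt(n) and describes only the uncertainty about the mean.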
vincentott commented 8 months ago

I am currently implementing the following options:

Mean only
Mean ± 1 standard deviation
Mean ± 1 standard error of the mean
Mean with confidence interval (see jaspDescriptives module)
  -> option width: 95%
  -> option method: normal model, t model, bootstrap

Potentially I will have the time to implement the option to specify custom interval limits. Because at the moment, the confidence interval assumes that all cells of the design are independent of each other (only between factors).

The default will still be boxplots and they will be default boxplots (i.e., whiskers, 50% box, median).
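
As a rough illustration of two of the interval methods listed above (normal model and bootstrap), here is a Python sketch. It is not the jaspDescriptives implementation; the function names, the percentile-bootstrap variant, the number of resamples and the data are all assumptions. The t model is left out because Python's standard library has no t quantile function. Like the options above, it treats the group as independent observations.

```python
import math
import random
import statistics

def normal_ci(values, level=0.95):
    """Normal-model CI of the mean: mean +/- z * SD / sqrt(n)."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

def bootstrap_ci(values, level=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap CI of the mean (resampling with replacement)."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = means[math.floor(alpha * (n_resamples - 1))]
    upper = means[math.ceil((1 - alpha) * (n_resamples - 1))]
    return lower, upper

# Hypothetical scores for one group.
group = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 5.7, 4.4, 5.2]
print("normal model:", normal_ci(group))
print("bootstrap:   ", bootstrap_ci(group))
```

A custom-limits option, as mentioned above, would sidestep this choice by letting the user supply the lower and upper bounds from their own model.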

tomtomme commented 5 months ago

@Trebicky This is now available via 0.19 beta. If you want to test the new functionality pre-release, see: https://static.jasp-stats.org/Nightlies/ Cheers :)

Trebicky commented 5 months ago

🤩
