jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

Descriptive Plot in RMANOVA plots implausible measures #1280

Closed PelleLovesPeace closed 3 years ago

PelleLovesPeace commented 3 years ago

Hello,

I have conducted a 5x3 RMANOVA in JASP (version 0.13.1 under macOS Big Sur 11.3) with error rates (min = 0%; max = 100%). When looking at the Descriptives, I get mean error rates between 25% and 49% for every dependent variable (i.e. column).

After going through the statistics for the RMANOVA (which seemed plausible) I also made a descriptive plot here (3 factor levels on horizontal axis; 5 factor levels as separate lines). Now it looked as if some of the error rates were much higher (around 70%) and not in accordance with the descriptives calculated before.

I tried to replicate this case with simulated data and it seems that this visualization problem occurs only if there is a specific number of factor levels. Could this be some kind of visualization bug, or is the measure that is visualized in the plot simply not the mean or the median of each column?

Here is the simulated data and a screenshot of the descriptive plot from the RMANOVA and the Descriptives. TestRmanova.csv

TestRmanova

Thanks in advance for your help, Jelena

JohnnyDoorn commented 3 years ago

Hi @PelleLovesPeace,

If I understand correctly, you get a different plot in the RM ANOVA descriptives plot, compared to the descriptives plot from the descriptives module? If so, I cannot replicate this on the latest version of JASP (0.14.1) - see screenshot below. Perhaps you can try upgrading your JASP version, although I am not sure what could be going on.

Cheers Johnny

Screenshot from 2021-05-03 12-55-48

PelleLovesPeace commented 3 years ago

Hi @JohnnyDoorn

thanks for the quick reply! I've updated to the newest version now. But the problem is different from what you've described, and it is still there: in the table for the descriptives, the mean values are lower (and plausible) than what is shown in the descriptive plot (implausible). For example: mean(A2B1) = 42.138, but in the plot it looks like something around 50. So I was wondering whether what is shown in the plot is not the mean, or whether this could be some kind of bug in the visualization.

JohnnyDoorn commented 3 years ago

Hi @PelleLovesPeace,

Ah, I think I see. This is a tricky thing with RM ANOVA: the way the cells are assigned to the factor-level combinations. If you look at my screenshot again, you'll see that I made that mistake too. For instance, (A3,B1) is mapped to (A1,B3). I redid the analysis with the correct assignment, see the plot below: Screenshot from 2021-05-03 16-25-42
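To make the mix-up concrete, here is a minimal Python sketch of how filling the cell slots by position can mislabel cells. Everything in it is an assumption for illustration: a 3x3 design, column names of the form `A1B1`, slots listed with the last factor varying fastest, and file columns ordered with the first factor varying fastest. It is not a description of how JASP orders its cells internally.

```python
from itertools import product

a_levels = ["A1", "A2", "A3"]
b_levels = ["B1", "B2", "B3"]

# Cell slots in the order a dialog might list them
# (assumption: the last factor varies fastest).
slots = list(product(a_levels, b_levels))

# Columns in the order they might appear in the data file
# (assumption: the first factor varies fastest).
columns = [a + b for b, a in product(b_levels, a_levels)]

# Filling the slots top to bottom in file order silently mislabels cells.
by_position = dict(zip(slots, columns))

# Matching each slot to its column by name avoids the problem.
by_name = {(a, b): a + b for a, b in slots}

# Slots that received a column belonging to a different cell.
mismatched = {slot: col for slot, col in by_position.items() if by_name[slot] != col}

for slot, col in mismatched.items():
    print(f"slot {slot} received column {col}")
```

Under these assumptions the slot (A1,B3) receives column A3B1, which is the kind of swap described above. Only the cells where both level indices are equal end up in the right place.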

PelleLovesPeace commented 3 years ago

Hi @JohnnyDoorn

well, you're absolutely right about that! And for this simulated data it really seemed to solve the problem! I have now double-checked if this was also the problem for the "real" data. But here there seems to be yet another problem.

Sorry to "spam" with the unhandy variable names but I think this might be the easiest way to show you what I mean: So this is the plot for my error rates

Bildschirmfoto 2021-05-03 um 16 45 59

Apparently, some of them are very high, around 60%... And these are all the descriptives:

Bildschirmfoto 2021-05-03 um 16 46 44

Here, the highest mean value achieved is 49%..... So even if I had made the mistake of assigning the variables wrong, such high values should not actually appear in the plot.

JohnnyDoorn commented 3 years ago

Hi @PelleLovesPeace,

Are you able to share this data? That way I can take a closer look myself. You can also email it directly to me at j.b.vandoorn uva.nl. Alternatively, you can try updating your JASP version, because it might contain some improvements that would fix this issue (this is just a guess though).

Kind regards, Johnny

JohnnyDoorn commented 3 years ago

Hi @PelleLovesPeace

I am replying here for visibility. I see now that there is quite a lot of missing data in your dataset. That is no problem in itself, but it causes the RM ANOVA module and the descriptives module to behave differently:

This also means that your whole RM ANOVA analysis is based on only those 4 observations. Dealing with missing data through imputation is still on our to-do list, unfortunately, and it is not as clear-cut as just providing an RM ANOVA based on listwise deletion.
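As a rough illustration of why the two sets of means can differ, here is a small Python sketch with made-up numbers. It assumes that the descriptives table averages each column over whatever values are present, while the RM ANOVA (and its descriptives plot) only uses participants without any missing cell, i.e. listwise deletion. The cell names `c1` to `c3` and all values are hypothetical.

```python
from statistics import mean

# Hypothetical error rates (%) for three repeated-measures cells;
# None marks a missing value.
rows = [
    {"c1": 20.0, "c2": 30.0, "c3": None},
    {"c1": 30.0, "c2": None, "c3": 40.0},
    {"c1": 70.0, "c2": 60.0, "c3": 80.0},
    {"c1": 60.0, "c2": 70.0, "c3": 60.0},
    {"c1": None, "c2": 20.0, "c3": 30.0},
]
cells = ["c1", "c2", "c3"]


def column_means(data):
    """Mean of each cell over the values that are present in that column."""
    return {c: mean(r[c] for r in data if r[c] is not None) for c in cells}


# Listwise deletion: keep only participants with no missing cell at all.
complete = [r for r in rows if all(r[c] is not None for c in cells)]

available_case = column_means(rows)   # what a per-column descriptives table would show
listwise = column_means(complete)     # what a complete-case analysis would plot

print("complete participants:", len(complete))
print("per-column means:     ", available_case)
print("listwise means:       ", listwise)
```

In this toy data only two of the five participants are complete, and their means (65, 65, 70) are well above the per-column means (45, 45, 52.5). The same mechanism can push plotted means above every value in the descriptives table.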

Kind regards, Johnny

PelleLovesPeace commented 3 years ago

Hi @JohnnyDoorn

aaah I see. I was hoping that the missing data wouldn't be causing so much trouble. Your explanation however seems totally reasonable.

Thank you very much for solving this mystery!

Kind regards, Jelena