trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

Non bonferroni comparisons corrections have multiple significances? #30

Closed soorajachar closed 2 years ago

soorajachar commented 2 years ago

Hello, thank you for this updated version of statannot; it's been very useful. I am currently having an issue with non-Bonferroni multiple comparisons corrections: they seem to give me two significances like this (with Holm-Bonferroni; the "* (ns)" label is what seems strange to me):

[Screenshot: plot annotated using the Holm-Bonferroni correction]

vs. the bonferroni correction, which only gives me a single significance for each comparison:

[Screenshot: the same plot annotated using the Bonferroni correction]

Strangely, I do get this warning when using Holm-Bonferroni, but not with Bonferroni alone:

/opt/anaconda3/lib/python3.7/site-packages/statannotations/_Plotter.py:338: UserWarning: Invalid x-position found. Are the same parameters passed to seaborn and statannotations calls? or are there few data points?
  "Invalid x-position found. Are the same parameters passed "

Thanks in advance.

trevismd commented 2 years ago

Hello @soorajachar, Thank you for your comment and your questions.

About the annotation produced

I hope it is now at least more clear. Please tell me if it is not the case, if there is a possible bug, or if you'd prefer things to be handled differently in some way.

About the warning message

This warning is not linked to the correction method. It seems that you are overlaying the bars with data points, but each bar only has 3 corresponding points. This is what "or are there few data points?" refers to. You can see that the three points are not in the same x-position in both charts, and they would not always be in the same place on each plot if you were to re-run them. Usually, it will not result in any problem (and the warning can be ignored), but in some cases, I suspect it could lead to weird-looking brackets spanning several bars. In these cases, re-running the plot can do the trick. The warning is there to make sure the plots are checked in some cases, such as 'batch' processing.
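For batch processing, one way to act on this is to record which plots triggered the warning so that only those are checked by eye. The following is a minimal sketch using only the standard `warnings` module; `run_and_flag` and `plot_fn` are hypothetical names, and the matched text is taken from the warning message quoted above.

```python
import warnings
from typing import Any, Callable, Tuple

# Start of the warning message quoted in this issue.
NEEDLE = "Invalid x-position found"


def run_and_flag(plot_fn: Callable[[], Any], needle: str = NEEDLE) -> Tuple[Any, bool]:
    """Run a plotting callable and report whether the warning was raised.

    `plot_fn` is a hypothetical zero-argument function that draws and
    annotates one plot. Other warnings raised inside it are captured too
    and are not re-emitted by this helper.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Make sure repeated warnings are not silenced by the default filter.
        warnings.simplefilter("always")
        result = plot_fn()
    flagged = any(
        issubclass(w.category, UserWarning) and needle in str(w.message)
        for w in caught
    )
    return result, flagged
```

A batch loop could then collect the names of the flagged plots and re-run or inspect only those.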

soorajachar commented 2 years ago

Thank you, I believe that makes things clearer. To clarify one point, though: does this mean that with type I corrections (like Holm-Bonferroni, Benjamini-Hochberg, etc.) the corrected p-value will always be either significant "*" or not significant "ns"? There will no longer be levels of significance (**, ***, etc.)? If so, it may be good to add a non-default option to the Annotator class to only show the corrected significance for these type I corrections (either * or ns) and discard the original significance, as it can get a bit crowded to show both at once with many comparisons, and it is also a bit difficult to explain.

trevismd commented 2 years ago

No, for these types of corrections, all the thresholds configured (and shown on the star notation legend) are still used, so you could also have ** (ns), *** (ns), and so on.

Really, these methods only add these "(ns)" where appropriate.

Isn't it fairly easy to explain like this?

The idea is to keep displaying the raw p-value. I guess there could be an option to report a corrected one and have the same behaviour as with "Bonferroni", although to me we lose information in favor of a little more clarity.
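To make the two labelling behaviours concrete, here is a minimal sketch in plain Python. It is not the statannotations implementation: the threshold table, the function names (`stars`, `holm_reject`, `holm_labels`, `bonferroni_labels`) and the example p-values are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Assumed star thresholds as (upper bound, label), checked in order.
THRESHOLDS = [(1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (0.05, "*")]


def stars(p: float) -> str:
    """Map a p-value to star notation using the assumed thresholds."""
    for bound, label in THRESHOLDS:
        if p <= bound:
            return label
    return "ns"


def holm_reject(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure: True where the null hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k).
        if pvalues[idx] > alpha / (m - rank):
            break  # every larger p-value is also not rejected
        reject[idx] = True
    return reject


def holm_labels(pvalues: Sequence[float], alpha: float = 0.05) -> List[str]:
    """Stars from the raw p-value, plus " (ns)" if Holm does not reject."""
    labels = []
    for p, rejected in zip(pvalues, holm_reject(pvalues, alpha)):
        label = stars(p)
        if label != "ns" and not rejected:
            label += " (ns)"
        labels.append(label)
    return labels


def bonferroni_labels(pvalues: Sequence[float]) -> List[str]:
    """Stars from the Bonferroni-corrected p-value only."""
    m = len(pvalues)
    return [stars(min(1.0, p * m)) for p in pvalues]


raw = [0.004, 0.03, 0.04]
print(holm_labels(raw))        # ['**', '* (ns)', '* (ns)']
print(bonferroni_labels(raw))  # ['*', 'ns', 'ns']
```

In this sketch the Holm labels keep every star level of the raw p-value and only append "(ns)", whereas the Bonferroni labels are computed from the corrected p-value, so the raw level is no longer visible.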

soorajachar commented 2 years ago

Right, I meant that the type I corrections would never have anything besides an (ns) as the corrected threshold (there would never be a "* (*)", for example). I agree that the (ns) is useful to have, and I think it should be the default option, but it can hurt readability when conducting exploratory analysis with a lot of comparisons, so I think it would still be useful to have an option to turn off the original p-values and just display the ns when doing a type I correction.

trevismd commented 2 years ago

It's a good idea, thank you for sharing your thoughts!

I think I can come up with an implementation fairly quickly. I hope you'll give some feedback then too ;-)

soorajachar commented 2 years ago

No problem thanks for listening to my feedback. Let me know if I can be of any help.

trevismd commented 2 years ago

> No problem thanks for listening to my feedback. Let me know if I can be of any help.

Thanks! Would you have the time to review my proposal in #31, or just see if it works for you?

soorajachar commented 2 years ago

Looks good to me; I think adding the option for custom formatting of the "ns" string is an especially good idea as different journals have different formatting standards for reporting multiple comparisons corrections.

liviu- commented 4 months ago

> The idea is to keep displaying the raw p-value. I guess there could be an option to report a corrected one and have the same behaviour as with "Bonferroni", although to me we lose information in favor of a little more clarity.

Is it not possible to show both the raw and the adjusted p-values, at least when using verbose output (not in the figure)? The corrected p-value is useful to report as well.
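Independently of what the library prints, both values can be computed outside the figure. The sketch below derives Holm-adjusted p-values in plain Python and formats them next to the raw ones; `holm_adjust`, `report`, the pair names and the p-values are hypothetical, and this does not describe any existing statannotations option.

```python
from typing import List, Sequence


def holm_adjust(pvalues: Sequence[float]) -> List[float]:
    """Holm-adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Scale the k-th smallest p-value by (m - k), cap at 1, and enforce
        # that adjusted values never decrease along the sorted order.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def report(pairs: Sequence[str], pvalues: Sequence[float]) -> List[str]:
    """One text line per comparison with raw and adjusted p-values."""
    return [
        f"{pair}: raw p={raw:.3g}, Holm-adjusted p={adj:.3g}"
        for pair, raw, adj in zip(pairs, pvalues, holm_adjust(pvalues))
    ]


for line in report(["A vs B", "A vs C", "B vs C"], [0.004, 0.03, 0.04]):
    print(line)
```

The adjusted values computed this way can be reported in a table or figure legend alongside the raw-p-value annotation.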