trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

Kruskal-Wallis with Posthoc Dunn implementation #39

Open sepro opened 2 years ago

sepro commented 2 years ago

I noticed that the implementation of the Kruskal-Wallis test is somewhat different from what is typically seen in the literature (at least in my field).

Statannotations applies a pairwise test to the different groups (note that it is not a paired test, even though that is what the printed output says). Commonly, the Kruskal-Wallis test is done on all groups at once to check whether there is any difference, and the specific differences are then picked up with a post-hoc Dunn's test. I wrote some code earlier today with a Python implementation, which can be found here; note the difference in the results.

Is there an interest in implementing some of these post-hoc tests? Scikit-posthocs has the stats needed. At first glance, this could likely be added to Annotator._get_results, or by creating a PosthocAnnotator class that does something along the lines of my code and pipes that into the Annotator. I could give that a shot if there is sufficient interest.

trevismd commented 2 years ago

Hello! Thank you for your message!

You are right on several points, here are my thoughts:

  1. There is a mistake in the output message of the test, it is indeed for independent samples. You may submit a PR or I can fix that quickly, as you prefer.
  2. This test, as it is currently supported by statannotations, should only be applied to two samples, no more. This is not explicit in the code/documentation, and maybe it should be for less statistically-inclined users (could be in the same PR). It is basically the reason why a chi-square test is not yet implemented (see the discussion in #32). Statannotations would probably benefit from these "cross-multi-samples" omnibus tests, as I suggested there, and you are one more person confirming the interest in this.
  3. Of course, post-hoc tests will then be interesting (it is a relatively big project for the package). To answer your question about implementing those: I'd say it could already be useful to people who calculated an omnibus test elsewhere and then want to annotate using a post-hoc test. However, it would likely be more useful if available as a complete feature together with the initial test (ANOVA, KW, etc.). If I read correctly, you may also prepare a function to be used as a statannotations test function, performing all the steps in your code, such as the example here: https://github.com/trevismd/statannotations/blob/bbab40825731ba3387d3aae3cf3adce0debb5b03/statannotations/stats/StatTest.py#L9. Maybe a repository of additional custom functions would be appreciated by the community? Additionally, we should be mindful of managing the dependencies. For example, Dunn's test is on the roadmap for scipy's development, and it would be easier not to add scikit-posthocs to our mix. That said, if the rest of the features are developed and scipy is not ready, we would of course use it. As for the rest of the codebase, statannotations code should therefore not depend much on the specific implementation.

What do you think?

sepro commented 2 years ago
  1. Yes, this one should be tackled. It is an easy fix; just the message needs to be corrected. I've created a small PR correcting the message.
  2. Thomas Wiecki wrote somewhere in a PyMC3 tutorial that "most people doing statistics aren't statisticians" (that applies to me as well). I think it would be fair to show a warning in case Kruskal-Wallis is used with more than two groups, as you can expect people to use this library without in-depth knowledge of the underlying statistics.
  3. A few things to unpack here:
    • Kruskal-Wallis + Dunn is one example; ANOVA + TukeyHSD and others would indeed be useful additions. These are available from scikit_posthocs.
    • These cannot be implemented like the current tests in StatTest.py, as the current tests expect exactly two groups to compare, not the full set of groups (which would be a prerequisite for KW, ANOVA, ...). _get_results runs all (or all selected) pairs and stores the p-values, if I see it correctly. I think one option to support post-hoc tests is to adapt _get_results to check whether a post-hoc test is required; if so, run this style of test (which would need to be added), and if not, run the current implementation.
    • Despite the name, scikit_posthocs does not depend on scikit-learn; its dependencies are pandas, numpy, matplotlib, seaborn, scipy, and statsmodels. So with the exception of the latter, I think every project running statannotations will already need those.
    • It also depends on which direction you want to take this project. I can also see another package wrapping this one specifically to do post-hoc tests, or simply a few examples (like the one I made) added to the documentation/tutorial, in which case there is no need to include this here.
trevismd commented 2 years ago

Hello! So, in the same order:

  1. Thanks for taking care of that, I'll tweak a few things and it should be merged soon.
  2. Agreed, but it should be a generic message that can be shown for all tests with a specific attribute set.
  3. In the same order as your points:
    • 1/2: I meant that pairwise post-hoc tests such as Dunn's test can be performed with statannotations. I'll make a gist to show you. So if you ran KW with another tool and then wanted to annotate pairs on a chart with Dunn's test, you can.
    • 3: I wrote scipy, but the same applies to scikit_posthocs (although it is much smaller), i.e. if you don't copy the code (a bad idea), it is an additional dependency. (Note: statsmodels is already an optional dependency for multiple comparisons correction support, so we could indeed do the same with scikit_posthocs for tests available only there.)
    • 4: I definitely would like to have omnibus tests and post-hoc options available here; I just have to find some time to do it right. That's why I see the examples repo as a temporary solution to make these features available sooner.
trevismd commented 2 years ago

Point 1 covered in #40/#42, thanks!

rorraro commented 2 years ago

Guys, molecular biologist here, impressed by the huge work you are all doing. My project requires using the same code for repeated experimental variations, and automatically annotating significances in sns plots following KW + Dunn's from 3 or more groups is exactly what I am looking for (KW + Bonferroni does not really work well statistically, apparently...).

So I'm really looking forward to it if you guys come up with a user-friendly solution, in the style of ...test='Kruskal', comparisons_correction="Dunn's"..., for 3 or more groups, for guys like me with very limited Python expertise.

Thanks again for all your effort!

sepro commented 2 years ago

I can't take credit for the work on statannotations, but I'm happy to help where I can as this is a feature that would be used all over imho.

I have been adding some examples of how to do things you often see in papers on my blog, under the tag Code Nugget, especially for people with some, but not a lot of, experience. I will add the combination of KW, Dunn, and statannotations there as well, probably with ANOVA and TukeyHSD too (eventually; we just had a baby, so time is limited right now).

rorraro commented 2 years ago

That is very kind of you, sir. I will be eagerly waiting for you to release this Kruskal + Dunn for 3 or more groups with statannotations. And my best wishes for the newborn member of the family!

sepro commented 2 years ago

Finally got around to writing up a post about how to use the current version of statannotations with scikit-posthocs: http://blog.4dcu.be/programming/2021/12/30/Posthoc-Statannotations.html. If there are no short-term plans to support post-hoc tests, it might be an option to include these as an example in the documentation.

rorraro commented 2 years ago

Many thanks, Sepro! Very nice tutorial. It is a bit over my current level of Python, but I am sure that by following your steps I will be able to make it through.

Very, very, very grateful to you. You are a legend, sir!

Looking forward to the day someone makes this a one-liner code package, as you point out in your conclusion.

rorraro commented 2 years ago

Unfortunately, I haven't been able to get the posthoc_dunn() command right... I am trying to understand why.

"LCRsHamburgDMW" is my df, "Disease" are my categories and "SUM all Bands" are the numerical values that I want to plot agains "Disease" categories.

However, when I create this following your tutorial, I get an error.

[Two screenshots attached showing the code and the resulting error.]

sepro commented 2 years ago

The argument group_col in the function posthoc_dunn should be set to "Disease".

The variables DiseaseHZDMW and dataHZDMW are created for the Kruskal-Wallis test; you don't need them later on (you try to reference the former, but this is incorrect, as you should specify a column in the original dataframe).

rorraro commented 2 years ago

Many thanks @sepro, that worked beautifully!!

I'll keep going down this path. Hopefully there will be no more blocks because of my lack of expertise.

Cheers!

rorraro commented 2 years ago

Your magic worked, Sebastian! Thanks so much! Now I have another challenge for the future: trimming the data to only significant results. Haha.

[Screenshot of the result attached.]

rorraro commented 2 years ago

I think I got it by creating a molten_df_trim like

molten_df_trim = molten_df[molten_df["value"] < 0.05]

Would that be conceptually correct?

[Screenshot of the result attached.]

sepro commented 2 years ago

Yes, that is the way to do this. Do mention that only significant results are shown but that all pairwise combinations were tested (e.g. in the figure caption), as the number of comparisons affects the correction for multiple testing.

rorraro commented 2 years ago

Oh yes, good point! I'll remember that.

By the way, I am also trying to eliminate the annotation pairs that I do not need, because they are comparisons that do not make sense in my study.

I have come up with this solution. My understanding of Python is very low, so I know this is not an elegant solution, but I think it worked, just in case anyone else needs it. It comes after "molten_df" has been established.


# Reset index
molten_df_ri = molten_df.reset_index()
molten_df_ri

# Create a list of "non-useful" pairs and trim the list, removing rows with unuseful pairs
nonuseful = [here the row numbers of the pairs that are not needed, separated by commas]
molten_df_ri_trim = molten_df_ri.drop(molten_df_ri.index[nonuseful])
molten_df_ri_trim

# Eliminate non-significant pairs
molten_df_final = molten_df_ri_trim[molten_df_ri_trim["value"] < 0.05]
molten_df_final


"molten_df_final" will go in here as provided by @sepro

--

pairs = [(i[1]["index"], i[1]["variable"]) for i in molten_df_final.iterrows()] p_values = [i[1]["value"] for i in molten_df_final.iterrows()]

--

And here are the results

[Screenshot of the final annotated plot attached.]

sepro commented 2 years ago

This discussion is getting a little far off topic, could you send me an email? There are contact details on my blog.

rorraro commented 2 years ago

Thanks, Sebastian! I apologise. I thought it could be useful for newbies like me.

Over and out!

sepro commented 2 years ago

> # Create a list of "non-useful" pairs and trim the list, removing rows with unuseful pairs
> nonuseful = [here the row numbers of the pairs that are not needed, separated by commas]
> molten_df_ri_trim = molten_df_ri.drop(molten_df_ri.index[nonuseful])
> molten_df_ri_trim

Comparisons that don't make sense should be excluded before correcting for multiple testing, so the quoted part is incorrect. There are ways to do this, though I don't think this is the place to discuss them, as it is far beyond the scope of statannotations.

rorraro commented 2 years ago

Many thanks, Sebastian! You have been extremely helpful!