trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

Add bootstrap statistical tests #4

Closed JosephLalli closed 3 years ago

JosephLalli commented 3 years ago

Hi Florian,

I hope I'm doing this right! This first pull request focuses on adding a bootstrapping statistical test. The user supplies 'n_bootstraps' to the add_stat_annotation function, and can specify either 'bootstrap' or 'paired_bootstrap' as the test.

In tests.py, I have added these two tests to the if/elif switch, along with custom functions that implement them. If you know of a third-party package that implements these better than I have, I'm happy to use that package instead.

'bootstrap' draws n_bootstraps samples with replacement from box_data_a and from box_data_b. Each sample is the same size as the dataset it was drawn from, and the mean is calculated for each sample. I then compare samples_a > samples_b, which gives 1 if true and 0 if false, so the average of this comparison is the fraction of samples for which it is true. I do the same for a < b, take the smaller of the two fractions, and double it (because we are performing a two-sided test). 'paired_bootstrap' does the same thing, except that we calculate sample_a - sample_b and count how often the difference is greater than or less than 0.
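The procedure described above can be sketched roughly as follows. This is not the code from the PR: the function names, the `seed` parameter, and the clamp of the p-value at 1.0 are assumptions made for illustration, and the paired version assumes that the resampling is done on the per-pair differences. It uses only the standard library so that it runs on its own.

```python
import random
from statistics import fmean


def bootstrap_means(data, n_bootstraps, rng):
    """Means of n_bootstraps resamples drawn with replacement, each len(data) long."""
    n = len(data)
    return [fmean(rng.choices(data, k=n)) for _ in range(n_bootstraps)]


def two_sided_pvalue(frac_greater, frac_less):
    """Double the smaller tail fraction; the clamp at 1.0 is an added safeguard."""
    return min(1.0, 2 * min(frac_greater, frac_less))


def bootstrap_pvalue(data_a, data_b, n_bootstraps=1000, seed=None):
    """Unpaired sketch: compare independently resampled means of a and b."""
    rng = random.Random(seed)
    means_a = bootstrap_means(data_a, n_bootstraps, rng)
    means_b = bootstrap_means(data_b, n_bootstraps, rng)
    # Fraction of resamples where mean(a) > mean(b), and where mean(a) < mean(b).
    frac_greater = sum(a > b for a, b in zip(means_a, means_b)) / n_bootstraps
    frac_less = sum(a < b for a, b in zip(means_a, means_b)) / n_bootstraps
    return two_sided_pvalue(frac_greater, frac_less)


def paired_bootstrap_pvalue(data_a, data_b, n_bootstraps=1000, seed=None):
    """Paired sketch (assumed reading): resample the differences a_i - b_i."""
    if len(data_a) != len(data_b):
        raise ValueError("paired data must have the same length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(data_a, data_b)]
    mean_diffs = bootstrap_means(diffs, n_bootstraps, rng)
    # Fraction of resampled mean differences above and below zero.
    frac_greater = sum(m > 0 for m in mean_diffs) / n_bootstraps
    frac_less = sum(m < 0 for m in mean_diffs) / n_bootstraps
    return two_sided_pvalue(frac_greater, frac_less)
```

With a finite number of resamples, the result can be exactly 0 when the two groups never overlap, so in practice the smallest reportable p-value depends on n_bootstraps.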

I have tried to add appropriate documentation. I have supplied a dataset from a project I am about to publish that uses statannot throughout, and added examples to the example.ipynb file to show that these functions work and are more highly powered than Mann-Whitney nonparametric tests. In doing so I regenerated your example pngs, so those 'changes' are not real changes.

You can expect several more pull requests like this in the upcoming days.

Thanks, Joe Lalli

JosephLalli commented 3 years ago

Also, I nearly forgot, one of my examples uses a hue value, and dodges the hues. This is because I like to just create a kwarg variable 'fig_args' and give the kwargs to both seaborn and statannotation. This works well most of the time, but I was getting an error due to 'dodge' not being a kwarg for add_stat_annotation.

Statannotation generates a figure w/ a hue using 'dodge=True' as default, so I have just added dodge as a potential kwarg and set the default to True.
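To illustrate the issue with a shared kwargs dictionary, here is a minimal sketch. The three functions are hypothetical stand-ins, not the real seaborn or statannot signatures; they only show why a function that does not declare `dodge` fails when it receives the same `fig_args` as the plotting call, and how declaring `dodge=True` avoids that.

```python
def plot_stub(data, x, y, hue=None, dodge=True):
    """Hypothetical stand-in for a plotting call that accepts dodge."""
    return {"x": x, "y": y, "hue": hue, "dodge": dodge}


def annotate_without_dodge(data, x, y, hue=None):
    """Hypothetical annotation function that does not declare dodge."""
    return {"x": x, "y": y, "hue": hue}


def annotate_with_dodge(data, x, y, hue=None, dodge=True):
    """Same function after adding dodge as a keyword with default True."""
    return {"x": x, "y": y, "hue": hue, "dodge": dodge}


# One dictionary of arguments shared by the plot and the annotation calls.
fig_args = dict(data=[], x="day", y="total_bill", hue="smoker", dodge=True)

plot_result = plot_stub(**fig_args)

try:
    annotate_without_dodge(**fig_args)
    error_message = ""
except TypeError as err:
    # Python rejects the keyword the function does not declare.
    error_message = str(err)

annotate_result = annotate_with_dodge(**fig_args)
```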

trevismd commented 3 years ago

Hi Joe, as you can see, I had some time available and have now integrated your contribution on axes coordinates.

I see the value of the PR, of course, but also two problems:

It could however be a good fit for using the possibility of working with other functions than those already supported.

Your functions could be made available in several ways:

What do you think?

Also, to track your contribution, you are welcome to submit a separate PR for the dodge parameter so you can get the credits. Otherwise, I'll do it and refer to this.

edit: a typo

trevismd commented 3 years ago

Hey Joe, I hope you understand why I'm closing this PR.
Please do submit this material in another form, as I suggested above.
As an example of one of those ideas, I also made a package for permutation-based statistics (permutations-stats, https://github.com/trevismd/permutations-stats), and I made a gist to show how to use it with statannotations (https://gist.github.com/trevismd/f556d83f6efdad249f995eb65daeb1d9).