trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

nan_policy='omit' does not work in paired tests (t-test, Wilcoxon) while using Annotator.plot_and_annotate_facets #124

Open VladimirRoudko opened 1 year ago

VladimirRoudko commented 1 year ago

I stumbled over an error while dealing with NaN values in paired tests. The code:

data = df_freq
correction = 'BH'
tests = ['t-test_paired', 'Wilcoxon']
pal = sns.color_palette(['black'], len(data[hue_var].unique()))
pairs = [('C1D1', 'C2D1'), ('C1D1', 'C4D1'), ('C1D1', 'Progression')]
for test in tests:
    kwargs = {
        'plot_params': {'x': x_var, 'y': y_var, 'order': order_vars},
        'annotation_func': 'apply_test',
        'annotation_params': {'nan_policy': 'omit'},
        'configuration': {'test': test, 'comparisons_correction': correction, 'verbose': False},
        'plot': 'boxplot',
    }

    ant = Annotator(None, pairs)
    g = sns.FacetGrid(data=data, col=facet_var, col_wrap=6, height=4, aspect=1.5,
                      sharey=False, sharex=False, gridspec_kws={"wspace": .1, "hspace": .1})
    g.map_dataframe(sns.boxplot, x=x_var, y=y_var, fliersize=0, order=order_vars)
    g.map_dataframe(ant.plot_and_annotate_facets, **kwargs)
    g.map_dataframe(sns.swarmplot, x=x_var, y=y_var, size=2.5, color='black', order=order_vars)
    g.map_dataframe(sns.lineplot, x=x_var, y=y_var, hue=hue_var, palette=pal, linewidth=.8)
    g.map_dataframe(sns.pointplot, x=x_var, y=y_var, order=order_vars, estimator='mean', errorbar=None, color='orange')
    g.set_xticklabels(rotation=30, horizontalalignment='right')
    plt.tight_layout()
    plt.savefig(results_folder + '/CyTOF_absolute_freq_boxplot_' + test + '_' + correction + '.png', dpi=150, bbox_inches='tight')
    plt.close()

gives ValueError: unequal length arrays

df_freq is a dataframe with 10200 rows that yields 34 facets; each of the 4 groups used in pairs has 75 observations per facet (34 x 75 x 4 = 10200). Running the same data directly through scipy.stats.wilcoxon or scipy.stats.ttest_rel (the paired t-test) with nan_policy='omit' gives correct results.
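scipy.stats.ttest_rel reports "unequal length arrays" when its two samples differ in length, so one plausible explanation (an assumption, not something confirmed in this thread) is that missing values get dropped from each group separately before the test runs. The pure-Python sketch below uses made-up values and hypothetical helper names to contrast that independent filtering with the pairwise omission that nan_policy='omit' is expected to perform.

```python
import math


def drop_nan(values):
    """Drop NaNs from one group without looking at any other group."""
    return [v for v in values if not math.isnan(v)]


def omit_pairs(a, b):
    """Drop a pair when either member is NaN, keeping both groups aligned."""
    if len(a) != len(b):
        raise ValueError("unequal length arrays")
    kept = [(x, y) for x, y in zip(a, b)
            if not (math.isnan(x) or math.isnan(y))]
    return [x for x, _ in kept], [y for _, y in kept]


nan = float("nan")
# Hypothetical paired measurements: one value per subject and time point.
c1d1 = [1.0, 2.0, nan, 4.0]
c2d1 = [1.5, nan, 3.5, 4.5]
c4d1 = [1.2, 2.2, 3.2, 4.2]

# Independent filtering: the groups end up with different lengths,
# and the subject-to-subject pairing is lost.
independent = (drop_nan(c1d1), drop_nan(c4d1))

# Pairwise omission: only subjects with both values present are kept.
paired = omit_pairs(c1d1, c2d1)

print([len(g) for g in independent])
print(paired)
```

If the groups reach the paired test already filtered as in `independent`, no value of nan_policy could restore the pairing, which would be consistent with the error reported here.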

Is this a bug, or do I need to set the parameters differently?

Thank you, Vladimir

trevismd commented 1 year ago

Hi! If you want to pass arguments to the stat function, you should use stats_params, as in So, in your case with plot_and_annotate, that would be

Edit: this was incorrect; the way it was used in the original post is correct. There may be a bug.

VladimirRoudko commented 1 year ago

Thanks, Trevis, for the quick reply! I tried your suggestion, but I am running into another error. Here is the code:

data = df_freq
data_name = 'df_freq'
correction = 'BH'
tests = ['t-test_paired', 'Wilcoxon']
pal = sns.color_palette(['black'], len(data[hue_var].unique()))
pairs = [('C1D1', 'C2D1'), ('C1D1', 'C4D1'), ('C1D1', 'Progression')]
for test in tests:
    kwargs = {
        'plot_params': {'x': x_var, 'y': y_var, 'order': order_vars},
        'annotation_func': 'apply_test',
        'annotation_params': {'stats_params': {'nan_policy': 'omit'}},
        'configuration': {'test': test, 'comparisons_correction': correction, 'verbose': False},
        'plot': 'boxplot',
    }

    ant = Annotator(None, pairs)
    g = sns.FacetGrid(data=data, col=facet_var, col_wrap=6, height=4, aspect=1.5,
                      sharey=False, sharex=False, gridspec_kws={"wspace": .1, "hspace": .1})
    g.map_dataframe(sns.boxplot, x=x_var, y=y_var, fliersize=0, order=order_vars)
    g.map_dataframe(ant.plot_and_annotate_facets, **kwargs)
    g.map_dataframe(sns.swarmplot, x=x_var, y=y_var, size=2.5, color='black', order=order_vars)
    g.map_dataframe(sns.lineplot, x=x_var, y=y_var, hue=hue_var, palette=pal, linewidth=.8)
    g.map_dataframe(sns.pointplot, x=x_var, y=y_var, order=order_vars, estimator='mean', errorbar=None, color='orange')
    g.set_xticklabels(rotation=30, horizontalalignment='right')
    plt.tight_layout()
    plt.savefig(results_folder + '/Boxplot_facets_' + data_name + '_' + test + '_' + correction + '.png', dpi=150, bbox_inches='tight')
    plt.close()

TypeError: ttest_rel() got an unexpected keyword argument 'stats_params'

Could it be caused by version differences in a third-party package, scipy for example?

trevismd commented 1 year ago

I'm sorry, please disregard my previous comment, which wasn't helpful at all. I'll try again when I have more time available. It might be a bug. It would help if you could also show the complete error trace from the original code.

VladimirRoudko commented 1 year ago

No problem - here is the complete trace from the last error:

TypeError                                 Traceback (most recent call last)
Cell In [198], line 20
     17 g = sns.FacetGrid(data=data,col=facet_var,col_wrap=6,height=4, aspect= 1.5,\
     18                   sharey=False,sharex=False,gridspec_kws={"wspace":.1,"hspace":.1})
     19 g.map_dataframe(sns.boxplot,x=x_var,y=y_var,fliersize=0,order=order_vars)
---> 20 g.map_dataframe(ant.plot_and_annotate_facets, **kwargs)
     21 g.map_dataframe(sns.swarmplot,x=x_var,y=y_var,size=2.5,color='black',order=order_vars)
     22 g.map_dataframe(sns.lineplot,x=x_var,y=y_var,hue=hue_var,palette=pal,linewidth=.8)

File ~/work/software/homebrew/lib/python3.10/site-packages/seaborn/axisgrid.py:819, in FacetGrid.map_dataframe(self, func, *args, **kwargs)
    816 kwargs["data"] = data_ijk
    818 # Draw the plot
--> 819 self._facet_plot(func, ax, args, kwargs)
    821 # For axis labels, prefer to use positional args for backcompat
    822 # but also extract the x/y kwargs and use if no corresponding arg
    823 axis_labels = [kwargs.get("x", None), kwargs.get("y", None)]

File ~/work/software/homebrew/lib/python3.10/site-packages/seaborn/axisgrid.py:848, in FacetGrid._facet_plot(self, func, ax, plot_args, plot_kwargs)
    846 plot_args = []
    847 plot_kwargs["ax"] = ax
--> 848 func(*plot_args, **plot_kwargs)
    850 # Sort out the supporting information
    851 self._update_legend_data(ax)

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/Annotator.py:848, in Annotator.plot_and_annotate_facets(self, plot, plot_params, configuration, annotation_func, annotation_params, ax_op_before, ax_op_after, annotate_params, *args, **kwargs)
    846 self.new_plot(ax, plot=plot, **plot_params, data=kwargs['data'])
    847 self.configure(**configuration)
--> 848 getattr(self, annotation_func)(**annotation_params)
    849 self.annotate(**annotate_params)
    851 _apply_ax_operations(ax, ax_op_after)

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/Annotator.py:320, in Annotator.apply_test(self, num_comparisons, **stats_params)
    316 stats_params = dict()
    318 self.perform_stat_test = True
--> 320 self.annotations = self._get_results(num_comparisons=num_comparisons,
    321 **stats_params)
    322 self._deactivate_configured_warning()
    324 return self

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/Annotator.py:477, in Annotator._get_results(self, num_comparisons, pvalues, **stats_params)
    474 group2 = group_struct2['group']
    476 if self.perform_stat_test:
--> 477 result = self._get_stat_result_from_test(
    478 group_struct1, group_struct2, num_comparisons,
    479 **stats_params)
    480 else:
    481 result = self._get_custom_results(group_struct1, pvalues)

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/Annotator.py:615, in Annotator._get_stat_result_from_test(self, group_struct1, group_struct2, num_comparisons, **stats_params)
    611 def _get_stat_result_from_test(self, group_struct1, group_struct2,
    612 num_comparisons,
    613 **stats_params) -> StatResult:
--> 615 result = apply_test(
    616 group_struct1['group_data'],
    617 group_struct2['group_data'],
    618 self.test,
    619 comparisons_correction=self.comparisons_correction,
    620 num_comparisons=num_comparisons,
    621 alpha=self.alpha,
    622 **stats_params
    623 )
    625 return result

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/stats/test.py:74, in apply_test(group_data1, group_data2, test, comparisons_correction, num_comparisons, alpha, **stats_params)
     71 else:
     72 get_stat_result = StatTest.from_library(test)
---> 74 result = get_stat_result(
     75 group_data1, group_data2, alpha=alpha, **stats_params)
     77 # Optionally, run multiple comparisons correction that can independently be
     78 # applied to each pval
     79 if comparisons_correction is not None and comparisons_correction.type == 0:

File ~/work/software/homebrew/lib/python3.10/site-packages/statannotations/stats/StatTest.py:77, in StatTest.__call__(self, group_data1, group_data2, alpha, **stat_params)
     74 def __call__(self, group_data1, group_data2, alpha=0.05,
     75 **stat_params):
---> 77 stat, pval = self._func(group_data1, group_data2, *self.args,
     78 **{**self.kwargs, **stat_params})[:2]
     80 return StatResult(self._test_long_name, self._test_short_name,
     81 self._stat_name, stat, pval, alpha=alpha)

File ~/work/software/homebrew/lib/python3.10/site-packages/scipy/stats/_axis_nan_policy.py:502, in _axis_nan_policy_factory.<locals>.axis_nan_policy_decorator.<locals>.axis_nan_policy_wrapper(***failed resolving arguments***)
    500 if sentinel:
    501 samples = _remove_sentinel(samples, paired, sentinel)
--> 502 res = hypotest_fun_out(*samples, **kwds)
    503 res = result_to_tuple(res)
    504 res = _add_reduced_axes(res, reduced_axes, keepdims)

TypeError: ttest_rel() got an unexpected keyword argument 'stats_params'
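Reading the frames from top to bottom, annotation_params appears to be unpacked into apply_test as keyword arguments, and apply_test in turn forwards every extra keyword to the SciPy function. The sketch below reproduces that forwarding with stand-in functions (`ttest_rel_stub` and this `apply_test` are hypothetical simplifications, not the library's code) to show why nesting the options under a 'stats_params' key would end in this TypeError, while the flat form from the original post would reach the test as nan_policy.

```python
def ttest_rel_stub(a, b, nan_policy="propagate"):
    """Stand-in for a SciPy test function that only accepts nan_policy."""
    return ("stat", "pval", nan_policy)


def apply_test(num_comparisons="auto", **stats_params):
    """Stand-in: forwards every extra keyword to the test function."""
    return ttest_rel_stub([1.0], [2.0], **stats_params)


# Flat form (as in the original post): nan_policy reaches the test function.
flat = {"nan_policy": "omit"}
ok = apply_test(**flat)

# Nested form: the key 'stats_params' itself is forwarded as a keyword.
nested = {"stats_params": {"nan_policy": "omit"}}
message = ""
try:
    apply_test(**nested)
except TypeError as exc:
    message = str(exc)

print(ok)
print(message)
```

Under these assumptions the nested dictionary explains only the second error; the original "unequal length arrays" error occurs with the flat form and needs a separate explanation.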

trevismd commented 1 year ago

Hello Vladimir, I actually meant the trace from the first error, as my previous answer was incorrect. But I think I found the problem. Is it possible that you have different Python environments with different versions of scipy? The nan_policy parameter is a newer addition; could it be missing from the environment where you use statannotations? I suspect that calling .apply_test('Wilcoxon', nan_policy='omit') outside of a FacetGrid also raises the error?
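One way to answer the environment question is to print the interpreter path and the installed package versions from inside the same notebook or script that produces the plots. This is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and the package list is an assumption about what is relevant here.

```python
import sys
from importlib import metadata


def installed_versions(packages=("scipy", "statannotations", "seaborn")):
    """Return {package: version or None} for the active environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed for this interpreter.
            versions[name] = None
    return versions


# The interpreter actually running this code, then the versions it sees.
print(sys.executable)
print(installed_versions())
```

If the interpreter path or the scipy version differs from what `pip show scipy` reports in a terminal, the plots are being produced in a different environment.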

VladimirRoudko commented 1 year ago

Hi,

I have the latest stable version of scipy, 1.10.1. The statannotations package is version 0.5.0.

Interestingly, the call you suggested trying outside of FacetGrid worked:

annotator = Annotator(g, pairs, **hue_plot_params)
annotator.configure(test="Wilcoxon", verbose=False).apply_test('Wilcoxon', nan_policy='omit')

These two lines of code didn't give any errors. Could it be that within FacetGrid I have to use a different syntax to pass the nan_policy parameter?