How to plot facet grid with hue argument?

JohannesWiesner commented 9 months ago

Hi! I am not sure if I am just doing something wrong, which is why I am opening another issue here. However, this is related to #120.

I would like to use a FacetGrid plot to give my figure more structure. Specifically, I have a data frame where I would like to test for receptor expression differences between brain regions of interest and non-regions of interest (using the hue argument). I would like to do this for n different receptors (r1, r2, ... rn). On top of that, receptors can be assigned to broader receptor groups, which should be visualized as different subplots within my facet grid (one column for each receptor group, but a maximum of 2 columns). See this beautiful hand-made preview:

[image: hand-made preview of the desired layout]

And here's my dataframe:

expression.csv

Is it possible to achieve this with statannotations? I am not sure, because the example does not include the hue-argument and I am not sure if this creates problems. I tried

# pairs must be defined before it is passed to the Annotator
pairs = [((receptor,'roi'),(receptor,'non-roi')) for receptor in expression_long['receptor'].unique()]

annot = Annotator(None, pairs)

g = sns.FacetGrid(expression_long, col='receptor_group', height=12, sharey=False)

plot_params = {'x':'expression',
               'y':'receptor',
               'hue':'roi',
               'hue_order':['roi','non-roi'],
               'orient':'h'}

g.map_dataframe(annot.plot_and_annotate_facets,
                plot='boxplot',
                plot_params=plot_params,
                configuration={"test": "Mann-Whitney"},
                annotation_func="apply_test")
plt.show()

but this gives me:

ValueError: Missing group value CHRM1 in receptor (specified in pairs)

trevismd commented 9 months ago

Yes, statannotations works well with the hue argument in FacetGrid too, but pairs are defined at plot level, so the x and hue values should be the same across plots, which is not the case for you here. The "today" solution for you would perhaps be to define subplots to create your desired layout and then use the "regular" plot + statannotations on each subplot, as you'll have different pairs to compare in each one. (See this post: https://www.statology.org/seaborn-subplots/) Something like this:

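import matplotlib.pyplot as plt
import seaborn as sns
from statannotations.Annotator import Annotator

# expression_long is the long-format data frame from the question
annot = Annotator.get_empty_annotator()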
plot_params = {
    'x':'receptor',
    'y':'expression',
    'hue':'roi',
    'hue_order':['roi','non-roi'],
}
receptor_groups = expression_long['receptor_group'].unique()
sns.set_palette("Paired")  # color_palette() only returns a palette; set_palette() applies it
with sns.plotting_context("paper"):
    fig, axes = plt.subplots(4, 2, figsize=(20,  30))
    for ax_row_idx, ax_row in enumerate(axes):
        for ax_col_idx, ax in enumerate(ax_row):
            ax_idx = ax_row_idx * 2 + ax_col_idx
            if ax_idx >= len(receptor_groups):
                ax.set_axis_off()
                continue
            ax_group = receptor_groups[ax_idx]
            expression_long_group = expression_long.loc[expression_long.receptor_group==ax_group, :]
            group_receptors = expression_long_group['receptor'].unique()

            sns.boxplot(ax=ax, data=expression_long_group, **plot_params)
            annot.new_plot(
                ax,
                data=expression_long_group,
                pairs=[((receptor,'roi'),(receptor,'non-roi')) for receptor in group_receptors],
                plot='boxplot',
                **plot_params
            ).configure(test="Mann-Whitney").apply_and_annotate()

            ax.set_title(ax_group)
            if len(group_receptors) > 10:
                ax.set_xticklabels(labels=ax.get_xticklabels(), rotation=45)
plt.show()

Which results in this approximation of your diagram :) Tweaking spacing, legends, and group ordering, and maybe using the last row for your larger group (look for add_subplot), should enable you to get there though.

[image: expression]

JohannesWiesner commented 9 months ago

Perfect, thanks so much for the code! The only issue that I see right now is that the multiple comparisons correction is now done within each group and not over all receptors, right?

trevismd commented 9 months ago

Of course!

This is correct, but it is also the case with plot_and_annotate_facets (I should make that clearer).

Depending on the correction method, you can fix this by either

- passing a num_comparisons option (like for Bonferroni), or
- running the stats beforehand and then using set_pvalues on each subplot instead. In that case, you'll have to:
1. Compute all the pairs you use in the plots
2. Plot a chart with receptors of all groups, but using the pairs described above
3. Collect the p-values for each pair
4. Use these when you're making the "real" plot as drafted above.
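
The "correct once over everything, then split per subplot" part of the second option can be sketched roughly as below. This is only an illustration, not statannotations code: the group names, receptor names, and p-values are made-up placeholders, the helper functions holm_adjust and correct_across_groups are hypothetical, and Holm is used simply as an example of a correction that needs all p-values at once.

```python
# Hypothetical raw (uncorrected) p-values, keyed by receptor group and receptor.
# In practice these would be collected from the test (e.g. Mann-Whitney) run on
# every roi/non-roi pair; the names and numbers here are placeholders.
raw_pvalues = {
    "group_a": {"r1": 0.010, "r2": 0.040},
    "group_b": {"r3": 0.030, "r4": 0.005, "r5": 0.200},
}


def holm_adjust(pvalues):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Step-down: the smallest p-value is multiplied by m, the next by m - 1, ...
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def correct_across_groups(pvalues_by_group):
    """Pool the p-values of all groups, correct once, and split back by group."""
    keys = [(group, receptor)
            for group, by_receptor in pvalues_by_group.items()
            for receptor in by_receptor]
    pooled = [pvalues_by_group[group][receptor] for group, receptor in keys]
    corrected = {}
    for (group, receptor), p in zip(keys, holm_adjust(pooled)):
        corrected.setdefault(group, {})[receptor] = p
    return corrected


corrected_pvalues = correct_across_groups(raw_pvalues)
for group, by_receptor in corrected_pvalues.items():
    # In the subplot loop, these per-group values (ordered like that subplot's
    # pairs) are what would be handed to set_pvalues instead of running a test.
    print(group, by_receptor)
```

With these placeholder numbers, r1 is adjusted to about 0.04 when all five comparisons are pooled, whereas correcting within group_a alone would give about 0.02, which is the difference discussed above.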

JohannesWiesner commented 9 months ago

> This is correct, but it is also the case with plot_and_annotate_facets (I should make that clearer).

Ah, interesting! Yes, I think making that clearer would help a lot :)

> passing a num_comparisons option (like for Bonferroni)

That sounds like a good idea, but it would only work for methods that do not need to know all the p-values beforehand, right?

JohannesWiesner commented 8 months ago

I would love it if this worked out of the box! The general idea here is that you often want to plot stuff using FacetGrid for better readability, but you don't want the multiple comparisons correction to be done within each subplot.

trevismd / statannotations

How to plot facet grid with hue argument? #135