Open chrisroat opened 2 years ago
Something like this is on the roadmap, but I don't think this is the right API. Better to accept a list of grouping variables (e.g. common_norm=["col", "hue"]), which […] jointplot) […] size/style).
Thanks for the follow-up. I like the more general approach. If I understand what you propose, then for a "facet" normalization one would use ["row", "col"]?
If it exists, can you provide a link to the roadmap piece that would cover this?
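To make the list-based idea concrete, here is a minimal pure-Python sketch of what normalizing within a list of grouping variables would mean. The helper normalize_counts and the toy records are hypothetical and only illustrate the arithmetic; this is not how seaborn implements common_norm.

```python
from collections import Counter

def normalize_counts(records, cell_vars, norm_vars):
    """Return {cell: proportion}; proportions sum to 1 within each
    combination of norm_vars (an empty list normalizes over everything).
    norm_vars must be a subset of cell_vars."""
    cells = Counter(tuple(r[v] for v in cell_vars) for r in records)
    totals = Counter(tuple(r[v] for v in norm_vars) for r in records)
    idx = [cell_vars.index(v) for v in norm_vars]
    return {
        cell: n / totals[tuple(cell[i] for i in idx)]
        for cell, n in cells.items()
    }

# Toy data (made up): "col" is a facet variable, "hue" a color variable,
# and "x" the binned value being counted.
records = [
    {"col": "a", "hue": "m", "x": 1},
    {"col": "a", "hue": "m", "x": 2},
    {"col": "a", "hue": "f", "x": 1},
    {"col": "b", "hue": "m", "x": 1},
]
cell_vars = ["col", "hue", "x"]

whole_figure = normalize_counts(records, cell_vars, [])           # like common_norm=True
per_facet = normalize_counts(records, cell_vars, ["col"])         # one total per facet
per_group = normalize_counts(records, cell_vars, ["col", "hue"])  # like common_norm=False
```

Under this reading, ["row", "col"] would indeed give the "facet" normalization: one denominator per facet, shared by all hue levels inside it.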
Thinking about the related issue, Plotting conditional distribution with different hues, and looking at the code, I think it would be impossible using sns.FacetGrid(), because sns.FacetGrid() splits the data according to col, row, and hue, so the plotting function never sees the totals it would need to compute a probability conditioned on col, row, etc.
The easiest workaround would be to calculate the estimated conditional probabilities yourself and plot the bars more directly. Here's an example:
import matplotlib.pyplot as plt
import seaborn as sns

# BankWages is assumed to be an already-loaded DataFrame with
# 'minority', 'gender', and 'job' columns.
BankWages['gender'] = BankWages['gender'].astype('category')
# Group by the variable(s) you want to condition on.
# With pandas >= 2.0 the normalized counts land in a column named 'proportion'.
df_plot = BankWages.groupby(['minority'])[['gender', 'job']].value_counts(normalize=True).reset_index()

def plt_bar(x, y, hue, **kwargs):
    # FacetGrid.map passes its own color; drop it so each hue level gets its own.
    kwargs.pop('color', None)
    ax = plt.gca()
    for icat, cat in enumerate(hue.cat.categories):
        color = sns.color_palette()[icat]
        ax.bar(x=x[hue == cat], height=y[hue == cat], color=color, **kwargs)
    return ax

g = sns.FacetGrid(df_plot, col='minority')
g.map(plt_bar, 'job', 'proportion', 'gender', width=0.8, alpha=0.5)
For plots like those produced by setting multiple='dodge', it seems to require more code, something like this grouped bar chart.
But then again, if we could bring the common_norm= parameter to sns.FacetGrid() and make some exceptions in how the data are split, we might be able to draw bar plots of conditional probabilities, conditioned on the variables listed in common_norm=. After all, displot() seems to use sns.FacetGrid().
I wonder what kind of plot needs more than row-, col-, or hue-level conditional estimates?
To be complete, the objects interface accepts the list-based common_norm values mentioned earlier, e.g. common_norm=["row", "col"] will group only on the row and column variables, disregarding the color.
@kwhkim the issue with your suggestion is that common_norm is a parameter of histplot specifically, which complicates things. Also, with your new example, why are you not using a catplot with kind="bar", then?
@thuiop catplot seems to be a great suggestion! And it works fine. I thought it only worked for visualizing means or similar summary statistics.
g = sns.catplot(df_plot,
                kind='bar', col='minority', hue='gender',
                hue_order=['male', 'female'],
                x='job', y='proportion',
                height=2, aspect=7/2/2)  # , errorbar=None)
I think "the final piece of the puzzle" would be visualizing a heat map or 2D histogram with different conditional probabilities, something like:
sns.displot(data=BankWages, row='minority',
            x='job', y='gender', cbar=True,  # cbar: colorbar
            height=3, aspect=1*3/2,
            stat='probability', common_norm=False)
This one looks impossible to solve without sns.FacetGrid() and sns.heatmap()... Can seaborn objects solve this?
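Whatever the answer, the per-facet normalization for a 2D view can be computed by hand before plotting. A minimal pandas sketch, using a small made-up stand-in for BankWages (the values below are toy data, not the real dataset): count each (minority, job, gender) cell, then divide by the total of its minority level so that each facet sums to 1.

```python
import pandas as pd

# Toy stand-in for BankWages; the values are made up for illustration.
df = pd.DataFrame({
    "minority": ["no", "no", "no", "no", "yes", "yes"],
    "job": ["admin", "admin", "manage", "custodial", "admin", "custodial"],
    "gender": ["female", "male", "male", "male", "female", "male"],
})

# Count every (minority, job, gender) cell.
counts = df.groupby(["minority", "job", "gender"]).size()

# Divide each cell by the total of its minority level (one total per facet).
probs = counts / counts.groupby(level="minority").transform("sum")

# One job-by-gender matrix per facet; unobserved cells are filled with 0.
matrices = {
    level: probs.xs(level, level="minority").unstack("gender", fill_value=0.0)
    for level in probs.index.get_level_values("minority").unique()
}
```

Each matrix could then be drawn on its own axes, for example with sns.heatmap(), which is the FacetGrid-plus-heatmap route mentioned above.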
In making a displot with stat="percent", there are two normalizations, controlled by the boolean common_norm. Either the entire figure is normalized, or each individual group is normalized. I would find it useful to allow per-facet normalization as well.
I'd like to propose that common_norm be allowed to take on string values like "figure" (same as True today), "group" (same as False today), "facet", and "hue".
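One way to read the proposed strings is as shorthand for lists of grouping variables. The sketch below spells out that reading; the function name resolve_common_norm is hypothetical, and the interpretation of "hue" (normalize within each hue level, across facets) is an assumption, since the proposal does not define it.

```python
def resolve_common_norm(value, facet_vars=("row", "col"), hue_var="hue"):
    """Map a proposed common_norm value to the variables to normalize within."""
    if value is True or value == "figure":
        return []                       # one total for the whole figure
    if value is False or value == "group":
        return [*facet_vars, hue_var]   # one total per facet-and-hue group
    if value == "facet":
        return list(facet_vars)         # one total per facet
    if value == "hue":
        return [hue_var]                # assumed: one total per hue level
    if isinstance(value, (list, tuple)):
        return list(value)              # explicit list of grouping variables
    raise ValueError(f"unsupported common_norm value: {value!r}")
```

For example, resolve_common_norm("facet") returns ["row", "col"], and the existing booleans keep their current meaning.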