mwaskom / seaborn

Statistical data visualization in Python
https://seaborn.pydata.org
BSD 3-Clause "New" or "Revised" License

catplot with numeric hue and hue_order: empty legend handles #3639

Open jhncls opened 8 months ago

jhncls commented 8 months ago

Tested with Seaborn 0.13.2, pandas 2.2.1

When the hue values are numeric, hue_order isn't respected in the plot. The legend does respect the order, but with empty legend handles.

import seaborn as sns
import pandas as pd

tips = sns.load_dataset('tips')
sns.catplot(tips, kind='box', x='time', y='tip', hue='size', hue_order=[2, 3, 4])


Converting the hue column to pd.Categorical (with the values still numeric) makes the plot respect the hue order, but the legend handles are again empty. The default palette also changes to a categorical one.

import seaborn as sns
import pandas as pd

tips = sns.load_dataset('tips')
tips['size'] = pd.Categorical(tips['size'])
sns.catplot(tips, kind='box', x='time', y='tip', hue='size', hue_order=[2, 3, 4])

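
A possible workaround, sketched under assumptions: a later comment in this thread notes that hue_order handles subsets and ordering for string hue values, so one option is to relabel the numeric column as strings before plotting. The helper name discrete_hue and the small synthetic frame below are hypothetical stand-ins (the synthetic data avoids downloading the tips dataset). This only aims at the ordering and subsetting behaviour; it is not claimed to fix the empty legend handles.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use

import pandas as pd
import seaborn as sns


def discrete_hue(data, column, order):
    """Hypothetical helper: keep only rows whose `column` value is in `order`
    and relabel that column as an ordered categorical of strings."""
    labels = {value: str(value) for value in order}
    out = data[data[column].isin(order)].copy()
    out[column] = pd.Categorical(
        out[column].map(labels), categories=list(labels.values()), ordered=True
    )
    return out, list(labels.values())


# Small synthetic stand-in for the tips data (values are made up).
demo = pd.DataFrame({
    "time": ["Lunch", "Dinner"] * 6,
    "tip": [1.0, 2.5, 3.0, 1.5, 2.0, 4.0, 3.5, 2.2, 1.8, 5.0, 2.7, 3.1],
    "size": [2, 3, 4, 2, 3, 4, 1, 2, 6, 3, 4, 2],
})

# Rows with size 1 and 6 are dropped; the rest get string labels "2", "3", "4".
prepared, hue_labels = discrete_hue(demo, "size", [2, 3, 4])
g = sns.catplot(data=prepared, kind="box", x="time", y="tip",
                hue="size", hue_order=hue_labels)
```

Filtering the rows up front keeps the data and hue_order in agreement, so the result does not depend on how seaborn treats hue levels that are missing from hue_order.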

mwaskom commented 8 months ago

> When the hue values are numeric, hue_order isn't respected in the plot.

This is expected.

> The legend does respect the order, but with empty legend handles.

This is probably an artifact of the different handling for the catplot legend.

jhncls commented 8 months ago

Numeric hue values behave a bit unexpectedly when superimposing a stripplot on a boxplot created via catplot. The two plots don't align if not all hue values are present in every subplot. They only align well for non-numeric hue values of type pd.Categorical.

import seaborn as sns

tips = sns.load_dataset('tips')
common_kws = dict(x='time', y='tip', hue='size', dodge=True, palette='turbo')
# Draw unfilled boxplots, one facet per smoker value
g = sns.catplot(tips, kind='box', col='smoker', fill=False, **common_kws)
# Overlay a stripplot on each facet, using only that facet's rows
for smoker, ax in g.axes_dict.items():
    sns.stripplot(tips[tips['smoker'] == smoker], s=8, legend=False, ax=ax, **common_kws)


Anyway, it is a bit surprising that hue_order can be used for subsets, supersets and ordering, but only for strings. (I am not asking to make any changes. I understand it is complicated, where people sometimes interpret numbers as discrete and sometimes as continuous.)