Closed mikepqr closed 8 years ago
I think this is just an inconsistency where https://github.com/mwaskom/seaborn/blob/master/seaborn/axisgrid.py#L274 should use the length of `col_names`, not the number of unique values present in the column.
And the fact that `categorical_order` returns unused categories is very much by design, so that should not be changed.
Fair enough. Can confirm that `nrow = int(np.ceil(len(col_names) / col_wrap))` works for me. Would you like a PR?
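To illustrate why the row count matters, here is a minimal sketch of the arithmetic in plain Python. It is not seaborn's code: `grid_rows`, the category names, and the counts are hypothetical, chosen only to show how a grid sized from the values present in the data can end up with fewer slots than there are facet names.

```python
import math


def grid_rows(n_facets, col_wrap):
    """Rows needed to lay out n_facets panels with col_wrap panels per row."""
    return math.ceil(n_facets / col_wrap)


# Hypothetical numbers: 6 declared categories, only 4 present in the data.
col_names = ["a1", "a2", "a3", "a4", "b1", "b2"]
present = ["a1", "a2", "a3", "a4"]
col_wrap = 4

rows_from_unique = grid_rows(len(present), col_wrap)   # sized from the data
rows_from_names = grid_rows(len(col_names), col_wrap)  # sized from col_names

# Number of axes slots each grid provides.
slots_from_unique = rows_from_unique * col_wrap
slots_from_names = rows_from_names * col_wrap

print(slots_from_unique, slots_from_names, len(col_names))
```

With these numbers the grid sized from the data has too few slots for the six facet names, while the grid sized from `col_names` has room for all of them.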
Yes that would be great. It would be doubly-good if you can add a small test that fails with the current code but passes with the fix.
Thanks. PR sent!
FacetGrid fails with an unhelpful matplotlib exception if the column being conditioned on is a pandas categorical and not all of its categories are used.
Here's an example
This sets up a dataframe with random float values, plus borough and neighborhood columns (strings).
Plotting the rows for borough "A" with FacetGrid works fine, which is to say 4 plots are generated, one for each of the neighborhoods in borough "A".
But if we convert neighborhood to a categorical it fails, presumably because the DataFrame passed to FacetGrid has a mismatch between the categories that actually appear in the DataFrame and the contents of `df['neighborhood'].cat.categories`:
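The mismatch itself can be seen in pandas alone, without seaborn. The frame below is a hypothetical stand-in with the same shape as the one described (it is not the original example data): after filtering to one borough, the categorical column still declares every category.

```python
import pandas as pd

# Hypothetical data: four neighborhoods in borough "A", two in borough "B".
df = pd.DataFrame({
    "borough": ["A", "A", "A", "A", "B", "B"],
    "neighborhood": ["a1", "a2", "a3", "a4", "b1", "b2"],
    "value": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})
df["neighborhood"] = df["neighborhood"].astype("category")

# Filtering rows does not drop categories from the dtype.
subset = df[df["borough"] == "A"]

declared = list(subset["neighborhood"].cat.categories)  # all categories
present = list(subset["neighborhood"].unique())         # only values in subset

print(len(declared), len(present))
```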
with the exception:
It looks to me like the problem is that `categorical_order()` is returning all the categories, including the unused ones. Changing `categorical_order()` to return only used categories fixes the problem, in the sense that I get the same result whether or not the column is categorical.
Given the semantics of pandas categories, you could make the case that seaborn should build an axes for every category, including the unused ones.
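The two behaviours under discussion can be sketched with a small helper. `facet_levels` and its `keep_unused` flag are hypothetical and are not seaborn's `categorical_order()`; the sketch only uses pandas' `cat.categories` and `cat.remove_unused_categories()` to contrast the two options.

```python
import pandas as pd


def facet_levels(series, keep_unused=True):
    """Return the levels to facet on for a categorical Series.

    keep_unused=True  -> every declared category, used or not
    keep_unused=False -> only categories that appear in the data
    """
    if keep_unused:
        return list(series.cat.categories)
    return list(series.cat.remove_unused_categories().cat.categories)


# "z" is declared as a category but never appears in the values.
s = pd.Series(["x", "y"], dtype=pd.CategoricalDtype(["x", "y", "z"]))

print(facet_levels(s))
print(facet_levels(s, keep_unused=False))
```

Either choice is workable as long as the grid is sized from the same list of levels that is later used to draw the facets.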
Should I submit a PR that plots only used categories? Or does someone smarter than me want to figure out how to make plots for the unused categories?