Closed mojones closed 3 years ago
Interestingly, this works if the data are passed as a numpy array, but fails with a list or Series
Note that the same basic inconsistency exists in matplotlib too:
plt.bar(["a", "b", np.nan], [1, 2, 3]) # Fails, same error
plt.bar(np.array(["a", "b", np.nan]), [1, 2, 3]) # Succeeds
Interestingly, plt.plot(["a", "b", np.nan], [1, 2, 3]) succeeds, but shows nan as a category, which is not what I would expect.
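One likely reason the numpy array case behaves differently, offered as an observation rather than a confirmed diagnosis: when numpy builds an array from a mix of strings and a float NaN, it coerces everything to a string dtype, so the missing value becomes the literal text "nan". A minimal sketch (the names as_list and as_array are just for illustration):

```python
import numpy as np

as_list = ["a", "b", np.nan]
as_array = np.array(as_list)

# The plain list still holds a float NaN alongside the strings.
print([type(v).__name__ for v in as_list])

# The array has a unicode string dtype, so the NaN is now the text "nan"
# and no longer looks like a missing value to downstream code.
print(as_array.dtype.kind, as_array.tolist())
```

If that is what is happening, the array version "succeeds" only because the missing value has been turned into an ordinary category label.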
I think that supporting categorical data with missing values will require either upstream changes in matplotlib or seaborn defining and using its own converters that handle missing data properly. While I think it may be necessary to take the latter route for planned updates to the categorical plotting module (which predates any support for categorical data in matplotlib), a downside would be reduced interoperability between seaborn and matplotlib plots.
I originally milestoned this for v0.11.1 but it seems like it might be more complicated than I expected and possibly requires/is best handled by upstream changes (https://github.com/matplotlib/matplotlib/issues/19139), so I unfortunately think this needs to be kicked down the road.
Not sure if this is intended behaviour, but it caught me out due to the difference in handling numerical/categorical data. I note that drawing histograms of categorical data is labelled as experimental, so ignore/close if that explains it.
With numerical data, histplot ignores NaN and plots the other values, which is the behaviour I would expect, but with categorical data it crashes.
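Until this is handled by the library, one possible workaround is to drop the missing values before plotting. This is only a sketch under assumptions: drop_missing is a hypothetical helper, not part of seaborn or matplotlib, and the plotting call is left commented out so the snippet runs without seaborn installed.

```python
import numpy as np
import pandas as pd


def drop_missing(values):
    """Return the values as an object-dtype Series with NaN/None removed."""
    # dtype=object keeps the strings and the float NaN as they were passed in.
    return pd.Series(values, dtype=object).dropna()


data = ["a", "b", np.nan, "a"]
clean = drop_missing(data)
print(clean.tolist())

# The cleaned Series contains only real categories, so it could then be
# passed to the plotting function, e.g.:
# import seaborn as sns
# sns.histplot(clean)
```

This mirrors what the numerical case is described as doing (ignoring NaN), but does it explicitly on the caller's side.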