plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

Sorted categories (Pandas `Categorical` series) aren't sorted in chart #3802

Open davidgilbertson opened 2 years ago

davidgilbertson commented 2 years ago

I am plotting a chart with a pd.Categorical series on the x-axis. The category has a sort defined.

Expected behaviour: the axis is sorted as per the underlying category definition.

Actual behaviour: the axis is sorted by the order of the data.

import pandas as pd
import plotly.express as px

df = pd.DataFrame(
    dict(
        Cost=[10, 20, 15, 2, 9],
        Rating=pd.Categorical(
            values=["High", "Low", "Medium", "Low", "Medium"],
            categories=["Low", "Medium", "High"],
            ordered=True,
        ),
    )
)

fig = px.scatter(df, x="Rating", y="Cost")
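To make the mismatch concrete, here is a small pandas-only sketch that prints the two orders involved. The variable names are mine, and it assumes the axis follows the order in which values first appear in the data, which is what I observe in the chart.

```python
import pandas as pd

rating = pd.Series(
    pd.Categorical(
        values=["High", "Low", "Medium", "Low", "Medium"],
        categories=["Low", "Medium", "High"],
        ordered=True,
    )
)

# Order of first appearance in the data (the order the axis ends up with).
appearance_order = list(dict.fromkeys(rating))

# Order defined on the categorical dtype (the order I expect on the axis).
defined_order = list(rating.cat.categories)

print(appearance_order)  # ['High', 'Low', 'Medium']
print(defined_order)     # ['Low', 'Medium', 'High']
```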

I think the fix would be fairly simple. Something like this, where `series` is whatever is being rendered on the x-axis (or the y-axis, the legend, etc.):

if isinstance(series.dtype, pd.CategoricalDtype) and series.cat.ordered:
    fig.update_xaxes(
        categoryorder="array",
        categoryarray=series.cat.categories,
    )
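In the meantime, a possible workaround is to build the `category_orders` mapping that Plotly Express functions accept from the DataFrame itself. This is only a sketch: `ordered_category_orders` is a hypothetical helper name of my own, not part of Plotly or pandas, and it only looks at columns whose categorical dtype is marked as ordered.

```python
import pandas as pd


def ordered_category_orders(df: pd.DataFrame) -> dict:
    """Map each ordered-categorical column to its categories, in defined order.

    Hypothetical helper for illustration; not part of Plotly or pandas.
    """
    orders = {}
    for name, column in df.items():
        dtype = column.dtype
        if isinstance(dtype, pd.CategoricalDtype) and dtype.ordered:
            orders[name] = list(dtype.categories)
    return orders


df = pd.DataFrame(
    dict(
        Cost=[10, 20, 15, 2, 9],
        Rating=pd.Categorical(
            values=["High", "Low", "Medium", "Low", "Medium"],
            categories=["Low", "Medium", "High"],
            ordered=True,
        ),
    )
)

category_orders = ordered_category_orders(df)
print(category_orders)  # {'Rating': ['Low', 'Medium', 'High']}

# The mapping can then be passed to Plotly Express (requires plotly):
# import plotly.express as px
# fig = px.scatter(df, x="Rating", y="Cost", category_orders=category_orders)
```

Unordered categoricals are skipped on purpose, since their category list does not carry a meaningful sort.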

Plotly version: 5.9.0.

I've noticed a number of bugs relating to pd.Categorical and pd.PeriodIndex. Are there plans for better Pandas support, or is that not an area of focus?

rsnatorres commented 1 year ago

I had the same problem here.