plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

Axis labels are not shown for all subplots when using plotly express, facets and string labels #4452

Open b-a0 opened 9 months ago

b-a0 commented 9 months ago

Problem summary

Whenever I have the following combination:

The first subplot shows the axis labels properly, but all following subplots don't show labels at all:
[Image: plot showing the issue]
If the axis labels are numbers, there is no problem.

From a comment on the initial issue I opened, I now know that fig.update_traces(bingroup='x2', row=1, col=2) works around the issue, but that's not a permanent solution.

Reproducible example

I originally created these plots in Python (plotly 5.18.0) and then used the .to_json(pretty=True) method to obtain the Javascript for the codepens.

Original Python code

If necessary, this is the original Python code I used:

```py
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go

df = pd.DataFrame(
    {
        "age": {
            "0": "Adult", "1": "Adult", "2": "Adult", "3": "Adult", "4": "Adult",
            "5": "Kid", "6": "Kid", "7": "Kid", "8": "Kid", "9": "Kid",
        },
        "favourite_food": {
            "0": "Pizza", "1": "Noodles", "2": "Pizza", "3": "Pizza", "4": "Pizza",
            "5": "Burger", "6": "Pancake", "7": "Noodles", "8": "Pizza", "9": "Pancake",
        },
        "favourite_drink": {
            "0": "Beer", "1": "Tea", "2": "Beer", "3": "Wine", "4": "Coffee",
            "5": "Coffee", "6": "Water", "7": "Beer", "8": "Tea", "9": "Wine",
        },
        "max_running_speed": {
            "0": 4.7362803248, "1": 16.7084927714, "2": 8.1135697835,
            "3": 1.0704264989, "4": 4.6330187561, "5": 6.331593807,
            "6": 16.5221040135, "7": 3.2256763127, "8": 4.3084468631,
            "9": 6.3677742299,
        },
        "number_of_bicycles": {
            "0": 4, "1": 2, "2": 1, "3": 3, "4": 4,
            "5": 3, "6": 3, "7": 3, "8": 4, "9": 2,
        },
    }
)
df.set_index("age", inplace=True)

# Working version: subplots built by hand with graph_objects
working = make_subplots(rows=1, cols=2, subplot_titles=["Food", "Drink"])
working.add_trace(
    go.Histogram(
        histfunc="count",
        histnorm="percent",
        x=df.loc["Adult"].favourite_food,
        name="Adult",
        legendgroup="Adult",
    ),
    row=1,
    col=1,
)
working.add_trace(
    go.Histogram(
        histfunc="count",
        histnorm="percent",
        x=df.loc["Kid"].favourite_food,
        name="Kid",
        legendgroup="Kid",
    ),
    row=1,
    col=1,
)
working.add_trace(
    go.Histogram(
        histfunc="count",
        histnorm="percent",
        x=df.loc["Adult"].favourite_drink,
        name="Adult",
        legendgroup="Adult",
    ),
    row=1,
    col=2,
)
working.add_trace(
    go.Histogram(
        histfunc="count",
        histnorm="percent",
        x=df.loc["Kid"].favourite_drink,
        name="Kid",
        legendgroup="Kid",
    ),
    row=1,
    col=2,
)
working.show()

# Broken version: the same plot built with plotly express and facets
broken = px.histogram(
    df,
    x=["favourite_food", "favourite_drink"],
    facet_col="variable",
    color=df.index,
    barmode="group",
    histnorm="percent",
    text_auto=".2r",
).update_xaxes(matches=None, showticklabels=True).update_yaxes(matches=None, showticklabels=True)
broken.show()
```

Related

alexcjohnson commented 9 months ago

Thanks @b-a0 - let me add one more crucial condition to your list:

That's why you added update_xaxes(matches=None), but it's a slightly different statement from "each facet should have independent x axes" - normally with faceted categorical axes you want to ensure all the axes show the same categories, but here since you're faceting on variable perhaps it would make sense to automatically give each facet an independent set of labels. It's not immediately clear to me whether this part is a bug or a feature request, but I think your usage makes sense.

However, this also seems to be a plotly.js bug. Open your broken codepen and pan to the right on the first pane: you'll see the drink categories hanging out there. I guess that makes sense given the bingroup attribute, but in this case we need the second x axis to also get these category names.

That said, if I'm understanding it correctly, for your case a simpler fix may be: .update_traces(bingroup=None). You probably want that anyway: if there were ever some overlapping and some non-overlapping items between the two sets of labels and they all stayed in the same bingroup, you could get some really weird outcomes. (Even if this fixes it for you, please leave this issue open, since there are clearly things we want to address here.)

b-a0 commented 9 months ago

.update_traces(bingroup=None) is indeed a more flexible workaround which I can use, thanks!

I tried with partially overlapping labels such that the set of category labels is no longer fully independent, but that didn't work out of the box either. Not sure what that tells us...

import numpy as np
import pandas as pd
import plotly.express as px

N = 100
# Added 'Beer' as a food and 'Noodles' as a drink
food = ["Dim sum", "Noodles", "Burger", "Pizza", "Pancake", "Beer"]  
drink = ["Beer", "Wine", "Soda", "Water", "Fruit juice", "Coffee", "Tea", "Noodles"]
df = pd.DataFrame(
    {
        "age": np.random.randint(8, 99, N),
        "favourite_food": np.random.choice(food, N, replace=True),
        "favourite_drink": np.random.choice(drink, N, replace=True),
        "max_running_speed": np.random.random(N)*20,
        "number_of_bicycles": np.random.randint(0, 5, N)
    }
)
# Bucket ages into "Kid" (under 19) and "Adult" (19 and over)
df["age"] = np.where(df["age"] < 19, "Kid", "Adult")

fig = px.histogram(
    df,
    x=["favourite_food", "favourite_drink"],
    facet_col="variable",
    color="age",
    barmode="group",
    histnorm="percent",
    text_auto=".2r",
).update_xaxes(matches=None, showticklabels=True).update_yaxes(matches=None, showticklabels=True)

fig.show()

[Image: newplot]

And after fig.update_traces(bingroup=None):

[Image: newplot]

celia-lm commented 6 months ago

The main issue is that, by default, bingroup is set to x (the first x axis) for every trace, whichever facet it belongs to:

for t in fig.data:
    print(f"for trace linked to xaxis {t['xaxis']} the bingroup is {t['bingroup']}")

# output
# for trace linked to xaxis x the bingroup is x
# for trace linked to xaxis x2 the bingroup is x
# for trace linked to xaxis x the bingroup is x
# for trace linked to xaxis x2 the bingroup is x

So the issue can also be fixed with:

fig.for_each_trace(lambda trace: trace.update(bingroup=trace['xaxis']))

(which is a generalization of the fix @b-a0 mentioned in the first post)