plotly / plotly_express

Plotly Express - Simple syntax for complex charts. Now integrated into plotly.py!
https://plot.ly/python/plotly-express/
MIT License

`category_orders` does not raise an error with incorrect column name #161

Open Rabeez opened 4 years ago

Rabeez commented 4 years ago

I ran into this by accident and am not sure whether this behaviour is mentioned in the documentation.

This code produces the correct output, as expected:

px.histogram(data_frame=iris, x='sepalLength', facet_col='species',
             category_orders={'species': ['versicolor','virginica','setosa']})


Whereas using an incorrect column name as the key for category_orders silently produces a plot identical to the one created when no ordering is specified:

px.histogram(data_frame=iris, x='sepalLength', facet_col='species',
             category_orders={'foo': ['versicolor','virginica','setosa']})


In my opinion, this should raise a ValueError, similar to what happens when an incorrect column is specified for the usual arguments (x, y, color, etc.).

Plotly 4.2.1, Python 3.7.4

EDIT: I just checked this with color instead of facet_col, and the same issue is present there too.

nicolaskruchten commented 4 years ago

This is actually on purpose, to make it easier to iterate quickly. For example, say you correctly specify the order as virginica/setosa/versicolor and then re-execute the cell with a filter on df such that no versicolor rows come out. It would be annoying to have to go back and comment out the order, then add it back in on another pass when you change the filter again so that setosa is out but versicolor is back in.

As a side effect, it doesn't warn you or fail on typos, but I think that's a decent tradeoff.
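To illustrate the idea in plain Python: this is only a sketch of the tolerant behaviour described above, not Plotly Express's actual implementation, and `effective_order` is a made-up name. Requested categories that are missing from the filtered data are simply skipped, so the same order list keeps working as the filter changes.

```python
def effective_order(requested, present):
    """Order the values in `present`, requested ones first.

    Hypothetical helper for illustration only. Requested values that do
    not occur in `present` are skipped instead of raising an error.
    """
    present_unique = list(dict.fromkeys(present))  # de-duplicate, keep order
    present_set = set(present_unique)
    requested_set = set(requested)
    # Requested values that actually occur, in the requested order...
    ordered = [v for v in requested if v in present_set]
    # ...followed by any remaining values in order of first appearance.
    ordered += [v for v in present_unique if v not in requested_set]
    return ordered


order = ["virginica", "setosa", "versicolor"]
# A filter removed versicolor: the order still applies to what is left.
print(effective_order(order, ["setosa", "virginica", "setosa"]))  # ['virginica', 'setosa']
# A different filter removed setosa instead.
print(effective_order(order, ["versicolor", "virginica"]))  # ['virginica', 'versicolor']
```

The flip side is visible here too: a misspelled category in `order` is skipped in exactly the same way as one that was filtered out.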

nicolaskruchten commented 4 years ago

Ah, sorry, I responded too quickly! You're saying that the key is not in the df. I think that's OK too, personally: you can set a bunch of category orders all at once in a dict and just re-use the dict across many figures. That argument is less strong, admittedly.

Rabeez commented 4 years ago

@nicolaskruchten if I were making multiple figures where sharing the category order dict was an option, I would probably have the same dataframe too, right? So having an error (or at least a stern warning 😂) about an incorrect column name would be more useful than trying to figure out why the plot doesn't look right. Because, let's be honest, typos in column names are very common.
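Until something like this exists in the library, a check of this kind could be done on the user side before calling px. The following is a minimal sketch: `check_category_orders` and its `strict` flag are hypothetical and not part of Plotly Express, and it takes the column names as a plain list so it does not depend on any particular dataframe type.

```python
import warnings


def check_category_orders(columns, category_orders, strict=True):
    """Return the keys of `category_orders` that are not column names.

    Hypothetical helper, not part of Plotly Express. With strict=True an
    unknown key raises ValueError; with strict=False it only warns.
    """
    known = set(columns)
    unknown = [key for key in category_orders if key not in known]
    if unknown:
        message = f"category_orders keys not found in columns: {unknown}"
        if strict:
            raise ValueError(message)
        warnings.warn(message, UserWarning, stacklevel=2)
    return unknown


columns = ["sepalLength", "species"]  # e.g. list(iris.columns)
orders = {"foo": ["versicolor", "virginica", "setosa"]}
# Warns about the unknown key instead of silently ignoring it.
print(check_category_orders(columns, orders, strict=False))  # ['foo']
```

The `strict=False` mode would still allow one shared dict to be re-used across figures, at the cost of a warning for keys that do not apply to a given dataframe.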