holoviz / holoviews

With Holoviews, your data visualizes itself.
https://holoviews.org
BSD 3-Clause "New" or "Revised" License

Bug with hvplot plotly violin plots using 'by' argument #6236

Status: Closed (closed by sam-spence 1 month ago)

sam-spence commented 1 month ago

Violin plots made with hvplot and the plotly extension do not work properly when the 'by' argument is used to draw one violin per category. Instead of one violin per category, only the first letter of each category name is used, so violins for categories that share a first letter are stacked. The legend likewise shows only the first letter of each category name instead of the full name.

This can also be seen in the hvplot documentation. For example, in the violin plot at the bottom of this page, which uses bokeh, each category has its own violin and its full name is shown on the x axis and in the legend. This is the expected behaviour.

Compare this with the same plot using the plotly extension: any categories that share the same first letter are stacked, e.g. OO and OH are stacked and are both labelled as O.
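To make the symptom concrete, here is a minimal sketch in plain Python of which categories would end up sharing a violin if only the first letter of each name were used. It only models the behaviour described above and is not a diagnosis of the plotly backend. The helper `group_by_first_letter` and the carrier list are hypothetical; only OO and OH are taken from this report.

```python
from collections import defaultdict

# Hypothetical carrier codes for illustration; only OO and OH come from the report.
carriers = ["AA", "DL", "OH", "OO", "UA", "US"]


def group_by_first_letter(names):
    """Map each first letter to the full names that would share one violin."""
    groups = defaultdict(list)
    for name in names:
        groups[name[0]].append(name)
    return dict(groups)


# Keep only the letters shared by more than one category, i.e. the stacked violins.
collisions = {
    letter: names
    for letter, names in group_by_first_letter(carriers).items()
    if len(names) > 1
}
print(collisions)  # {'O': ['OH', 'OO'], 'U': ['UA', 'US']}
```

With the bokeh extension none of these collisions occur, because each full category name gets its own violin.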

Software versions (although I think they don't matter, since the bug is visible in the documentation as well): Python 3.12 running on macOS 13.3.1 and on a Linux server; HoloViews 1.18.3.

Reproducible code:

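import numpy as np  # not used by the reproduction itself
import hvplot.pandas  # registers the .hvplot accessor on pandas objects
import hvplot.dask  # registers the .hvplot accessor on dask objects

# Plotly backend: violins are stacked by the first letter of each carrier
hvplot.extension('plotly')
from hvplot.sample_data import us_crime, airline_flights
# Load the sample flights data as a dask DataFrame and keep it in memory
flights = airline_flights.to_dask().persist()
bugged_plot = flights.hvplot.violin(y='depdelay', by='carrier', ylim=(-20, 60), height=500)
hvplot.show(bugged_plot)

# Bokeh backend: same call, one violin per carrier with its full name (expected)
hvplot.extension('bokeh')
correct_plot = flights.hvplot.violin(y='depdelay', by='carrier', ylim=(-20, 60), height=500)
hvplot.show(correct_plot)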