Open Midnighter opened 1 year ago
Hi @Midnighter, as far as I know there is no direct way of doing this, but you could do it in two steps. It may not be very pretty, though...
import plotly.express as px
df = px.data.tips()
pts = px.strip(df, y="total_bill").data
fig = px.violin(df, y="total_bill", points=False)
fig.add_traces(pts)
fig.show()
Thanks for the answer. For space reasons this is preferable to me but I agree that it would look better with more specific placement of the points.
You can do this with a single trace, using pointpos:
import plotly.express as px
df = px.data.tips()
fig = px.violin(df, y="total_bill", points="all")
fig.update_traces(pointpos=0)
fig.show()
This gives a result identical to what @Alexboiboi showed. Still, you're right: the jitter algorithm was made for box plots rather than violins. If anyone is interested in making a PR to plotly.js, it should be relatively easy to add an option to use the KDE as the jitter envelope, to achieve the effect you're looking for.
@alexcjohnson thank you for your addition. I haven't looked at the plotly.js source code so far. Could you provide me with a link to where you think this new feature would need to be inserted, please?
The existing jitter algorithm is in traces/box/plot.js - even though it's in the box trace, it also gets used by violins here.
I think if you're in there from a violin trace you should have access to the density array. I'd suggest not supporting this for box traces, for now anyway, as that would require a separate calculation of the KDE.
We'll need a new violin attribute, maybe call it jittermode: 'box'|'kde'. Violins also reuse box point attribute defaults, which sets jitter and pointpos here - to make this easiest to use we can modify that so if using the new 'kde' mode the defaults are pointpos: 0, jitter: 1.
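To make the "KDE as the jitter envelope" idea concrete, here is a minimal pure-Python sketch. It is not the plotly.js implementation and does not use any plotly API: the function names kde_envelope and kde_jitter, the half_width parameter, and the normal-reference bandwidth rule are all assumptions chosen for illustration. Each point gets a random horizontal offset whose maximum size is proportional to the estimated density at that point.

```python
import math
import random
import statistics


def kde_envelope(values, half_width=0.4):
    """Return the maximum horizontal offset allowed for each value.

    The envelope is a Gaussian KDE evaluated at each point and scaled so
    that the densest point may use the full half_width (hypothetical
    parameter, in the same units as the violin's half width).
    """
    n = len(values)
    if n < 2:
        return [half_width] * n
    spread = statistics.stdev(values)
    if spread == 0:
        # All values identical: every point is equally dense.
        return [half_width] * n
    # Normal-reference rule-of-thumb bandwidth (an assumption, not
    # necessarily what plotly.js uses for violins).
    bandwidth = 1.06 * spread * n ** (-1 / 5)

    def density(y):
        # Unnormalized Gaussian KDE; constants cancel after scaling by the peak.
        return sum(math.exp(-0.5 * ((y - v) / bandwidth) ** 2) for v in values)

    densities = [density(v) for v in values]
    peak = max(densities)
    return [half_width * d / peak for d in densities]


def kde_jitter(values, half_width=0.4, seed=0):
    """Return one random horizontal offset per value, bounded by the KDE envelope."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) * e for e in kde_envelope(values, half_width)]
```

The returned offsets could then be used as x-coordinates for a scatter trace drawn on top of the violin, so that sparse tails stay narrow and the dense middle spreads out.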
Huge fan of this idea! Note that at the moment the box logic does some approximation of this, by broadening the jitter width where there are a lot of points, no? It would be nice to have this a bit better lined up with the violin trace. There's also the "beeswarm" type of jittering, which is not random but geometry-aware: points are laid out so as to form a compact group without overlapping, but this would need to happen lower down in the pipeline, I think.
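For comparison, a geometry-aware "beeswarm" layout can be sketched with a simple greedy placement. This is only an illustration under strong assumptions: the function name beeswarm_offsets and the diameter parameter are hypothetical, markers are treated as circles, and offsets are assumed to share units with the values (a real renderer would have to work in pixel space, which is why this belongs lower down in the pipeline).

```python
def beeswarm_offsets(values, diameter=1.0):
    """Return a horizontal offset per value so that no two markers overlap.

    Greedy sketch: points are placed in order of increasing value, each at
    the offset closest to the centre line that does not collide with any
    marker already placed.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []  # (x, y) centres of markers laid out so far
    offsets = [0.0] * len(values)
    for i in order:
        y = values[i]
        step = 0
        while True:
            # Candidate offsets: 0, then +/- half a diameter, +/- one diameter, ...
            if step == 0:
                candidates = [0.0]
            else:
                candidates = [step * diameter / 2, -step * diameter / 2]
            free = [
                x for x in candidates
                if all((x - px) ** 2 + (y - py) ** 2 >= diameter ** 2
                       for px, py in placed)
            ]
            if free:
                offsets[i] = free[0]
                placed.append((free[0], y))
                break
            step += 1
    return offsets
```

Unlike random jitter, the result is deterministic and compact, at the cost of quadratic work in the number of points in this naive form.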
One of my favorite types of plots for showing distributions is the Sina plot, where the points are spread out within the area of the violin that represents the density of the distribution. I then additionally overlay this with a boxplot, as is already possible in plotly. So my questions/feature requests are then:
Thank you for your consideration.