biolab / orange3

Orange: Interactive data analysis
https://orangedatamining.com

Violin Plot with data subset #5644

Open robertcv opened 2 years ago

robertcv commented 2 years ago

What's your use case?

I would like the Box Plot widget to have a Data Subset option, like the one in the Scatter Plot widget. This option would add points to the drawn box plots to show where in the distribution the specific subset lies.

A simple mockup: [image: Screenshot from 2021-10-13 10-18-20]

janezd commented 2 years ago

@robertcv, we discussed this today and found a solution that we think is better. Box plots are not point-based (or instance-based) visualizations; violin plots are. The plot could (probably) easily show the subset by drawing the area for the entire data in a lighter shade and then superimposing a darker plot for the subset.
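
For illustration only, here is a minimal sketch of how the two superimposed outlines could be computed. It is not Orange's implementation: the helper names (`kernel_sum`, `violin_widths`), the fixed bandwidth, and the sample data are all assumptions made for this example. It assumes a Gaussian kernel density estimate and divides both curves by the size of the full data, so the darker subset outline never extends past the lighter full-data outline.

```python
import math


def kernel_sum(points, grid, bandwidth):
    """Sum of Gaussian kernels centred on `points`, evaluated on `grid`.

    Hypothetical helper; the result is not yet divided by a sample size.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return [
        sum(norm * math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
        for x in grid
    ]


def violin_widths(data, subset_mask, grid, bandwidth):
    """Return (full, sub): half-widths of the full and subset violins.

    Both curves are divided by len(data), not by the subset size, so the
    subset curve is the share of the full density contributed by the subset.
    """
    n = len(data)
    subset = [v for v, keep in zip(data, subset_mask) if keep]
    full = [d / n for d in kernel_sum(data, grid, bandwidth)]
    sub = [d / n for d in kernel_sum(subset, grid, bandwidth)]
    return full, sub


# Made-up sample values and subset selection, for illustration only.
data = [1.0, 1.5, 2.0, 2.2, 2.8, 3.5, 4.0, 4.1]
mask = [False, True, False, True, True, False, False, False]
grid = [i * 0.25 for i in range(21)]  # 0.0, 0.25, ..., 5.0
full, sub = violin_widths(data, mask, grid, bandwidth=0.4)
```

A plotting layer would then fill the region between `-full` and `+full` in a light shade and the region between `-sub` and `+sub` in a darker one. Normalizing the subset by its own size instead would show its shape more clearly, but the darker area could then spill outside the lighter one.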

franktoffel commented 1 year ago

What about beeswarm plots?

They are arguably better, as they clearly show the number of points behind the distribution. https://datascience.stackexchange.com/questions/71709/how-is-the-beeswarm-plot-better-than-a-histogram

I prefer the SHAP implementation (which doesn't add curvature): https://shap.readthedocs.io/en/latest/example_notebooks/api_examples/plots/beeswarm.html

https://github.com/eclarke/ggbeeswarm
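
To make the "number of points" argument concrete, here is a rough sketch of a simple binned beeswarm layout. It is an assumption-laden illustration, not the code used by SHAP or ggbeeswarm: the function name `beeswarm_offsets`, the `bin_width` and `spacing` parameters, and the sample values are all hypothetical. Values that fall into the same bin are stacked symmetrically around the axis, so every instance stays visible as its own point.

```python
import math
from collections import defaultdict


def beeswarm_offsets(values, bin_width, spacing=1.0):
    """Return one (value, offset) pair per input value (hypothetical helper).

    Points in the same bin get offsets 0, +s, -s, +2s, -2s, ... in input order.
    """
    counts = defaultdict(int)
    placed = []
    for v in values:
        b = math.floor(v / bin_width)  # index of the bin this value falls into
        k = counts[b]                  # how many points this bin already holds
        counts[b] += 1
        step = (k + 1) // 2            # distance from the axis, in spacing units
        sign = 1 if k % 2 else -1      # alternate sides of the axis
        offset = sign * step * spacing if k else 0.0
        placed.append((v, offset))
    return placed


# Made-up values, for illustration only.
values = [1.0, 1.1, 1.2, 3.0, 3.05, 1.15]
placed = beeswarm_offsets(values, bin_width=0.5)
offsets = [off for _, off in placed]
```

With this layout, a data subset could be shown by coloring its points differently, since each point corresponds to exactly one instance.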