bird-team / brisbane-bird-atlas

Atlas of the Birds of Brisbane: Community bird atlas for Brisbane, Australia
https://brisbanebirds.com
GNU General Public License v3.0

Alternatives to box plots for some charts #148

Open Louis-Backstrom opened 5 years ago

Louis-Backstrom commented 5 years ago

Box plots for count and elevation data have been irking me for some time. Given how many outliers we have in such large datasets, I find they're not particularly informative, especially when they are drawn so small. See e.g. Crested Pigeon:

[Image: box plots for Crested Pigeon]

One alternative that looks promising to me is the violin plot, which is built into ggplot2 (geom_violin, I believe). It retains most of the functionality of the box-and-whisker plot, but shows the patterns in the data more clearly when the values are as clustered as ours are.

When I get the chance this week I'll experiment with a build of the assets using geom_violin() to see what it looks like, but I thought I'd bring it up here first to see if you all had any thoughts.

jeffreyhanson commented 5 years ago

Nice idea! Yeah, the box plots aren't really that informative. Personally, I'm not much of a fan of violin plots because they require a smoothing parameter and, depending on its value, this can hide the "real" shape of the data. Perhaps something like a mean +/- standard error might work better?
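To make the smoothing concern concrete, here is a minimal sketch in plain Python (not the atlas's R code) of a Gaussian kernel density estimate, which is the kind of smoother a violin plot relies on. The counts, the grid, and the two bandwidth values are invented for illustration; in ggplot2 the equivalent knobs on geom_violin() are its bw and adjust arguments.

```python
import math

def gaussian_kde(sample, bandwidth, x):
    """Average of Gaussian bumps of width `bandwidth` centred on each observation."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in sample) / norm

def count_modes(sample, bandwidth, grid):
    """Count local maxima of the density estimate evaluated on `grid`."""
    d = [gaussian_kde(sample, bandwidth, x) for x in grid]
    return sum(1 for i in range(1, len(d) - 1) if d[i - 1] < d[i] >= d[i + 1])

# Made-up counts with two clusters (around 4 and around 22); not atlas data.
counts = [2, 3, 3, 4, 4, 4, 5, 5, 6, 21, 22, 22, 22, 23]
grid = [i * 0.5 for i in range(0, 53)]  # 0.0, 0.5, ..., 26.0

modes_narrow = count_modes(counts, 1.0, grid)  # small bandwidth keeps both clusters
modes_wide = count_modes(counts, 25.0, grid)   # deliberately extreme bandwidth merges them
print(modes_narrow, modes_wide)
```

With these invented numbers the narrow bandwidth shows two peaks and the very wide one shows a single peak, so the same data can look bimodal or unimodal depending on the smoothing choice.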

Louis-Backstrom commented 5 years ago

Maybe - I'm not too sure to be honest, I just figured we could definitely do better than a box plot. I liked the idea of the violin (or a similar density-type plot) because it could potentially highlight multimodal distributions and the like, which a box would generally hide.

@dbl3raf you've got more experience than I - any ideas?
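As a rough illustration of the multimodality point, the sketch below (plain Python, standard library only, with invented counts rather than atlas data) computes the numbers a box plot and a mean +/- standard error summary would draw for a sample with two well-separated clusters. The quartiles use the default method of Python's statistics.quantiles, which may differ slightly from what ggplot2 computes.

```python
import math
import statistics

# Made-up bimodal counts (clusters near 3 and near 23); not atlas data.
counts = [1, 2, 2, 3, 3, 3, 4, 4, 5, 21, 22, 22, 23, 23, 23, 24, 24, 25]

q1, median, q3 = statistics.quantiles(counts, n=4)  # what the box would draw
mean = statistics.mean(counts)
std_error = statistics.stdev(counts) / math.sqrt(len(counts))

# Distance from the median to the closest real observation.
gap_to_nearest = min(abs(v - median) for v in counts)

print(f"box: Q1={q1}, median={median}, Q3={q3}")
print(f"mean +/- SE: {mean} +/- {std_error:.2f}")
print(f"nearest observation to the median is {gap_to_nearest} away")
```

Here both the median and the mean land at 13, a value that sits in the empty gap between the two clusters, so neither summary on its own reveals that the sample has two groups.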