Open davidhodge931 opened 1 month ago
Great question. I think the s in geom_xy_means could be about the fact that it's the mean at x and the mean at y. And the name could actually be shortened to geom_means... Thoughts? And residuals plural seems right because of their interdependence?
I definitely think singular "mean". I'd describe 1 point from this as being the mean of the x and y, or the xy mean. You could drop the xy, as it still works where there is 1 categorical axis. A key difference between this and the other mean functions is that this plots a point(s), rather than a line(s)...
Then I went down a bit of a rabbit hole. An alternative way of naming is below, which organises the functions by the type of geom. See whether you like it.
geom_point_mean()
geom_vline_x()
geom_vline_xmean()
geom_label_xmean()
geom_vline_xmedian()
geom_vline_xquantile()
geom_vline_xmax()
geom_vline_xmin()
You could have equivalent label functions for most of the vline ones above. The above naming follows stat_ydensity() and aesthetics like xmax by not having an underscore between x/y and the statistic type. The y functions could be equivalent.
Less sure whether the lm stuff would work well in this format or not..
geom_line_lm()
geom_ribbon_lm_conf()
geom_ribbon_lm_pred()
geom_label_lm()
geom_segment_lm_conf()
geom_segment_lm_pred()
etc
Not sure about the lm predicted and residuals points..
I love APIs. It seems like an easy way to improve a package. But, yeah, easy to get bogged down - and normally no perfect way of doing things! A super useful/awesome package. Should put it on CRAN when it's ready.
I think this would be good for flow, as ggplot2 tends to get people used to thinking in terms of geoms.
It would also work well with autocomplete if you load ggplot2 and ggxmean libraries. You'd get the more flexible ggplot2 function suggested first and then the ggxmean functions, which would be a nice order. Organising in this way would help new ggxmean users remember and find what is available in the ggxmean package.
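The autocomplete point can be illustrated with a small base R sketch. It assumes the completion list is roughly alphabetical, which may not match every editor, and it only uses a handful of the names proposed above plus the ggplot2 functions they extend.

```r
# Candidate names: ggplot2 geoms plus the proposed geom-first ggxmean names.
fns <- c(
  "geom_vline_xmean", "geom_point", "geom_vline", "geom_point_mean",
  "geom_label_xmean", "geom_label", "geom_vline_xmedian"
)

# Radix sort uses C-locale ordering, so the result does not depend on
# the user's locale settings.
sorted <- sort(fns, method = "radix")

# What a user would see after typing "geom_vline": the general ggplot2
# function first, then the more specific convenience layers.
vline_matches <- sorted[startsWith(sorted, "geom_vline")]
print(vline_matches)
```

Because each proposed name extends the ggplot2 name it builds on, the shorter ggplot2 function sorts ahead of its convenience variants.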
Thank you for these ideas. I think this is an interesting general discussion. I find myself naming convenience layers geom_Geom_Stat, as you suggest, but probably more often geom_Stat_Geom. I'm probably not ready to make any renaming moves in the immediate future, but I'll keep this open and in mind.
Regarding compound naming, elsewhere I've tended to create convenience layers naming the stat first:
geom_means (and geom_means_point alias)
geom_means_label
geom_means_text
I think this might be more user friendly. Elsewhere, for example, you get pipelines like...
gapminder::gapminder |>
  filter(year == 2002) |>
  ggplot() +
  aes(area = pop, id = country, label = country) +
  geom_circlepack() + # layer of circles based on country area
  geom_circlepack_text() # label packed circles with country names
In ggstats, it looks like likert and diverging layers will follow this naming scheme too!
some_likert_data |>
  ggplot() +
  aes(y = question, fill = response) +
  geom_likert() +
  geom_likert_label() # label is automatically computed percentage
Some likert naming and API discussions happened over in https://github.com/teunbrand/ggplot-extension-club/discussions
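For the stat-first scheme (geom_means with a geom_means_point alias), a minimal sketch of how such a layer and its alias might be written is below. This is not the ggxmean implementation: StatMeans and the function bodies are assumptions for illustration, following the usual ggplot2 extension pattern of a ggproto Stat plus a layer() wrapper.

```r
library(ggplot2)

# Hypothetical stat: collapses each group to a single (mean x, mean y) row.
StatMeans <- ggproto("StatMeans", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    data.frame(x = mean(data$x), y = mean(data$y))
  }
)

# Stat-first convenience layer; draws a point by default.
geom_means <- function(mapping = NULL, data = NULL, geom = "point",
                       position = "identity", ..., na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE) {
  layer(
    stat = StatMeans, geom = geom, data = data, mapping = mapping,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

# Alias that spells out the geom, as in the naming list above.
geom_means_point <- geom_means

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_means_point(size = 4)
```

Label and text variants would presumably pass a different geom to the same wrapper, but they also need a label aesthetic that survives the stat, so they are left out of this sketch.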
Firstly, awesome package!!
I think geom_xy_means should be singular, as it can be 1 point. This would also be consistent with the naming of the rest of your mean functions. I also think that the *_segments functions should prob be singular too. https://design.tidyverse.org/function-names.html
I understand why you've made the residuals one plural