Open john-d-fox opened 2 years ago
I planned to write almost exactly the same thing.
Although very efficient, the function tapply() can be quite cryptic for many users, especially when splitting by more than one factor, in which case the grouping argument (INDEX) has to be a list.
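For reference, the list form mentioned above might look like the following in base R. This is only a minimal sketch: the built-in mtcars data and the name mpg_by_cyl_gear are chosen here for illustration and do not come from the thread.

```r
# Mean mpg for every cylinder/gear combination in the built-in mtcars data.
# With more than one grouping factor, INDEX must be a list of factors.
mpg_by_cyl_gear <- tapply(mtcars$mpg,
                          list(cyl = mtcars$cyl, gear = mtcars$gear),
                          mean)

# Rows are cylinder counts, columns are gear counts.
print(round(mpg_by_cyl_gear, 1))
```

Combinations that do not occur in the data (for example, 8 cylinders with 4 gears) come back as NA.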
There is an alternative in the most recent version of package admisc (0.30), which I find a lot more intuitive and easier to remember:
using(airquality, mean(Ozone, na.rm = TRUE), split.by = Month)
mean
5 23.615
6 29.444
7 59.115
8 59.962
9 31.448
Additionally, instead of:
mtcars$gear_char <-
  ifelse(mtcars$gear == 3,
         "three",
         ifelse(mtcars$gear == 4,
                "four",
                "five")
  )
this is arguably also more intuitive:
mtcars$gear_char <- recode(mtcars$gear, "3 = three; 4 = four; 5 = five")
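For a version that stays within base R, a named lookup vector is one possible alternative to the nested ifelse() calls. This is a sketch only; the names cars and gear_labels are hypothetical and introduced here for illustration.

```r
# Work on a copy so the built-in mtcars data stays untouched.
cars <- mtcars

# Map each gear count (as a character key) to its label.
gear_labels <- c("3" = "three", "4" = "four", "5" = "five")

# Index the lookup vector by the gear values; unname() drops the keys.
cars$gear_char <- unname(gear_labels[as.character(cars$gear)])

table(cars$gear_char)
```

Any gear value missing from the lookup vector would become NA rather than silently falling into the last category, which is one difference from the nested ifelse() form.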
I think the object was to do this without loading non-base-R packages. If that requirement is relaxed, there's also the Tapply() function in the car package, which provides a formula interface to tapply().
Indeed.
To me, "base R" is anything not related to the tidyverse dialect, using classic, traditional R code. The base package surely cannot do everything, and comparing it (alone) with the whole tidyverse is more than unfair.
It's my impression that 'base R' typically refers not just to the base package but to the R packages loaded by default at start-up or the packages in the standard R distribution.
You are correct, that should be the interpretation of 'base R'. But even so, the tidyverse dialect is orders of magnitude bigger, so I believe the comparison is still unfair unless it also includes contributed packages written in standard R code. If my understanding is correct, the point of the TidyverseSkeptic is to make a fair comparison between 'traditional' R and the tidyverse dialect.
Great discussions, and again, sorry I'm late to it. I just today looked at the Issues posts.
Once again, though, my overriding goal is to make things easy for beginners. That excludes using other packages, for instance.
As to tapply(), I'm not offering it as a panacea, just something I think is easier for noncoders to learn and use.
If tapply() doesn't quite work, I recommend that beginners--the horror!--write a loop.
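As a rough idea of what such a loop could look like, here is a sketch that computes the mean Ozone per month in the built-in airquality data. The variable names are illustrative, and dropping NAs with na.rm = TRUE is an assumption made here.

```r
# Group means with an explicit loop instead of tapply().
months <- sort(unique(airquality$Month))

# Pre-allocate the result, one slot per month, labelled by month number.
ozone_means <- numeric(length(months))
names(ozone_means) <- months

for (i in seq_along(months)) {
  in_month <- airquality$Month == months[i]   # rows belonging to this month
  ozone_means[i] <- mean(airquality$Ozone[in_month], na.rm = TRUE)
}

print(round(ozone_means, 3))
```

The loop is longer than the tapply() call, but every step is visible, which may be the point for beginners.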
Dear Norm,
I've enjoyed the various versions of your tidyverse critique and largely agree with it. I noticed the following error in the current version, which I don't believe has been flagged before:
Your tapply() example doesn't handle NAs consistently. The following would be consistent with the tidyverse solution and aggregate():
Though it requires more explanation, it encourages what I believe to be a better habit.
Best, John
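To illustrate the kind of NA inconsistency described above, here is a sketch using the built-in airquality data. It is not the code from the original post; the data set and the choice of na.rm = TRUE are assumptions made for illustration.

```r
# Default tapply(): any NA in a group makes that group's mean NA.
tapply(airquality$Ozone, airquality$Month, mean)

# aggregate() with a formula drops incomplete rows first (na.action = na.omit).
by_aggregate <- aggregate(Ozone ~ Month, data = airquality, FUN = mean)

# Passing na.rm = TRUE through tapply() gives the same group means.
by_tapply <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)

all.equal(as.numeric(by_tapply), by_aggregate$Ozone)
```

The two approaches agree only once NA handling is made explicit in the tapply() call, so it seems worth stating it every time.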