Open xiongchiamiov opened 10 years ago
This deserves mention! I've been avoiding these because I hate them and I haven't figured them out yet; all the R I've been writing has involved processing data frames with plyr, which sidesteps the issue. :| Thanks for the primer.
There are a variety of `apply`-type functions available in R. Here's what I've figured out so far:

`lapply` and `sapply` both loop over a list; for each element in the list, they call the function with the element as a parameter.

R: `sapply(list(c(1, 3, 5), c(2, 4, 2)), sum)`

Python: `map(sum, [[1, 3, 5], [2, 4, 2]])`
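One caveat on the Python side of that comparison: in Python 3, `map` returns a lazy iterator rather than a list, so it needs wrapping in `list()` before it shows results the way `sapply` does. A minimal sketch, with variable names that are mine rather than anything from R:

```python
# Rough Python 3 analogue of sapply(list(c(1, 3, 5), c(2, 4, 2)), sum).
groups = [[1, 3, 5], [2, 4, 2]]

# map() is lazy in Python 3, so materialize it with list().
totals = list(map(sum, groups))

# A list comprehension is the more common spelling of the same loop.
totals_comprehension = [sum(group) for group in groups]

print(totals)  # [9, 8]
```

Either spelling calls `sum` once per inner list, which is the same shape of loop that `lapply` and `sapply` run.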
`lapply` will always return a list, while `sapply` attempts to simplify the result to a more concise object, such as a vector or matrix (since lists are not as concise as I'm used to in other languages).

`mapply` is `sapply` with multiple arguments passed to the function: it walks the arguments in parallel and calls the function with one element from each.

R: `mapply(sum, list(c(1, 3, 5), c(2, 4, 2)), list(10, 100))`

Python: `map(sum, [[1, 3, 5], [2, 4, 2]], [10, 100])`
This means that `mapply` with only the function and one argument can be used as a replacement for `sapply`. Note that the order of the arguments is changed: `sapply` takes the data first and the function second, while `mapply` takes the function first. As far as I can tell, this is intended to make the functions maximally confusing.
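The Python half of the `mapply` comparison works because `sum` accepts an optional second argument, its starting value. Here is a small sketch of that pairwise behaviour, again with made-up variable names, and with `zip` shown as what I believe is the more explicit equivalent:

```python
groups = [[1, 3, 5], [2, 4, 2]]
offsets = [10, 100]

# map() with two iterables calls sum(group, offset) pairwise;
# the second argument to sum() is its starting value.
pairwise = list(map(sum, groups, offsets))

# The same thing spelled out with zip().
pairwise_zip = [sum(group, offset) for group, offset in zip(groups, offsets)]

print(pairwise)  # [19, 108]
```

Unlike the R functions, Python's `map` takes the function first whether it is given one iterable or several.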
Next, `apply` applies a function to what R refers to as an array, which looks much more like a matrix to me. It can apply the function either to entire rows or to entire columns; for some obscure reason, applying to rows requires `MARGIN` to be set to 1, while applying to columns requires it to be 2.

Given `mtcars`,

R: `apply(mtcars, 2, mean)['mpg']`

SQL: `select avg(mpg) from mtcars;`
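To keep the Python comparisons going, the row-versus-column distinction can be sketched with a plain list of lists. This is only an approximation of what `apply` does, and the numbers are a toy table I made up, not the real `mtcars` data:

```python
from statistics import mean

# Toy 3x2 table (made-up numbers, not the real mtcars data).
rows = [
    [21.0, 6],
    [22.5, 4],
    [18.0, 8],
]

# Like MARGIN = 1: one result per row.
row_means = [mean(row) for row in rows]

# Like MARGIN = 2: zip(*rows) transposes, giving one result per column.
col_means = [mean(col) for col in zip(*rows)]

print(row_means)  # [13.5, 13.25, 13.0]
print(col_means)  # [20.5, 6]
```

The R example above takes the column route (`MARGIN` of 2) and then picks out the `mpg` entry by name.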
Finally, `tapply` groups one thing by another, then applies a function to the groups.

Again using `mtcars`,

R: `tapply(mtcars[['mpg']], mtcars[['cyl']], mean)`

SQL: `select cyl, avg(mpg) from mtcars group by cyl;`
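For symmetry with the earlier Python comparisons, here is a rough standard-library sketch of the same group-then-apply idea. The two lists are invented stand-ins for the `mpg` and `cyl` columns, not values from `mtcars`:

```python
from collections import defaultdict
from statistics import mean

# Made-up values standing in for mtcars[['mpg']] and mtcars[['cyl']].
mpg = [21.0, 23.0, 19.0, 15.0, 25.0]
cyl = [6, 4, 6, 8, 4]

# Bucket each mpg value under its cyl value.
groups = defaultdict(list)
for value, key in zip(mpg, cyl):
    groups[key].append(value)

# Apply the function to each bucket, ordered by group key.
mean_by_cyl = {key: mean(values) for key, values in sorted(groups.items())}

print(mean_by_cyl)  # {4: 24.0, 6: 20.0, 8: 15.0}
```

The first list plays the role of the values being summarised and the second the grouping variable, matching the argument order in the `tapply` call.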