tdsmith / aRrgh

A newcomer's (angry) guide to data types in R

*apply functions #18

Open xiongchiamiov opened 10 years ago

xiongchiamiov commented 10 years ago

There are a variety of apply-type functions available in R. Here's what I've figured out so far:

lapply and sapply both loop over a list (or vector); for each element, they call the function with that element as its argument.

R: sapply(list(c(1, 3, 5), c(2, 4, 2)), sum)
Python: list(map(sum, [[1, 3, 5], [2, 4, 2]]))

lapply always returns a list, while sapply tries to simplify the result into a vector or matrix where it can (R lists are not as concise as the lists I'm used to in other languages).
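A small sketch of the difference, reusing the list from the sapply example above. The variable names (`x`, `as_list`, `simplified`, `ranges`) are mine, not anything R requires.

```r
x <- list(c(1, 3, 5), c(2, 4, 2))

# lapply always returns a list, one element per input element
as_list <- lapply(x, sum)
str(as_list)

# sapply simplifies to a plain numeric vector here,
# because every result has length 1
simplified <- sapply(x, sum)
print(simplified)

# When every result has the same length greater than 1,
# sapply returns a matrix with one column per input element
ranges <- sapply(x, range)
print(dim(ranges))
```

If the results have different lengths, sapply cannot simplify and hands back a list, just as lapply would.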


mapply is sapply with multiple arguments passed to the function: it walks several lists in parallel and passes the i-th element of each one to the function.

R: mapply(sum, list(c(1, 3, 5), c(2, 4, 2)), list(10, 100))
Python: list(map(sum, [[1, 3, 5], [2, 4, 2]], [10, 100]))

This means that mapply with only the function and one argument can be used as a replacement for sapply:

> sapply(list(c(1, 3, 5), c(2, 4, 2)), sum)
[1] 9 8
> mapply(sum, list(c(1, 3, 5), c(2, 4, 2)))
[1] 9 8

Note that the order of the arguments is reversed: sapply takes the data first and the function second, while mapply takes the function first. As far as I can tell, this is intended to make the functions maximally confusing.
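To make the "multiple arguments" part concrete, here is a minimal sketch of mapply pairing up elements from two lists. The names `totals`, `totals2`, `values` and `offset` are made up for illustration.

```r
# mapply walks its list arguments in parallel: the i-th call receives
# the i-th element of every list
totals <- mapply(sum, list(c(1, 3, 5), c(2, 4, 2)), list(10, 100))
print(totals)  # results of sum(c(1, 3, 5), 10) and sum(c(2, 4, 2), 100)

# The same computation with an explicit two-argument function
totals2 <- mapply(function(values, offset) sum(values) + offset,
                  list(c(1, 3, 5), c(2, 4, 2)),
                  list(10, 100))
print(totals2)
```

Writing the function out by hand makes it easier to see which list feeds which argument.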


Next, apply applies a function to what R refers to as an array, which looks much more like a matrix to me (a matrix is the two-dimensional case, and a data frame such as mtcars is converted to a matrix first). It can apply the function either to entire rows or to entire columns; for some obscure reason, applying to rows requires MARGIN to be set to 1, while applying to columns requires it to be 2.

Given

library(datasets)
data(mtcars)

R: apply(mtcars, 2, mean)['mpg']
SQL: select avg(mpg) from mtcars;
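A tiny matrix makes the MARGIN values easier to see. This is only a sketch; `m`, `row_sums` and `col_sums` are names I picked. One way to remember the numbering is that it follows the index order in `m[row, column]`: rows are dimension 1, columns are dimension 2.

```r
m <- matrix(1:6, nrow = 2)  # 2 rows, 3 columns, filled column by column
print(m)

row_sums <- apply(m, 1, sum)  # MARGIN = 1: one result per row
col_sums <- apply(m, 2, sum)  # MARGIN = 2: one result per column
print(row_sums)
print(col_sums)

# For the mtcars example, the built-in colMeans gives the same mean
# without going through apply
print(colMeans(mtcars)[["mpg"]])
```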


Finally, tapply groups one thing by another, then applies a function to the groups.

Again using mtcars,

R: tapply(mtcars[['mpg']], mtcars[['cyl']], mean)
SQL: select cyl, avg(mpg) from mtcars group by cyl;
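A short sketch of what comes back from that tapply call: the result is labelled by group, so a single group's value can be looked up by name. The names `mpg_by_cyl` and `by_hand` are mine.

```r
mpg_by_cyl <- tapply(mtcars[['mpg']], mtcars[['cyl']], mean)

# The group labels are the distinct values of cyl, as strings
print(names(mpg_by_cyl))

# Look up one group by its label
print(mpg_by_cyl["4"])

# Cross-check that group by filtering and averaging by hand
by_hand <- mean(mtcars$mpg[mtcars$cyl == 4])
print(by_hand)
```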

tdsmith commented 10 years ago

This deserves mention! I've been avoiding these because I hate them and I haven't figured them out yet; all the R I've been writing has involved processing data frames with plyr, which sidesteps the issue. :| Thanks for the primer.