Open dpastoor opened 9 years ago
OK, a user with Python familiarity might google "python zip in r". apply() and variants are covered in the SWC R Novice lesson, so perhaps a beginner would think about apply() functions generally, and mapply() is in the reference material. mapply() can also be found via the "See also" part of the lapply() documentation, but not apply().
The 2nd result from searching "apply list of functions to columns of data frame" gets you to a StackOverflow mapply() answer. You need to include "columns" because "r apply list of functions to data frame" doesn't get you there. In general it seems that mapply() is mentioned less in various pages that talk about apply(), sapply(), and lapply().
frankly, even after playing with examples, mapply can be tough to grasp, IMO. The best chance I would think a user would have in figuring this out is understanding
In that case, as long as max speed or terseness wasn't an issue, there are a multitude of 'easy' ways to solve it.
funs <- list(fun1, fun2, fun3)
# call the i-th function on the i-th column; funs[[i]] extracts the function itself
for (i in seq_along(df)) { df[[i]] <- funs[[i]](df[[i]]) }
but for a beginner with a non-coding background, functions as objects especially is not something I would anticipate someone picking up naturally, and it looks like it isn't covered in the SWC novice lessons (not surprisingly)
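A minimal runnable sketch of the loop approach, to make the "functions as objects" idea concrete. The data frame and the three functions (double_it, add_ten, center_it) are made up for illustration and are not from the original question.

```r
# Toy data frame; column names and values are hypothetical.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# Three made-up functions, one per column.
double_it <- function(x) x * 2
add_ten   <- function(x) x + 10
center_it <- function(x) x - mean(x)

# Functions are ordinary objects, so they can be stored in a list.
funs <- list(double_it, add_ten, center_it)

# funs[[i]] extracts the i-th function itself;
# funs[i] would return a length-one list, which cannot be called.
result <- df
for (i in seq_along(result)) {
  result[[i]] <- funs[[i]](result[[i]])
}
```

Working on a copy (result) leaves df untouched, which makes it easier to compare before and after.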
I would push back on this question, because it clearly doesn't scale. This is not a general novice problem: applying unique functions 1 through n to variables 1 through n in a data.frame, for n of any size. I'd ask: "what are you really trying to do?" I wouldn't just start trying to solve the problem, taking it at face value. It smells like someone describing the step, not the goal.
Completely agree - that was actually my first impression as well.
It's also really never going to be generalizable/flexible (in the sense that you'd need to write a custom function for each new column) and would likely be unusable across df's.
Though, there are some situations where you do need to leverage this concept (from reading the source a while back I believe this is how qplot/ggplot do apply various bits to each layer).
@dpastoor You're right, it's an interesting puzzle. But not a legit novice question, I suspect. So @noamross you'll have to decide how to handle this situation, since it will come up a lot I suspect in other questions. Often novices pose rather thorny programming problems but, if you peel back the onion a bit, you can design the question away by, e.g., helping them use a more natural data structure. Lots of questions about iterating over this and that, in particular, go away, once you pick the right way to store or shape the data in R.
my answer to this one, for a novice, would be "go ahead and use a loop. Why not? Speed is unlikely to be a serious problem and you'll have a lot easier time understanding what you did."
@jennybc Thanks for the meta-response! I agree. At least a few example questions should illustrate this point. I have the advantage of knowing the questioner, so I might go back and see if we can peel back the onion now and see whether this would be a good example. (Though the question is a couple of years old; they have since become quite the expert and will probably be a SWC instructor soon.)
(refers to https://github.com/noamross/zero-dependency-problems-r/blob/master/apply-functions-to-columns.md)
seems like a perfect chance to use mapply. This Stack Overflow question discusses how to use mapply like zip in Python: http://stackoverflow.com/questions/9281323/zip-or-enumerate-in-r
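A rough sketch of the zip-style approach, assuming a toy data frame and two made-up functions (none of these names come from the linked question). Map() is a thin wrapper around mapply() with SIMPLIFY = FALSE, so it walks the columns and the list of functions in parallel, much like zip in Python.

```r
# Toy data; column names and values are hypothetical.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# One made-up function per column, in column order.
funs <- list(function(x) x * 2, function(x) x + 10)

# Pair column i with function i and call the function on the column.
# Passing df first means the output list takes its names from the columns.
paired <- Map(function(col, f) f(col), df, funs)

# result[] <- keeps the data frame structure while replacing every column.
result <- df
result[] <- paired
```

The same call can be written as mapply(function(col, f) f(col), df, funs, SIMPLIFY = FALSE); without SIMPLIFY = FALSE, mapply() may collapse equal-length results into a matrix.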