tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

rename mutate! #1143

Closed Kaggle-Meetup-NYC closed 9 years ago

Kaggle-Meetup-NYC commented 9 years ago

I love, love, love dplyr -- let me say that first. But I object to using "mutate" to describe the addition of a new variable to a data frame based on the manipulation of other variables. "Mutate" is a synonym for "change" (but with a somewhat negative connotation), but in dplyr the command doesn't just change a dataset, it adds to it. In biology, for example, a mutation (usually) leaves the number of chromosomes intact; it just rearranges them so that they perform some new function. And given Hadley's (correct, in my view) aversion to changing data in place, the fact that the command adds a new column to the dataframe rather than changing what's already there is quite important. So I think the name is a little icky and gives the wrong idea about what the function actually does.

I would propose another member of the "-ate" family as a replacement: "generate". That has a positive connotation and, more importantly, denotes the addition of something, rather than just changing what's already there. It's perfect!

Perhaps there's a namespace conflict with "generate" being used by some other package (it's not in base R), but even so, I definitely think you should commandeer it for use in dplyr; it just makes sense.

Humbly submitted,

David Epstein

PS -- I just noticed that I'm submitting my recommendation under the Kaggle-Meetup-NYC account, but this has nothing to do with Kaggle, just my own personal opinion.

yanlinlin82 commented 9 years ago

I guess mutate is a name which could avoid conflicts. If it were renamed, why not just use a more neutral word such as change or modify?

Another, simpler proposal would be set and/or unset:

df %>% set(newCol = 1)
df %>% unset(newCol)
df %>% set(newCol = NULL)

hadley commented 9 years ago

You can also use mutate to change existing variables and to delete them (by setting to NULL).

Regardless, this ship has sailed - it's too late to change the name now.
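
For readers following along, the three uses of mutate mentioned above (adding a variable, changing an existing one, and deleting one by setting it to NULL) can be sketched roughly as follows. The data frame and the column names x, y and z are made up for illustration and do not come from the thread; the sketch assumes a reasonably recent dplyr is installed.

```r
library(dplyr)

# Hypothetical example data; not from the original discussion
df <- data.frame(x = 1:3, y = c(10, 20, 30))

# Add a new column computed from existing ones
added <- df %>% mutate(z = x + y)

# Change an existing column by reusing its name on the left-hand side
changed <- df %>% mutate(y = y * 2)

# Delete a column by setting it to NULL
dropped <- df %>% mutate(y = NULL)

names(added)    # "x" "y" "z"
changed$y       # 20 40 60
names(dropped)  # "x"
```

In each case mutate returns a new data frame and df itself is left untouched, which fits the aversion to changing data in place raised in the original post.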