tidyverse / tidyr

Tidy Messy Data
https://tidyr.tidyverse.org/

Add '.id' argument to unnest #125

cashoes closed this issue 8 years ago

cashoes commented 9 years ago

When calling unnest() on a named list column, preserve the list names as an additional column. The name of that column should be specified by the .id argument.

l <- lapply(1:5, function(x) runif(5))
names(l) <- LETTERS[1:5]

df <- data.frame(x = 1:5, y = I(l))

unnest(df, y, .id = 'names')

# Source: local data frame [25 x 3]
# 
#        x names         y
#    (int) (chr)     (dbl)
#1      1     A 0.6019313
#2      1     A 0.6589553
#3      1     A 0.8539686
#4      1     A 0.3740433
#5      1     A 0.6602022
#6      2     B 0.2701723
#7      2     B 0.7914102
#8      2     B 0.6098436
#9      2     B 0.6916693
#10     2     B 0.3024860
# ..   ...   ...       ...
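
A rough base R sketch of the behaviour requested above, for illustration only. `unnest_with_id` is a hypothetical helper, not part of tidyr, and it assumes a single list column whose elements are atomic vectors.

```r
# Hypothetical helper (not tidyr API): expand one list column and
# record the name of each list element in a new column called `.id`.
unnest_with_id <- function(df, col, .id) {
  values <- df[[col]]
  lens <- lengths(values)
  # Repeat every other column once per element of the matching list entry.
  rest <- df[rep(seq_len(nrow(df)), lens), setdiff(names(df), col), drop = FALSE]
  # Repeat each list name to line up with the expanded rows.
  rest[[.id]] <- rep(names(values), lens)
  rest[[col]] <- unlist(values, use.names = FALSE)
  rownames(rest) <- NULL
  rest
}

l <- lapply(1:5, function(x) runif(5))
names(l) <- LETTERS[1:5]
df <- data.frame(x = 1:5, y = I(l))

out <- unnest_with_id(df, "y", .id = "names")
head(out)
```

The result has 25 rows and the columns x, names, y, matching the layout shown in the expected output.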

hadley commented 9 years ago

This seems like a very reasonable parallel to dplyr::bind_rows().
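
For comparison, a small example of the `.id` argument of dplyr::bind_rows() that the proposal mirrors. The data here is made up for illustration.

```r
library(dplyr)

# A named list of data frames; the names become the values of the id column.
pieces <- list(
  A = data.frame(y = c(0.1, 0.2)),
  B = data.frame(y = c(0.3, 0.4))
)

combined <- bind_rows(pieces, .id = "names")
combined
```

Each row of `combined` carries the name of the list element it came from, which is what the requested `.id` argument would do for a named list column.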

hadley commented 8 years ago

Hmmm, what happens if you're unnesting two named columns?

l <- lapply(1:5, function(x) runif(5))
names(l) <- LETTERS[1:5]

df <- dplyr::data_frame(x = 1:5, y = l, z = l)
unnest(df, y, z, .id = "names")

lionel- commented 8 years ago

.id could be a vector with one entry per column being unnested.

hadley commented 8 years ago

What if you want to use the names from one, but not another? I guess it could be a named vector, or you could use NAs.

lionel- commented 8 years ago

You can also unnest in multiple steps in that case.

hadley commented 8 years ago

unnest() currently takes a single .id string. De-duplication seems to belong in #184 (and if that's not enough, you can always do multiple unnests).
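
For the question above about taking names from one column but not another, one possible approach is to copy the names into an ordinary column before unnesting. This is a sketch that assumes tidyr 1.0.0 or later, where `.id` is deprecated and the documentation suggests `mutate(id = names(x))` followed by `unnest(x)`. The column names `y`, `z`, and `names` are illustrative.

```r
library(dplyr)
library(tidyr)

l <- lapply(1:5, function(x) runif(5))
names(l) <- LETTERS[1:5]

# Two named list columns; only the names of `y` are kept.
df <- tibble(x = 1:5, y = l, z = l)

out <- df %>%
  mutate(names = names(y)) %>%  # copy the list names into a regular column
  unnest(c(y, z))               # unnest both list columns in parallel

out
```

Because the names are captured explicitly, the choice of which column supplies them is made in the `mutate()` call rather than by `unnest()` itself.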