nathaneastwood / poorman

A poor man's dependency free grammar of data manipulation
https://nathaneastwood.github.io/poorman/

[FEAT] Add nest(), nest_by() and unnest() #44

Closed nathaneastwood closed 4 years ago

nathaneastwood commented 4 years ago
jonocarroll commented 3 years ago

In case you're still considering supporting unnest(), here's a base equivalent I'm currently playing with

# nest all columns other than cyl and am into 'value'
x <- tidyr::nest(mtcars, value = -c(cyl, am))

#' tidyr-free unnest
#' 
#' @param data nested tibble
#' @param cols string (max 1) identifying the nested column to be unnested
#' 
#' @return unnested tibble, equivalent to tidyr::unnest(data, cols)
unnest <- function(data, cols) {
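  # split off the nested list-column from the remaining columns
  not_nested <- data[names(data) != cols]
  nested <- data[cols]
  # repeat each outer row once per row of its nested data frame
  n_rows <- vapply(nested[[cols]], nrow, integer(1))
  not_nested_expand <- not_nested[rep(seq_len(nrow(data)), n_rows), , drop = FALSE]
  # stack the nested data frames and bind them to the expanded outer rows
  nested_expand <- do.call(rbind, nested[[cols]])
  res <- cbind(not_nested_expand, nested_expand)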
  class(res) <- class(data)
  res
}

identical(tidyr::unnest(x, value), unnest(x, "value"))
#> [1] TRUE

It's not necessarily the way you're approaching these things, and it's almost certainly not performant, but it works on this example, so it's a start.
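
For completeness, the nesting step itself can also be sketched without tidyr. This is only a rough illustration, not a tested equivalent of tidyr::nest(): `nest_base()`, its `by` and `name` arguments, and the choice to return a plain data.frame rather than a tibble are all assumptions made for this example.

```r
# Hypothetical helper (not part of tidyr or poorman): nest every column
# except those named in `by` into a list-column called `name`.
nest_base <- function(data, by, name = "value") {
  # one group id per distinct combination of the `by` columns,
  # numbered in order of first appearance
  key <- do.call(paste, c(unname(as.list(data[by])), sep = "\r"))
  group <- match(key, unique(key))
  # keep the first row of each group as the outer row
  outer <- data[!duplicated(group), by, drop = FALSE]
  rownames(outer) <- NULL
  # split the remaining columns by group and store them as a list-column
  inner <- split(data[setdiff(names(data), by)], group)
  outer[[name]] <- unname(inner)
  outer
}

nested <- nest_base(mtcars, by = c("cyl", "am"))
nrow(nested)                                  # one row per distinct (cyl, am) pair
sum(vapply(nested$value, nrow, integer(1)))   # total rows held in the list-column
```

Groups come out in order of first appearance, and the result is a base data.frame, so it will not be identical() to the tibble that tidyr returns.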

nathaneastwood commented 3 years ago

Hi @jonocarroll, thanks for the input! This looks really great. I will take a closer look when I get a chance. I closed this issue mostly because it felt like nest_by() kind of does what nest() does, but also because I don't yet have rowwise() calculations implemented, so nested column structures are a little pointless right now in {poorman}. I have a Google Summer of Code project that will hopefully get backed, though, and the aim is for the student to work on that.

One thing I would say is the implementation in {tidyr} uses ... and tidy-select semantics. {poorman} supports this so I would probably look to use that (see poorman:::dotdotdot() and the select helpers / select_positions() functionality). Would you at all be interested in submitting a PR? No pressure to do so.
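
As a rough idea of what accepting the column through ... could look like, here is a base-only sketch. It deliberately does not call poorman:::dotdotdot() or select_positions(), whose signatures are not shown in this thread; `unnest_dots()` is a hypothetical name, and it only handles a single bare column name rather than full tidy-select semantics.

```r
# Hypothetical wrapper: take the nested column as a bare name via `...`.
# Base R only; this is NOT poorman's internal dotdotdot() helper.
unnest_dots <- function(data, ...) {
  # capture the unevaluated arguments and convert each symbol to a string
  cols <- vapply(as.list(substitute(list(...)))[-1], as.character, character(1))
  stopifnot(length(cols) == 1, cols %in% names(data))
  keep <- data[names(data) != cols]
  # repeat each outer row once per row of its nested data frame
  n_rows <- vapply(data[[cols]], nrow, integer(1))
  res <- cbind(
    keep[rep(seq_len(nrow(data)), n_rows), , drop = FALSE],
    do.call(rbind, data[[cols]])
  )
  rownames(res) <- NULL
  res
}

# a small nested data frame built with base R only
d <- data.frame(g = c("a", "b"))
d$value <- list(data.frame(x = 1:2), data.frame(x = 3L))
unnest_dots(d, value)
```

Supporting helpers such as negative selections or everything() would need the package's own select machinery rather than this substitute() approach.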