tidyverse / purrr

A functional programming toolkit for R
https://purrr.tidyverse.org/

Support for the idea of `flatten_if()` and `squash()` in `list_flatten()` #1064

Open DavisVaughan opened 1 year ago

DavisVaughan commented 1 year ago

I added this to dplyr, but it probably belongs in purrr, where it would give an easier transition path away from the deprecated rlang::flatten_if() and rlang::squash() functions.

See https://github.com/r-lib/rlang/pull/1576 and https://github.com/tidyverse/dplyr/pull/6759

#' @param x A list
#' @param fn An optional function of 1 argument to be applied to each list
#'   element of `x`. This allows you to further refine what elements should be
#'   flattened. `fn` should return a single `TRUE` or `FALSE`.
#' @param recursive Should `list_flatten()` be applied recursively? If `TRUE`,
#'   it will continue to apply `list_flatten()` as long as at least one element
#'   of `x` was flattened in the previous iteration.
#' @noRd
list_flatten <- function(x, ..., fn = NULL, recursive = FALSE) {
  check_dots_empty0(...)

  obj_check_list(x)
  x <- unclass(x)

  loc <- map_lgl(x, obj_is_list)

  if (!is_null(fn)) {
    loc[loc] <- map_lgl(x[loc], fn)
  }

  not_loc <- !loc

  names <- names(x)
  if (!is_null(names)) {
    # Always prefer inner names, even if inner elements are actually unnamed.
    # This is what `rlang::flatten_if()` did, with a warning. We could also
    # use `name_spec` and `name_repair` for a more complete solution.
    names[loc] <- ""
    names(x) <- names
  }

  x[loc] <- map(x[loc], unclass)
  x[not_loc] <- map(x[not_loc], list)

  out <- list_unchop(x, ptype = list())

  if (recursive && any(loc)) {
    out <- list_flatten(out, fn = fn, recursive = TRUE)
  }

  out
}
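
To illustrate how `fn` and `recursive` are meant to interact, here is a minimal base-R sketch of the same idea. It is not the dplyr or purrr implementation: `flatten_sketch()` is a hypothetical name, it treats only bare lists (not data frames) as flattenable, and it drops outer names entirely instead of handling them the way the function above does.

```r
# Hypothetical base-R sketch; an approximation of the proposal, not purrr's API.
flatten_sketch <- function(x, fn = NULL, recursive = FALSE) {
  stopifnot(is.list(x))
  x <- unclass(x)

  # Candidates for flattening: elements that are themselves bare lists
  loc <- vapply(x, function(el) is.list(el) && !is.data.frame(el), logical(1))

  # Optionally narrow the candidates with the predicate `fn`
  if (!is.null(fn)) {
    loc[loc] <- vapply(x[loc], fn, logical(1))
  }

  # Wrap everything else in a list so concatenation keeps it as one element
  x[!loc] <- lapply(x[!loc], list)

  # Concatenate one level; the trailing list() guarantees a list result
  # even when `x` is empty. Outer names are dropped (a simplification).
  out <- do.call(c, c(unname(x), list(list())))

  # Keep going only while the previous pass flattened something
  if (recursive && any(loc)) {
    out <- flatten_sketch(out, fn = fn, recursive = TRUE)
  }

  out
}

x <- list(1, list(2, list(3)), list("a", "b"))

one_level <- flatten_sketch(x)
all_levels <- flatten_sketch(x, recursive = TRUE)

# Only flatten lists whose elements are all character vectors
all_chr <- function(el) all(vapply(el, is.character, logical(1)))
only_chr <- flatten_sketch(x, fn = all_chr)
```

With this input, `one_level` is `list(1, 2, list(3), "a", "b")`, `all_levels` is `list(1, 2, 3, "a", "b")`, and `only_chr` is `list(1, list(2, list(3)), "a", "b")`.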

olivroy commented 6 months ago

If I understand correctly, this would solve the following problem, right?

I have a list in this format (duplicate names, but similar structure):

l1 <- list(x = c(el1 = 1), x = c(el2 = 2, el3 = 3), y = c(el1 = 1))
l1
$x
el1 
  1 

$x
el2 el3 
  2   3 

$y
el1 
  1 

I'd like to apply a purrr transformation to change it to

list(x = c(el1 = 1, el2 = 2, el3 = 3), y = c(el1 = 1))

I tried using list_flatten(), but it leaves l1 unchanged.

I may have gotten lost, but couldn't find an example that does this.
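
One way to get the requested shape is to group the elements by their outer name and concatenate each group. The following is a minimal base-R sketch, not an existing purrr function: `combine_by_name()` is a hypothetical helper, and it assumes every element is an atomic vector that `c()` can combine.

```r
# Hypothetical helper; base R only.
combine_by_name <- function(x) {
  stopifnot(is.list(x), !is.null(names(x)))

  # Keep groups in order of first appearance rather than alphabetical order
  groups <- factor(names(x), levels = unique(names(x)))

  # Drop outer names before splitting so c() only sees the inner names
  pieces <- split(unname(x), groups)

  # Concatenate the vectors within each group
  lapply(pieces, function(p) do.call(c, p))
}

l1 <- list(x = c(el1 = 1), x = c(el2 = 2, el3 = 3), y = c(el1 = 1))
combined <- combine_by_name(l1)
```

Here `combined` is `list(x = c(el1 = 1, el2 = 2, el3 = 3), y = c(el1 = 1))`. The inner `do.call(c, p)` step could presumably be swapped for `purrr::list_c()` to get vctrs type checking.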

I was able to almost solve this with base R's unlist(), which gives:

unlist(l1)
x.el1 x.el2 x.el3 y.el1
    1     2     3     1 

but I don't really trust unlist(), since it handles pretty much everything in a surprising way.

unlist() is also not mentioned in the vignette https://purrr.tidyverse.org/articles/base.html, which doesn't exactly reflect the 1.0 API: it still mentions the map_df*() functions and doesn't mention the newly introduced functions.

What would be the closest purrr equivalent of unlist()?

list_c() almost works, but it would be great if it had a name_spec argument.

purrr::list_c(l1)
el1 el2 el3 el1 
  1   2   3   1 

This loses the outer names (x, x, y).

Edit: from https://github.com/tidyverse/purrr/pull/998, vctrs::list_unchop() seems to have a way to deal with names, so maybe a cross-reference could be inserted somewhere...
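
For reference, a small sketch of what that could look like with `vctrs::list_unchop()` and its `name_spec` argument, reproducing the `unlist()`-style names. This assumes vctrs (0.5.0 or later) is installed, along with glue, which vctrs needs for a glue-style `name_spec` string.

```r
library(vctrs)

l1 <- list(x = c(el1 = 1), x = c(el2 = 2, el3 = 3), y = c(el1 = 1))

# Combine outer and inner names with a "." separator, as unlist() does here.
# A glue-style `name_spec` requires the glue package to be installed.
out <- list_unchop(l1, name_spec = "{outer}.{inner}")
```

The result `out` is a double vector with names `x.el1`, `x.el2`, `x.el3`, and `y.el1`. Unlike `unlist()`, this goes through vctrs type checking, so elements with incompatible types raise an error instead of being silently coerced.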