tidyverse / purrr

A functional programming toolkit for R
https://purrr.tidyverse.org/
Other
1.27k stars 271 forks source link

Possible bug with list_transpose #1128

Closed DyfanJones closed 3 weeks ago

DyfanJones commented 2 months ago

I came across a bug in paws.common that does something similar to list_transpose. https://github.com/paws-r/paws/issues/791#issuecomment-2202681892.

When transposing a list the first element is taken to create the template (if one isn't already provided) https://github.com/tidyverse/purrr/blob/870696c7d9f3208298ea84a36d813ffd28e59e49/R/list-transpose.R#L78 If the first element is empty then list_transpose will return an empty list. Would taking the maximum length be more appropriate to avoid this?

# paws.common original implementation:
transpose_orig <- function (x) {
    if (any(found <- lengths(x) == 0)) {
        x[found] <- list(rep(list(), length.out = length(x[[1]])))
    }
    .mapply(list, x, NULL)
}

# paws.common new implementation:
transpose_new <- function (x) {
    lens <- lengths(x)
    if (any(found <- lens == 0)) {
        x[found] <- list(rep(list(), length.out = max(lens)))
    }
    .mapply(list, x, NULL)
}

obj <- list(
    var1 = list(),
    var2 = c(1,2,3)
)

transpose_orig(obj)
#> list()

transpose_new(obj)
#> [[1]]
#> [[1]]$var1
#> NULL
#> 
#> [[1]]$var2
#> [1] 1
#> 
#> 
#> [[2]]
#> [[2]]$var1
#> NULL
#> 
#> [[2]]$var2
#> [1] 2
#> 
#> 
#> [[3]]
#> [[3]]$var1
#> NULL
#> 
#> [[3]]$var2
#> [1] 3

purrr::list_transpose(obj, simplify = F)
#> list()

Created on 2024-07-02 with reprex v2.1.0

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.1 (2024-06-14) #> os macOS Sonoma 14.4.1 #> system aarch64, darwin20 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/London #> date 2024-07-02 #> pandoc 3.2.1 @ /opt/homebrew/bin/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> cli 3.6.3 2024-06-21 [1] RSPM (R 4.4.0) #> digest 0.6.36 2024-06-23 [1] RSPM (R 4.4.0) #> evaluate 0.24.0 2024-06-10 [1] RSPM (R 4.4.0) #> fastmap 1.2.0 2024-05-15 [1] RSPM (R 4.4.0) #> fs 1.6.4 2024-04-25 [1] RSPM (R 4.4.0) #> glue 1.7.0 2024-01-09 [1] RSPM (R 4.4.0) #> htmltools 0.5.8.1 2024-04-04 [1] RSPM (R 4.4.0) #> knitr 1.47 2024-05-29 [1] RSPM (R 4.4.0) #> lifecycle 1.0.4 2023-11-07 [1] RSPM (R 4.4.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0) #> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.4.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.4.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.4.0) #> R.oo 1.26.0 2024-01-24 [1] CRAN (R 4.4.0) #> R.utils 2.12.3 2023-11-18 [1] CRAN (R 4.4.0) #> reprex 2.1.0 2024-01-11 [1] RSPM (R 4.4.0) #> rlang 1.1.4 2024-06-04 [1] RSPM (R 4.4.0) #> rmarkdown 2.27 2024-05-17 [1] RSPM (R 4.4.0) #> sessioninfo 1.2.2 2021-12-06 [1] RSPM (R 4.4.0) #> styler 1.10.3 2024-04-07 [1] RSPM (R 4.4.0) #> vctrs 0.6.5 2023-12-01 [1] RSPM (R 4.4.0) #> withr 3.0.0 2024-01-16 [1] RSPM (R 4.4.0) #> xfun 0.45 2024-06-16 [1] RSPM (R 4.4.0) #> yaml 2.3.8 2023-12-11 [1] RSPM (R 4.4.0) #> #> [1] /Users/dyfanjones/Library/R/arm64/4.4/library #> [2] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
hadley commented 1 month ago

I like this idea: it's a simple way to generally improve the default performance of list_transpose().

If someone tackles this for TDD, it'll need some documentation updates and a test.