r-lib / vctrs

Generic programming with typed R vectors
https://vctrs.r-lib.org

`vec_rbind` coerces POSIXlt to POSIXct #1930

Closed Moohan closed 2 months ago

Moohan commented 2 months ago

I was using purrr::list_rbind(), but the behaviour (or bug) seems to come from vctrs. When data frames containing a POSIXlt column are combined, that column is coerced to POSIXct, which I did not expect.

Reprex:

```r
library(tibble)

df_list <- list(
  df1 = tibble(
    dates_lt = as.POSIXlt(Sys.time()),
    dates_ct = as.POSIXct(Sys.time())
  ),
  df2 = tibble(
    dates_lt = as.POSIXlt(Sys.time() - 60),
    dates_ct = as.POSIXct(Sys.time() - 60)
  )
)

str(purrr::list_rbind(df_list))
#> tibble [2 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ dates_lt: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
#>  $ dates_ct: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
str(dplyr::bind_rows(df_list))
#> tibble [2 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ dates_lt: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
#>  $ dates_ct: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
str(vctrs::vec_rbind(!!!df_list))
#> tibble [2 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ dates_lt: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
#>  $ dates_ct: POSIXct[1:2], format: "2024-04-24 15:02:37" "2024-04-24 15:01:37"
```

DavisVaughan commented 2 months ago

We push very hard towards POSIXct over POSIXlt, because the internal representation of POSIXlt is quite convoluted and memory-heavy. See also https://github.com/r-lib/vctrs/issues/1576#issuecomment-1160511726.
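
To make the difference in representation concrete, here is a small base R sketch. The timestamp and the names `x_ct` and `x_lt` are made up for illustration, and the exact set of list components can vary between R versions.

```r
# Illustrative timestamp (an assumption, not taken from the issue)
x_ct <- as.POSIXct("2024-04-24 15:02:37", tz = "UTC")
x_lt <- as.POSIXlt(x_ct)

# POSIXct is a single double: seconds since the Unix epoch
typeof(x_ct)
unclass(x_ct)

# POSIXlt is a list of broken-down fields (sec, min, hour, mday, mon, year, ...)
typeof(x_lt)
names(unclass(x_lt))

# Rough size comparison for a single date-time
object.size(x_ct)
object.size(x_lt)
```

For one value the difference is small in absolute terms, but the list-of-fields layout is what makes POSIXlt heavier and more awkward to handle generically.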

We made a design decision long ago that <POSIXlt> + <POSIXlt> = <POSIXct> is the best way to push the more "standard" date-time class of POSIXct, even if that feels slightly weird from a purity standpoint.
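
The rule can be checked directly on plain vectors, without any data frames involved. This is a minimal sketch assuming vctrs is installed; the timestamps and the names `lt1`, `lt2`, `ptype`, and `combined` are illustrative only.

```r
library(vctrs)

# Two POSIXlt inputs (illustrative values)
lt1 <- as.POSIXlt("2024-04-24 15:02:37", tz = "UTC")
lt2 <- as.POSIXlt("2024-04-24 15:01:37", tz = "UTC")

# The common type vctrs computes for <POSIXlt> and <POSIXlt>
ptype <- vec_ptype2(lt1, lt2)
class(ptype)

# Combining follows the common type, so per the design decision
# described above the result should be POSIXct rather than POSIXlt
combined <- vec_c(lt1, lt2)
class(combined)
```

The same common-type rule is what `vec_rbind()` applies column by column, which is why the `dates_lt` column in the reprex comes back as POSIXct.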

If you really really want to avoid that, then you will likely have to use something other than vctrs or the tooling that builds on it to retain POSIXlt.
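
As one illustration of what stepping outside vctrs could look like, base R's `c()` keeps POSIXlt when given POSIXlt vectors, and `as.POSIXlt()` can convert a POSIXct result back afterwards. This is only a sketch of possible options, not something proposed in the thread; the values and variable names are made up.

```r
# Illustrative POSIXlt inputs (assumed values)
lt1 <- as.POSIXlt("2024-04-24 15:02:37", tz = "UTC")
lt2 <- as.POSIXlt("2024-04-24 15:01:37", tz = "UTC")

# Option 1: combine the vectors with base c(), which dispatches to
# the POSIXlt method and returns a POSIXlt vector
combined_lt <- c(lt1, lt2)
class(combined_lt)

# Option 2: accept the POSIXct result from a vctrs-based tool and
# convert back explicitly afterwards
combined_ct <- as.POSIXct(c("2024-04-24 15:02:37", "2024-04-24 15:01:37"), tz = "UTC")
restored_lt <- as.POSIXlt(combined_ct)
class(restored_lt)
```

Note that base `data.frame()` itself converts POSIXlt columns to POSIXct, so keeping POSIXlt in tabular data generally takes deliberate effort.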