tidyverse / tibble

A modern re-imagining of the data frame
https://tibble.tidyverse.org/

as_tibble.data.frame() treats attr()-type attributes inconsistently: strips "n" but not others. #1573

Closed krivit closed 6 months ago

krivit commented 6 months ago

as_tibble.data.frame() appears to strip some custom attributes. So far, I've only noticed this for n, but there may be others. I think it should either strip all the non-standard attributes or preserve them all, and document the behaviour either way.

library(tibble)
df <- data.frame(x=1:2, y=3:4)
attr(df, "n") <- 5
attr(df, "m") <- 6

attributes(df) # data.frame has m and n attributes.
#> $names
#> [1] "x" "y"
#> 
#> $class
#> [1] "data.frame"
#> 
#> $row.names
#> [1] 1 2
#> 
#> $n
#> [1] 5
#> 
#> $m
#> [1] 6

attributes(as_tibble(df)) # tibble has m but not n.
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"
#> 
#> $row.names
#> [1] 1 2
#> 
#> $names
#> [1] "x" "y"
#> 
#> $m
#> [1] 6

Created on 2024-04-02 with reprex v2.1.0
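
The comparison above can be automated to check which custom attributes survive the conversion. This is a minimal sketch, not part of tibble: `custom_attrs()` is a hypothetical helper, and the set of attributes it treats as standard (`names`, `class`, `row.names`) is an assumption. The result depends on the installed tibble version, so no output is shown.

```r
library(tibble)

# Hypothetical helper (not part of tibble): names of all attributes other
# than the three that every data frame carries.
custom_attrs <- function(x) {
  setdiff(names(attributes(x)), c("names", "class", "row.names"))
}

df <- data.frame(x = 1:2, y = 3:4)
attr(df, "n") <- 5
attr(df, "m") <- 6

# Custom attributes present before the conversion but missing after it.
dropped <- setdiff(custom_attrs(df), custom_attrs(as_tibble(df)))
dropped
```

An empty result would mean every custom attribute was preserved; any names listed were stripped by the conversion.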

qmarcou commented 6 months ago

It seems dropping attributes was under discussion last year (see #769). I also just ran into a problem with leftover attributes after conversion to a tibble, which breaks some automated testing utilities. For instance, converting a tibble to a data.table and back breaks equality checks with testthat:

library(tibble)
library(magrittr)  # provides %>%
test_df = tibble(a = c(1,2,3))
converted_df = test_df %>% data.table::as.data.table() %>% tibble::as_tibble()
testthat::expect_equal(test_df, converted_df)
# Error: `test_df` (`actual`) not equal to `converted_df` (`expected`).

# `attr(actual, '.internal.selfref')` is absent
# `attr(expected, '.internal.selfref')` is a pointer

This is due to the '.internal.selfref' attribute that data.table adds. I think this behaviour is undesirable.
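
One possible workaround, sketched here with base R's `attr<-` only, is to remove '.internal.selfref' by hand before comparing. The name `cleaned_df` is illustrative, and the final expectation assumes this attribute is the only difference, as the error message above suggests.

```r
library(tibble)

test_df <- tibble(a = c(1, 2, 3))
converted_df <- as_tibble(data.table::as.data.table(test_df))

# Remove the data.table bookkeeping attribute; assigning NULL does nothing
# if the attribute is already absent.
cleaned_df <- converted_df
attr(cleaned_df, ".internal.selfref") <- NULL

# Assumes '.internal.selfref' was the only difference between the two.
testthat::expect_equal(test_df, cleaned_df)
```

This only hides the symptom in one test; it does not change what the conversion itself keeps or drops.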

krlmlr commented 6 months ago

Thanks. The discussion in #769 is unrelated; the original issue is a genuine bug. Running revdepchecks and releasing to CRAN.