tidyverse / tibble

A modern re-imagining of the data frame
https://tibble.tidyverse.org/

Possible edge case bug while identifying whether elements are in column #1584

Open nicohlara opened 4 months ago

nicohlara commented 4 months ago

When checking whether a particular value is in a column, tibbles behave differently from base data frames (and from themselves) depending on how the column is referenced. Referencing the column by string in brackets, data_frame[, 'column_name'], gives a different result than referencing it with $, data_frame$column_name.

library(tibble)
tbl <- tibble(a = seq(1, 5),
              b = letters[1:5])

3 %in% tbl$a
#> [1] TRUE
3 %in% tbl[, 'a']
#> [1] FALSE
3 %in% data.frame(tbl)[, 'a']
#> [1] TRUE


I would expect all three values to be `TRUE`, as the calls should all be equivalent.

krlmlr commented 4 months ago

Thanks. This is intended: the tibble default is drop = FALSE, so tbl[, "a"] returns a one-column tibble rather than a vector. See also https://tibble.tidyverse.org/articles/invariants.html.

library(tibble)
tbl <- tibble(a = seq(1, 5), b = letters[1:5])

tbl$a
#> [1] 1 2 3 4 5
tbl["a"]
#> # A tibble: 5 × 1
#>       a
#>   <int>
#> 1     1
#> 2     2
#> 3     3
#> 4     4
#> 5     5
tbl[, "a"]
#> # A tibble: 5 × 1
#>       a
#>   <int>
#> 1     1
#> 2     2
#> 3     3
#> 4     4
#> 5     5
tbl[["a"]]
#> [1] 1 2 3 4 5

df <- as.data.frame(tbl)
df[, "a"]
#> [1] 1 2 3 4 5
df[, "a", drop = FALSE]
#>   a
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5

Created on 2024-06-11 with reprex v2.1.0
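
For anyone who needs the membership check to work on a tibble, here is a minimal sketch of the vector-returning alternatives, reusing the `tbl` from the examples above. The variable names (`in_dollar`, `in_double`, `in_drop`, `in_single`) are only illustrative. The comment about why the last check fails is my reading of how `%in%` treats a list, not something stated in this thread.

```r
library(tibble)

tbl <- tibble(a = seq(1, 5), b = letters[1:5])

# Each of these extracts column `a` as a plain vector, so %in% compares
# the value against the individual elements of the column.
in_dollar <- 3 %in% tbl$a
in_double <- 3 %in% tbl[["a"]]
in_drop   <- 3 %in% tbl[, "a", drop = TRUE]

# tbl[, "a"] keeps the default drop = FALSE and stays a one-column tibble.
# A tibble is a list of columns, so %in% here compares 3 against the
# column as a whole rather than against its elements.
in_single <- 3 %in% tbl[, "a"]

c(in_dollar, in_double, in_drop, in_single)
```

The first three checks are TRUE and the last is FALSE, matching the behaviour reported in the issue. In code that must handle both tibbles and base data frames, `[[` is probably the safest choice, since it returns the column vector in both cases.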