tidyverse / forcats

šŸˆšŸˆšŸˆšŸˆ: tools for working with categorical variables (factors)
https://forcats.tidyverse.org/

fct_collapse input errors can be bypassed by fixing only part of the inputs #273

Closed by dchiu911 3 years ago

dchiu911 commented 4 years ago

In fct_collapse, if we supply named numeric vectors instead of named character vectors, we get an error from fct_recode listing the positions that are not named strings. However, converting only some of the inputs with as.character() and leaving the others as numeric vectors is enough to make the error go away. In that case I would have expected the error to report "Problems at positions: 1, 2, 3, 4, 5, 11, 12, 13, 14, 15". I'm using the development version of forcats.

library(forcats)
yrs <- as.character(rep(2001:2015, 10))
fct_collapse(
  yrs,
  `2001-2005` = 2001:2005,
  `2006-2010` = 2006:2010,
  `2011-2015` = 2011:2015
)
#> Error: Each input to fct_recode must be a single named string. Problems at positions: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

f1 <- fct_collapse(
  yrs,
  `2001-2005` = 2001:2005,
  `2006-2010` = as.character(2006:2010),
  `2011-2015` = 2011:2015
)
table(f1)
#> f1
#> 2001-2005 2006-2010 2011-2015 
#>        50        50        50

f2 <- fct_collapse(
  yrs,
  `2001-2005` = as.character(2001:2005),
  `2006-2010` = as.character(2006:2010),
  `2011-2015` = as.character(2011:2015)
)
table(f2)
#> f2
#> 2001-2005 2006-2010 2011-2015 
#>        50        50        50

Created on 2020-09-14 by the reprex package (v0.3.0)

hadley commented 3 years ago

It's because (effectively) all the inputs in ... are collapsed into a single vector, so a single character vector is enough to coerce them all. While not particularly desirable, I don't think it's worth the effort to fix this.
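
The coercion described above can be reproduced in base R without forcats. This is only a sketch of the general mechanism (flattening a list of mixed integer and character vectors into one vector), not a copy of the forcats internals; the names `mixed`, `all_int`, `groups`, and `groups_chr` are made up for the illustration.

```r
# Flattening a list that mixes integer and character vectors yields a
# character vector: one character input is enough to coerce everything.
mixed <- unlist(
  list(2001:2005, as.character(2006:2010), 2011:2015),
  use.names = FALSE
)
class(mixed)
#> [1] "character"

# With integer inputs only, nothing triggers the coercion, so the
# flattened values stay integer (this is the case that errors in the issue).
all_int <- unlist(list(2001:2005, 2006:2010, 2011:2015), use.names = FALSE)
class(all_int)
#> [1] "integer"

# Hypothetical helper step: convert every group explicitly so the result
# does not depend on coercion by a neighbouring input.
groups <- list(
  `2001-2005` = 2001:2005,
  `2006-2010` = 2006:2010,
  `2011-2015` = 2011:2015
)
groups_chr <- lapply(groups, as.character)
```

Each element of `groups_chr` is then a named character vector, which is the input form used in the `f2` call above.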