tidymodels / hardhat

Construct Modeling Packages
https://hardhat.tidymodels.org
Other
101 stars 16 forks source link

Reconsider lax novel-level dropping / missing-level recovery with ordered factors #132

Closed DavisVaughan closed 4 years ago

DavisVaughan commented 4 years ago
library(vctrs)

# we used to rely on missing level recovery from vctrs
# but this is now an error
vec_cast(ordered("x"), ordered(c("x", "y")))
#> Error: Can't combine <ordered<5a425>> and <ordered<e1304>>.

# we also relied on vctrs to reorder factors in the case where you had them all, but they
# were in the wrong order. this isnt allowed either.
x <- ordered(c("x", "y"), levels = c("x", "y"))
y <- ordered(c("x", "y"), levels = c("y", "x"))
vec_cast(x, y)
#> Error: Can't combine <ordered<e1304>> and <ordered<d2f00>>.

# we also currently drop novel levels in ordered factors with a warning
# when `allow_novel_levels = FALSE`
hardhat:::check_novel_levels.factor(ordered(c("x", "y")), ordered("y"), "foo")
#> Warning: Novel levels found in column 'foo': 'x'. The levels have been removed,
#> and values have been coerced to 'NA'.
#> [1] <NA> y   
#> Levels: y

It is worth rethinking this a little bit. Perhaps we just need some custom code in hardhat that is going to continue to allow us to keep the current behavior with ordered factors, but it might also be good to be stricter with ordered factors

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.