tidymodels / hardhat

Construct Modeling Packages
https://hardhat.tidymodels.org

Novel level handling #85

Closed DavisVaughan closed 5 years ago

DavisVaughan commented 5 years ago

Currently, scream() hands off to vec_cast() and doesn't do much else. This worked fine in the past, but the development version of vctrs now throws an error on a lossy cast rather than a warning. As a result, "novel levels" in the test dataset now produce an error, where previously they produced a warning and were coerced to NA.

I think for the modeling world, we still want novel levels to be coerced to NA, so we need a function that handles them before calling vec_cast().
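As a rough illustration of what such a pre-processing step could look like, here is a minimal base R sketch. The name `na_novel_levels()` and its arguments are hypothetical and not part of hardhat or vctrs; the sketch only shows the idea of warning about unseen levels and mapping them to NA before any strict cast happens. It assumes an unordered factor (an ordered factor would lose its ordering here).

```r
# Hypothetical helper, not part of hardhat or vctrs: values whose level was
# not seen at training time are replaced with NA, with a warning that names
# the novel levels.
na_novel_levels <- function(x, known_levels) {
  stopifnot(is.factor(x))

  # Levels that are actually used by the data but unknown to the training set
  novel <- setdiff(unique(as.character(x[!is.na(x)])), known_levels)

  if (length(novel) > 0) {
    warning(
      "Novel levels coerced to NA: ", paste(novel, collapse = ", "),
      call. = FALSE
    )
  }

  # factor() maps any value outside `levels` to NA
  factor(as.character(x), levels = known_levels)
}

train <- factor(letters[1:4])        # levels: a, b, c, d
test <- factor(letters[c(1:3, 5)])   # levels: a, b, c, e ("e" is novel)

# Warning suppressed here only to keep the example output quiet
fixed <- suppressWarnings(na_novel_levels(test, levels(train)))
fixed
```

After a step like this, the factor only carries the training levels, so a subsequent strict cast should have nothing lossy left to complain about.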

DavisVaughan commented 5 years ago

vctrs still has the nice property of silently casting away novel levels when the actual data does not use them. For example:

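library(hardhat)

# Training data: factor `f` has levels a, b, c, d
dat <- data.frame(
  y = 1:4,
  f = factor(letters[1:4])
)

# New data: factor `f` has levels a, b, c, e ("e" is novel)
new <- data.frame(
  y = 1:4,
  f = factor(letters[c(1:3, 5)])
)

# Drop the only row that uses the novel level; "e" stays in levels(new$f)
new <- new[1:3, ]

x <- mold(y ~ f, dat, blueprint = default_formula_blueprint(indicators = FALSE))

# no error thrown
forge(new, x$blueprint)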