hongooi73 / glmnetUtils

Utilities for glmnet

Predict fails when character column has fewer levels in new data #12

Closed jwdink closed 7 years ago

jwdink commented 7 years ago
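library(glmnetUtils)
library(dplyr)  # provides tibble() and filter()

df <- tibble(y = rnorm(100),
             x = sample(c("level1", "level2", "level3"), size = 100, replace = TRUE))
fit <- glmnet(y ~ x, data = df)
# predict() fails here: newdata contains only two of the three values of x
predict(object = fit, newdata = filter(df, x != "level3"))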

The user can avoid this problem by using only factor columns, never character columns. But it might be worth issuing a warning when a character column is used? Otherwise this can be difficult to debug.
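For anyone landing here, a minimal sketch of the factor approach in base R only. The data frame and the names `df` and `newdf` are made up for illustration, and `model.matrix()` stands in for whatever the package does internally, which is an assumption on my part. The point is that a factor carries its levels along when rows are dropped, so the design matrix keeps the same columns.

```r
# Illustrative data (hypothetical): a response and a three-valued character column.
df <- data.frame(
  y = seq_len(99) / 10,
  x = rep(c("level1", "level2", "level3"), length.out = 99),
  stringsAsFactors = FALSE
)

# Convert the character column to a factor with its levels fixed up front.
df$x <- factor(df$x, levels = c("level1", "level2", "level3"))

# Dropping all rows for one level does not drop the level itself.
newdf <- df[df$x != "level3", ]
levels(newdf$x)

# So the design matrix for the subset has the same columns as for the full data.
full_cols   <- colnames(model.matrix(~ x, data = df))
subset_cols <- colnames(model.matrix(~ x, data = newdf))
identical(full_cols, subset_cols)
```

The unused level simply shows up as an all-zero column in the subset's design matrix, which is what a fitted model expects at prediction time.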

hongooi73 commented 7 years ago

Thanks for reporting this. The problem is that factor level info isn't retained in the model object. For now, as a workaround, you can set use.model.frame=TRUE in the call to glmnet.
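To illustrate what "factor level info isn't retained" means, here is a rough sketch using only base R. It is not the glmnetUtils code; it follows the pattern that `lm()` and `predict.lm()` use (`.getXlevels()` at fit time, `model.frame(..., xlev = )` at predict time), and the data and variable names are hypothetical.

```r
# Illustrative data (hypothetical) with a character predictor.
df <- data.frame(
  y = seq_len(99) / 10,
  x = rep(c("level1", "level2", "level3"), length.out = 99),
  stringsAsFactors = FALSE
)
newdf <- df[df$x != "level3", ]

# Without stored level info, a character column is converted to a factor
# from whatever values happen to be present, so the columns differ.
cols_fit <- colnames(model.matrix(~ x, data = df))
cols_new <- colnames(model.matrix(~ x, data = newdf))

# Recording the levels at fit time and passing them back at predict time
# restores the full set of columns for the new data.
mf     <- model.frame(y ~ x, data = df)
tt     <- attr(mf, "terms")
xlev   <- .getXlevels(tt, mf)
mf_new <- model.frame(tt, newdf, xlev = xlev)
cols_new_xlev <- colnames(model.matrix(tt, mf_new))
```

Here `cols_new` is missing the `xlevel3` column that `cols_fit` has, while `cols_new_xlev` matches `cols_fit` again.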

hongooi73 commented 7 years ago

4dd10ce97aab0a67c4f6920b72d34d1acfd4f435