I should add that changing the `var` argument of `add_predictions()` doesn't solve the problem, since it renames the tibble rather than the column; e.g. `var = "newname"` would produce the column `newname$.pred_class`.
The `type = "raw"` fix here is very helpful - thanks @kbodwin! I would second the suggestion of making it the default, so that it's predictable what is being returned. In particular, this seems to be a problem with `tidymodels`, as the same `predict` method called on an `lm` object or a `model_fit` object returns a vector in one case and a tibble in the other. Changing the default, or adding a test for object type, could be useful for smoothing the interaction of these two packages.
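The "test for object type" idea could look roughly like the sketch below. The helper name `add_predictions_flat()` is hypothetical and not part of modelr; the `mtcars` regression is only an illustrative assumption.

```r
library(parsnip)
library(modelr)

# Hypothetical helper (not part of modelr): when the model is a parsnip
# fit and no type was requested, fall back to type = "raw" so that the
# new column is a plain vector instead of a tibble.
add_predictions_flat <- function(data, model, var = "pred", type = NULL) {
  if (is.null(type) && inherits(model, "model_fit")) {
    type <- "raw"
  }
  add_predictions(data, model, var = var, type = type)
}

dat <- mtcars[, c("mpg", "wt")]

# The same linear model, fitted directly and through parsnip
fit_lm <- lm(mpg ~ wt, data = dat)
fit_tm <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt, data = dat)

out_lm <- add_predictions_flat(dat, fit_lm)
out_tm <- add_predictions_flat(dat, fit_tm)
```

With this wrapper both calls should yield an ordinary numeric `pred` column, while an explicit `type` is still passed through untouched.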
@kbodwin Thanks for the post! I agree something is wrong with `add_predictions()` when working with tidymodels. I used `unnest()` to fix the bug. See below:

    penguins %>%
      select(sex, bill_length_mm) %>%
      modelr::add_predictions(my_tm_glm) %>%
      unnest(cols = c(pred)) %>%
      head()
modelr is now superseded, which means that we'll only perform critical bug fixes needed to keep it on CRAN. Thanks for contributing this idea and my apologies that it took so long to inform you that this package is no longer under development.
Hi,
I wanted to open this issue before PR'ing, because maybe there's a good reason for the current setup that I'm not seeing.
Short version: I think `add_predictions()` should have default argument `type = "raw"`, instead of the current default `type = NULL`.
Here's why:
`parsnip` introduces the `predict.model_fit` method. If `type = NULL` is passed to `predict.model_fit`, it infers the type from the model object. This means that the output of `predict()` differs depending on whether you made your model with `tidymodels` or not:
Created on 2020-12-18 by the reprex package (v0.3.0)
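A minimal sketch of the difference described above, assuming a simple regression on `mtcars` (an illustrative choice, not the original reprex):

```r
library(parsnip)

# The same linear model, fitted directly and through parsnip
fit_lm <- lm(mpg ~ wt, data = mtcars)
fit_tm <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt, data = mtcars)

# type is left at its default (NULL) in both calls
pred_lm <- predict(fit_lm, mtcars)  # base R method: numeric vector
pred_tm <- predict(fit_tm, mtcars)  # parsnip method: tibble

class(pred_lm)
class(pred_tm)
names(pred_tm)
```

For a regression model the tibble column is `.pred`; for a classification model it would be `.pred_class`, which is the name that shows up elsewhere in this thread.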
I'm not a huge fan of `predict()` giving two different object types depending on input class, but I can accept there might be reasons for that.
Where it breaks is when `predict()` is being called under the hood for `add_predictions()`:
At best, the predictions column is weirdly named, since it's trying to coerce a tibble into a column. I've also run into applications where you get nested tibbles.
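A sketch of how this can surface, again assuming an illustrative `mtcars` regression rather than the original example:

```r
library(parsnip)
library(modelr)

dat <- mtcars[, c("mpg", "wt")]
fit_tm <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt, data = dat)

# add_predictions() calls predict() with the default type, so the parsnip
# method hands back a tibble, which is then stored whole as the new column.
out <- add_predictions(dat, fit_tm)

# Inspect the new column: it is expected to be a data frame, not a vector
str(out$pred)
```

The actual predictions then sit one level down, at `out$pred$.pred`.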
Of course, this can be circumvented with `type = "raw"`:
... but that just seems highly counterintuitive, since the whole point of `add_predictions()` is a shortcut to plop a predictions column in there. It took me a long time to track down the `type = "raw"` fix. I have to imagine that "raw" represents nearly every use case, and anyone with a corner case need would know to fiddle with `type`.
Anyways sorry this is so long, just wanted to be clear!
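For reference, the `type = "raw"` workaround mentioned above can be sketched as follows, with the same illustrative `mtcars` assumption:

```r
library(parsnip)
library(modelr)

dat <- mtcars[, c("mpg", "wt")]
fit_tm <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt, data = dat)

# Passing type = "raw" asks parsnip for the underlying engine's own
# predict() output, which for an lm engine is a numeric vector.
out_raw <- add_predictions(dat, fit_tm, type = "raw")

is.numeric(out_raw$pred)
```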