Closed davidkane9 closed 3 years ago
Thanks for reporting this. Yeah, even though in this case the newdata
argument isn't necessary at all I agree that there shouldn't be an error with an empty data frame (tibble or regular data frame).
Thanks for taking a look at this.
the newdata argument isn't necessary
Well, I certainly find it highly desirable for teaching. An intercept-only model is one of the first things I show students with stan_glm() and it would be very nice to have posterior_predict() return a matrix with one column when given an empty tibble, as opposed to the default of one column for every data point, which is an object which new students find very confusing.
Oh I see what you mean. Forget what I said about it being unnecessary. When there are additional RHS variables the number of observations to predict is specified by the number of rows in newdata
, so to be consistent with that I would think we'd allow tibbles with no columns but there would still have to be a positive number of rows. So tibble(.rows = 1)
, although uglier than tibble()
, would be more consistent with the behavior of newdata
in general because tibble()
gives you 0 rows.
And the more I think about this the more surprised I am that this hasn’t come up before because intercept only models are definitely a natural place to start teaching. Glad you noticed this!
So tibble(.rows = 1)
That makes sense.
hasn’t come up before
Few teachers are as crazy as I am, teaching stan_glm() in a beginners' statistics class . . . .
This turned out to be really easy to fix (I just opened PR #492). It was already set up to work properly but one of our checks validating the newdata argument was too strict, resulting in an error with 0 columns when that was unnecessary. If I remove that check when there are 0 columns then this works:
fit <- stan_glm(mpg ~ 1, data = mtcars)
posterior_predict(fit, newdata = tibble::tibble(.rows = 1))
posterior_epred(fit, newdata = tibble::tibble(.rows = 1))
Summary:
posterior_predict() (and related functions like posterior_epred()) should accept an empty tibble as the newdata argument when working with a model whose only right-hand side variable is the intercept.
Reproducible Steps:
Created on 2020-12-09 by the reprex package (v0.3.0)
RStanARM Version: 2.21.1
R Version: 4.0.3
Operating System: MacOS Catalina: 10.15.7