Closed jsta closed 6 years ago
It works without error if the "dpar" value is instead passed as "nlpar". See https://github.com/mjskay/tidybayes/blob/master/R/add_predicted_samples.R#L221
I think I see what is going on, but Matt will have a better idea. brms
allows the user to set a distribution parameter with the dpar
argument, either mu
or sigma
. tidybayes
uses names(model$formula$pforms)
to check if those parameters were included and passes them along to brms
.
However, in your model we get the following:
> names(fit1$formula$pforms)
[1] "b1" "b2"
tidybayes
tries to pass those to the brms
parameter dpar
where it understandably fails.
Yup, that sounds like the issue. Looking into this more, I think a more correct way to get a list of dpars and nlpars from a brmsfit is going to be to use brms::parse_bf
.
Another question is if the auxpars
name was premature abstraction on my part, and just keeping it as dpars
and nlpars
in fitted_samples
makes the most sense. Unless I've missed something (very possible) the only other package I am aware of that this type of thing applies to at the moment is rethinking
(via tidybayes.rethinking
, which anyway needs to be updated). So the ideal fix might be to come up with a way to distinguish dpars
and nlpars
in brms
and then split auxpars
into dpars
and nlpars
in fitted_samples
.
Are there plans to merge tidybayes.rethinking
into the base package?
@mjskay Based on the brms
source code that throws the error extract_draws
is one option to get a list of valid dpars
for a brmsfit
object.
For example, using the same fit1
from the original post:
> names(brms:::extract_draws(fit1)$dpars)
[1] "mu" "sigma"
corresponds to the valid distributional parameters listed in the error message. Likewise, here is another example:
library(brms)
set.seed(1234)
dat <- data.frame(
count = rpois(236, lambda = 20),
visit = rep(1:4, each = 59),
patient = factor(rep(1:59, 4)),
Age = rnorm(236),
Trt = factor(sample(0:1, 236, TRUE)),
AgeSD = abs(rnorm(236, 1)),
Exp = sample(1:5, 236, TRUE),
volume = rnorm(236)
)
fit2 <- brm(
bf(count ~ Age + (1|visit), mu2 ~ Age), dat,
family = mixture(gaussian, exponential),
prior = c(prior(normal(0, 10), Intercept, dpar = mu1),
prior(normal(0, 1), Intercept, dpar = mu2),
prior(normal(0, 1), dpar = mu2)),
warmup = 150, iter = 200, chains = 2
)
fit2$formula$pforms
fitted(fit2, dpar = 'm2')
and the error:
Error: Invalid argument 'dpar'. Valid distributional parameters are: 'mu1', 'sigma1', 'mu2', 'theta'
and the corresponding extract_draws
lookup:
> names(brms:::extract_draws(fit2)$dpars)
[1] "mu1" "sigma1" "mu2" "theta"
The problem is that extract_draws
is a relatively expensive call for simply identifying the names of the parameters. The problem, as seen in the source code, is that parsing out the dpars
appears to be tightly intertwined in the extract_draws
function used for the fitted
method.
Emphasis on relatively expensive:
> start_time <- Sys.time()
> names(brms:::extract_draws(fit2)$dpars)
[1] "mu1" "sigma1" "mu2" "theta"
> end_time <- Sys.time()
> end_time - start_time
Time difference of 0.117537 secs
I think this might sort of work (and uses an exported function from brms):
> names(brms::parse_bf(fit1$formula)$dpars)
[1] "mu"
And:
> names(brms::parse_bf(fit2$formula)$dpars)
[1] "mu2" "mu1"
Because these work:
fitted(fit2, dpar = "mu1")
fitted(fit2, dpar = "mu2")
But these (which are in the results of the extract_draws
call) do not:
> fitted(fit2, dpar = "sigma1")
Error: Distributional parameter 'sigma1' was not predicted.
> fitted(fit2, dpar = "theta")
Error: Distributional parameter 'theta' was not predicted.
Only problem is that this does not work:
> fitted(fit1, dpar = "mu")
Error: Distributional parameter 'mu' was not predicted.
But I'm pretty sure we can just treat mu
as implying dpar = NULL
, because this does work:
fitted(fit1, dpar = NULL)
When I have time (possibly Thursday) I am going to try to write this up into a coherent issue to open on the brms
repo. I think that should help us clear things up.
Good find! The discrepancy between the "valid distributional parameters" given by the error message and what values work with the fitted
method is confusing.
I added a short term (hackish) fix for this based on brms::parse_bf
, and opened an issue on brms (paul-buerkner/brms#318) so that we can come up with a long term solution to do this correctly.
I'll leave this issue open until there is a long term fix. In the mean time, let me know if you come across anything the short term fix does not work for.
Looks like brms::parse_bf
is the canonical approach (but without the current hack of removing mu
, which is no longer necessary in recent versions of brms).
However, probably extracting these automatically by default (having default auxpars = TRUE
) is the wrong approach. Instead, I think it makes sense to have the default be not to include the auxpars (auxpars = FALSE
), and allow the user to either attempt to extract them all automatically (by setting auxpars = TRUE
) or to specify them manually (by passing names or a named list to auxpars
).
I think that sounds reasonable. Just one more thought. I believe the name auxpar
orginated from an earlier version of brms and has later been renamed to dpars
by me. The question is whether it might make sense to also rename it in tidybayes?
Good question. That probably makes sense --- the only other package I am aware of that might need something like this currently is rethinking
, whose link
function can provide multiple link-level predictions depending on the model. However, the link
function in that package doesn't give a name to these, so I see no reason not to defer to brms
to choose terminology :).
So TODO:
mu
hackauxpars
to dpar
. Possibly leave auxpars as an alias that generates a warning (I never know where to fall on backwards compatibility).dpar = FALSE
dpar
(to make it more discoverable)Sidebar: I'm changing this to dpar
(no s) instead of dpars
for consistency with brms.
The following data is from
vignette("brms_nonlinear")
. My guess is that an error is thrown because the model is nonlinear but I am not completely sure.