push posterior summary through distribution function

Raoul-Kima commented 5 years ago

I didn't find a nice way to to the following with tidybayes. The purpose of this post is to ask whether i might have overlooked it. If it turns out this is not the case I will follow up with an example and pretty plots and motivation why such a feature would be useful.

Tidybayes allows for the following things:

Get posterior distributions of the linear predictor(s), (function linpred_draws) and posterior of those values pushed through the link function (function fitted_draws). Further it allows getting the posterior for the values arising when pushing these link-values through the assumed distribution (function predicted_draws).

This also allows for easily getting summaries, such as e.g. the mean prediction (apply function fitted_draws, then summarise values). What it (as far as I can see) does not allow is to push these summarised values (e.g. the mean prediction) through the assumed distribution to get e.g. the mean prediction of the arising data distribution.

Did I oversee the function to do this or does it not exist?

mjskay commented 5 years ago

Currently this functionality depends on the implementation of fitted_draws for different model types.

fitted_draws for brms models, for example, uses brms::fitted.brmsfit, which will give you means for the data distribution when you set scale = "response".

fitted_draws for rstanarm models, on the other hand, uses rstanarm::posterior_linpred, which I believe just gives you the linear predictor pushed through the link function when you set scale = "response".

Because getting the mean right depends on the model type, this isn't something I've planned to do automatically unless the underlying model supports it for you. Thus currently it works for brms and not for rstanarm.

Raoul-Kima commented 5 years ago

If I understand correctly what you're writing, then we're not talking about the same things. I think I understand that you're saying that for brms-models what fitted_draws(scale="response") is doing is basically the same as predicting with predicted_draws each combination of predictor values many times and then calculating the mean for each combination.

What I was thinking of would be achieved by the following steps: First apply fitted_draws(scale="linear"), then summarizing (e.g. mean) all draws for each parameter combination. At this point the result is not a distribution, but a single value for each parameter combination. These values would then be pushed through the link- and distribution functions, giving again a distribution. This distribution represents the distribution of data expected under the (e.g.) mean estimate.

One purpose of this would be graphical model checking in situations where plotting the full predictive distribution would make the plots too complicated / unreadable or require too much computational power.

mjskay commented 5 years ago

If I understand correctly what you're writing, then we're not talking about the same things. I think I understand that you're saying that for brms-models what fitted_draws(scale="response") is doing is basically the same as predicting with predicted_draws each combination of predictor values many times and then calculating the mean for each combination.

Yes, although it does this using analytical formulas for means of distributions instead of using Monte Carlo simulation.

What I was thinking of would be achieved by the following steps: First apply fitted_draws(scale="linear"), then summarizing (e.g. mean) all draws for each parameter combination. At this point the result is not a distribution, but a single value for each parameter combination. These values would then be pushed through the link- and distribution functions, giving again a distribution. This distribution represents the distribution of data expected under the (e.g.) mean estimate.

Ah I see what you mean now. You can do this relatively straightforwardly using a combination of the dpar argument to add_fitted_draws (to get posteriors for the parameters for the predictive distribution, such as mu and sigma---depending on the model type) plus the stat_dist_... family of stats to visualize the result. It would be a slight modification of the example from the tidy-brms vignette on creating Kruschke-style charts. That example shows many such predictive distributions:

ABC %>%
  data_grid(condition) %>%
  add_fitted_draws(m, dpar = c("mu", "sigma")) %>%
  sample_draws(30) %>%
  ggplot(aes(y = condition)) +
  stat_dist_slabh(aes(dist = "norm", arg1 = mu, arg2 = sigma), 
    slab_color = "gray65", alpha = 1/10, fill = NA
  ) +
  geom_point(aes(x = response), data = ABC, shape = 21, fill = "#9ECAE1", size = 2)

But you could change sample_draws(30) to summarise_all(mean) and remove alpha = 1/10 so that it shows just the one distribution from the mean prediction (though I'm not sure you would want to do that and ignore the uncertainty in the parameters. Personally I think the previous chart is more useful.)

ABC %>%
  data_grid(condition) %>%
  add_fitted_draws(m, dpar = c("mu", "sigma")) %>%
  summarise_all(mean) %>%
  ggplot(aes(y = condition)) +
  stat_dist_slabh(aes(dist = "norm", arg1 = mu, arg2 = sigma), 
    slab_color = "gray65", fill = NA
  ) +
  geom_point(aes(x = response), data = ABC, shape = 21, fill = "#9ECAE1", size = 2)

Personally I think you wouldn't want to do this though. Either the previous chart with multiple distributions or one that uses add_predicted_draws is better as they incorporate the full uncertainty in the parameters. E.g. this:

ABC %>%
  data_grid(condition) %>%
  add_predicted_draws(m) %>%
  ggplot(aes(y = condition)) +
  stat_slabh(aes(x = .prediction), 
    slab_color = "gray65", fill = NA
  ) +
  geom_point(aes(x = response), data = ABC, shape = 21, fill = "#9ECAE1", size = 2)

mjskay commented 5 years ago

Also I should note that the above does not take the mean of the linear predictor and then push it through the link, it pushes the posterior of the linear predictor through the link and then takes the mean. This is preferable as the transformation of a mean is not in general equal to the mean of the transformed posterior, which is what you should want.

Raoul-Kima commented 5 years ago

You can do this relatively straightforwardly using ...

That's a nice way to do it. Although it still requires some manual recoding of model parts (the distribution). I was thinking about a more automated version, but since I readin you answer above that the predictions are not done by tidybayes itself but relying on brms/rstanarm/etc I now think my implementation idea is not feasible.

Personally I think you wouldn't want to do this though. Either the previous chart with multiple distributions or one that uses add_predicted_draws is better as they incorporate the full uncertainty in the parameters. E.g. this:

In general I agree, but in some situations having multiple distributions is not needed and makes the plot too complicated. Similarly, using add_predicted_draws and just throwing the samples of all draws together works well in many cases, but the distribution it creates is in principle not expected to be the same as the data distribution (as it includes parameter uncertainty), so sometimes it can be confusing.

mjskay commented 5 years ago

That's a nice way to do it. Although it still requires some manual recoding of model parts (the distribution). I was thinking about a more automated version, but since I readin you answer above that the predictions are not done by tidybayes itself but relying on brms/rstanarm/etc I now think my implementation idea is not feasible.

Yeah... I've gone back and forth about this kind of thing here and elsewhere (there are some posterior predictive check / residual analysis things I've wanted to implement that present similar issues). The two problems are (1) it more closely couples the design of tidybayes with brms, which I am not sure I want --- I'd rather keep it more generic; and (2) I probably don't have the bandwidth to do it currently anyway.

Anyway, glad this was able to help!

mjskay / tidybayes

push posterior summary through distribution function #207