mjskay / tidybayes

Bayesian analysis + tidy data + geoms (R package)
http://mjskay.github.io/tidybayes
GNU General Public License v3.0
718 stars 59 forks source link

LKJ with parse_dist() #191

Closed andrewmcaleavey closed 5 years ago

andrewmcaleavey commented 5 years ago

Hi, thanks for all your work on this great package! I am wondering if the new parse_dist() function is supposed to work with LKJ priors or not. I get an error when I run the following:

c(prior(lkj_corr_cholesky(1), class = L)) %>% 
parse_dist(prior) %>% 
ggplot(aes(y = class, dist = .dist, args = .args)) + 
stat_dist_halfeyeh() + 
labs(x = NULL)

Specifically, the warning is:

> Warning message:
> Computation failed in `stat_dist_slabinterval()`:
> object 'qlkj_corr_cholesky' of mode 'function' was not found

That's all. It works just fine with other distributions I've tried (Normal, Student t). Sorry if this isn't the right venue for this!

mjskay commented 5 years ago

Thanks! This is definitely the right venue.

The new stat_dist_... functions take a heuristic approach to finding distribution functions: basically they just take the distribution "name" and look for the functions dname(), qname(), etc to calculate intervals and densities. So the way that parse_dist() works is to translate common distribution names into names that have those functions defined: e.g. "Normal" gets translated into "norm" by parse_dist(), and then stat_dist_...() is able to find dnorm() and qnorm(), which it can use to generate the visualization.

There are two complications with an LKJ distribution that make it so that this is not currently supported:

  1. I don't know of a package providing all the necessary functions for the LKJ distribution. If you know of one, I'd be happy to figure out adding support.

  2. Even given these functions, because the LKJ distribution is a distribution of matrices (i.e. it is not univariate), I'm not sure exactly the best way to treat it... it doesn't really fit with the stat_dist_... family, which is designed for visualizing univariate distributions. One option might be to create dlkjcorr_marginal(), qlkjcorr_marginal(), etc that can give marginal distributions for a cell in the correlation matrix, or something like that. would have to think more on this to come up with something sensible (and I'm open to suggestions).

andrewmcaleavey commented 5 years ago

This is a very helpful explanation and is about what I would have thought. I was just kind of blindly hoping there was some secret recipe you had tucked away.

I don't really know any implementations of it at all. Everyone always just creates empirical distributions, no one defines the quantiles. rethinking::rlkjcorr() and rethinking::dlkjcorr() exist, but not qlkjcorr(). Not sure about anything else.

Not sure if this is worth creating a unique warning message for or not, but given how often LKJ is used by default, it might come up again? Eventually perhaps there will be a solution, say by offering to simulate 1000 draws and smooth them, but that seems suboptimal. That is probably what I'll end up doing for my own specific current task, though.

Thanks again!

mjskay commented 5 years ago

Turns out the marginal distribution for a single cell in an LKJ correlation matrix is pretty straightforward --- has a Beta distribution that depends on the eta parameter and the dimension of the correlation matrix. I added it on the lkjcorr_marginal branch here: https://github.com/mjskay/tidybayes/tree/lkjcorr_marginal

However I'm not quite sure how to integrate it into the parse_dist API yet, as it requires knowing the dimension of the correlation matrix, which brms::prior() does not give you. I like the idea of generating a warning here, and maybe coming up with a simple way for people to add that information in order to visualize that prior.

In the meantime I'd be curious if the code on that branch is helpful.

mjskay commented 5 years ago

Alright, I think I've come up with a reasonable solution. Basically, I added the functions for an lkjcorr_marginal distribution, which takes two parameters: K, the dimension of the correlation matrix, and eta, the same eta parameter as the LKJ distribution takes. lkjcorr_marginal is then the marginal distribution for the correlation in a single cell for a KxK correlation matrix drawn from an LKJ(eta) distribution.

Then to make it easier to use, I've added a marginalize_lkjcorr function which modifies any lkjcorr distributions in a spec returned by parse_dist() to set the size of the correlation matrix:

brms::prior(lkj(3)) %>%
  parse_dist(prior) %>%
  # this changes the lkjcorr distribution spec to an lkjcorr_marginal distribution
  # spec for a KxK correlation matrix where K = 2
  marginalize_lkjcorr(K = 2) %>%
  ggplot(aes(y = prior, dist = .dist, args = .args)) +
  stat_dist_halfeyeh() +
  xlim(-1, 1) +
  xlab("Marginal correlation for LKJ(3) prior on 2x2 correlation matrix")

image

Let me know if that works for your use case / if you have any problems.

andrewmcaleavey commented 5 years ago

This rocks and would totally work for my case. I can't explore fully until maybe tomorrow but it's 100% what I was hoping for.

I don't think there's a problem with having a second function required, though it would be nice if it was just one. Probably not worth the confusion that hiding that step would cause, as long as the documentation is clear.

mjskay commented 5 years ago

Great!

Yeah, I considered doing this automatically, but I am worried that would lead to mistakes --- since the marginal distribution depends on the correlation matrix size and I can't grab that automatically, any default I choose for K would be wrong for some people. For the new solution, if a user leaves marginalize_lkjcorr() out slab_dist_... raises a warning that points them to the documentation of marginalize_lkjcorr(), which has an example like above showing how to use it --- hopefully that will help people find the solution.

andrewmcaleavey commented 5 years ago

Finally got around to testing this. It rocks! Very helpful. I'm not sure that there is a simpler way to visualize the LKJ distribution and the effect of the correlation matrix size on that distribution - and I've looked around a bit now. I'm smarter and happier than I was before, thanks again.

One option that might be worth implementing - at some point, if it is indeed possible - would be to hide the K parameter in parse_dist() as a default NULL parameter, but then call on it when the LKJ prior is used. If it is still NULL at execution, an error message would be enough to point to the mistake. That might not be confusing for people while still maintaining the safeguard. I see merit in requiring the second function too, especially because it already works.

I'm not sure whether this would work with how parse_dist() is implemented, nor how nicely this would play in more complex implementations than what I'm trying (e.g., visualizing many types of priors simultaneously), so not worth re-opening the issue for my sake :). Just thought I'd mention.

Many thanks!

mjskay commented 5 years ago

Great, glad it works!

I will think about merging with parse_dist, though I am leaning against it --- the reason being, if any other distributions come up that have issues like this, I don't want to support them all within one function as the interface will get convoluted.

The other reason being something else I've just realized, which is that the current implementation of marginalize_lkjcorr doesn't work if you have a single table that has multiple different lkjcorr dists in it where you want to assign different values of K to different ones. However, it should be easy to extend marginalize_lkjcorr to support that use case (I'll just add a predicate to select the desired rows to modify, see #192), but it would be harder to extend parse_dist to do so while keeping the interface simple.

Thanks again for raising this issue!