Closed cboettig closed 4 years ago
I'm looking into calculating KL divergence to quantify prior-to-posterior learning. For priors we have theoretical pdfs, but for posteriors we may only have MCMC samples. The available functions in R either calculate the divergence between two pdfs directly from the KL definition, or estimate it from two sets of random samples/realizations using search algorithms. Here's a summary:
- `FNN::KL.divergence` estimates from two sets of random samples/realizations, using a choice of nearest-neighbor search algorithms.
- `LaplacesDemon::KLD` calculates based on pdfs.
- `philentropy::KL` calculates based on pdfs or counts (un-normalized pdfs).
- `spatialEco::kl.divergence` calculates based on pdfs.

Unless we want to estimate a pdf for the MCMC samples ourselves, it seems we can only use `FNN::KL.divergence`. For the priors, we may have to generate random samples from the theoretical distributions.
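To make the workflow concrete, here is a minimal sketch of that approach (my own toy example, not the project's script): draw samples from the theoretical prior, and feed them together with the posterior samples to `FNN::KL.divergence`, which uses a k-nearest-neighbor estimator. The normal "posterior" below is just a stand-in for real MCMC draws.

```r
library(FNN)

set.seed(1)
prior     <- rnorm(5000, mean = 0, sd = 2)    # draws from the theoretical prior
posterior <- rnorm(5000, mean = 1, sd = 0.5)  # placeholder for MCMC samples

## KL.divergence(X, Y, k) estimates D(P || Q) from samples X ~ P and Y ~ Q;
## it returns one estimate for each neighbor order 1..k.
kl <- KL.divergence(posterior, prior, k = 5)

## A moderate k (here the 5th-nearest-neighbor estimate) reduces variance.
kl[5]
```

For these two normals the true divergence is about 1.04 nats, so the estimate should land near that; with real MCMC output you would substitute the posterior draws for the `rnorm` placeholder.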
@cboettig
:+1: KL divergence seems like a reasonable way to quantify this. Like you say, given that we just have MCMC samples for the posterior, the FNN method seems to be the way to go. @henryrscharf Does that sound reasonable to you? This seems like a fairly common thing to quantify; are you aware of any prior art for doing it conveniently?
Affirmative, this is a reasonable thing to do to quantify how much learning one gets about particular parameters. I think the FNN method sounds fine, and a reasonably large sample from the prior distribution should provide a good estimate of the KL divergence. Alternatively, we could use the empirical distribution of the MCMC samples from the posterior (I'm guessing the FNN function is doing this for both samples). Either of these approaches probably yields more or less identical KL values, so do whatever is most convenient, I say.
One thing you might do is check that the computed KL values don't change when you provide different realizations from the prior or posterior distributions; that would tell you the sample sizes are large enough to get good estimates of the KL divergence. If that doesn't make sense, I can try to clarify further.
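The stability check suggested above might look something like this sketch (again a toy example with assumed distributions, not the project's code): recompute the FNN estimate against several independent prior realizations and confirm the values agree.

```r
library(FNN)

## kth-nearest-neighbor KL estimate from two sample sets
kl_hat <- function(post, prior, k = 5) KL.divergence(post, prior, k = k)[k]

set.seed(42)
posterior <- rnorm(5000, mean = 1, sd = 0.5)  # placeholder for MCMC samples

## Re-estimate with 10 fresh draws from the theoretical prior.
reps <- replicate(10, kl_hat(posterior, rnorm(5000, mean = 0, sd = 2)))

## If the sample sizes are adequate, the spread should be small
## relative to the mean estimate.
c(mean = mean(reps), sd = sd(reps))
```

The same check can be repeated with different posterior chains, or with larger sample sizes, until the estimates stabilize.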
Great work!
Henry
@henryrscharf I wrote a new script to calculate the KL divergence from the simulated priors and your posterior samples. Let me know if it makes sense. code/scharf_nimble/kl_divergence.R
@jrreimer I've calculated the KL divergence, as shown in the new Table 2 in Overleaf. The results seem to make sense: the KL divergence (prior-posterior difference) increases as more samples are used.
@KaiZhuPhD Excellent. Thanks! Can you write us a few sentences in the relevant sections (Methods & Results) about what you did? And maybe toss a quick point form note to Lizzie in the Discussion about what the results mean?
@jrreimer OK, done. See Overleaf. Happy New Year!
@KaiZhuPhD