cboettig / nimbios-transients

Example data that may or may not contain transients
MIT License

Quantify learning from the data (KL divergence prior -> posterior) #5

Closed: cboettig closed this issue 4 years ago

cboettig commented 5 years ago

@KaiZhuPhD

KaiZhuPhD commented 5 years ago

I'm looking into calculating KL divergence to quantify prior-to-posterior learning. For priors we have theoretical pdfs, but for posteriors we may only have MCMC samples. The available R functions either compute the divergence between two pdfs directly from the KL definition or between two sets of random samples/realizations via search algorithms. Here's a summary:

Unless we want to estimate the pdf from the MCMC samples ourselves, it seems we can only use FNN::KL.divergence. For the priors, we may have to generate random draws from the theoretical distributions.
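For concreteness, here's a minimal sketch of that approach. The Normal prior and the stand-in "posterior" draws below are placeholders, not anything from this project:

```r
## Minimal sketch, not project code: estimate KL(posterior || prior) from
## samples alone, using FNN's nearest-neighbor estimator.
library(FNN)

set.seed(1)
n <- 1e4

## Placeholder prior: Normal(0, 5); in practice, draw from the model's
## theoretical prior for the parameter of interest.
prior_draws <- rnorm(n, mean = 0, sd = 5)

## Placeholder for MCMC posterior samples; in practice, take these from
## the fitted chain.
posterior_draws <- rnorm(n, mean = 1, sd = 0.5)

## KL.divergence(X, Y, k) estimates KL(P_X || P_Y) and returns one
## estimate per neighbor order 1..k; averaging a few stabilizes it.
kl_hat <- KL.divergence(posterior_draws, prior_draws, k = 10)
mean(kl_hat)
```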

KaiZhuPhD commented 5 years ago

@cboettig

cboettig commented 5 years ago

:+1: KL divergence seems like a reasonable way to quantify this. Like you say, given that we just have MCMC samples for the posterior, the FNN method seems to be the way to go. @henryrscharf Does that sound reasonable to you? This seems like a fairly common thing to quantify; are you aware of any prior art for doing it conveniently?

henryrscharf commented 5 years ago

Affirmative, this is a reasonable way to quantify how much learning one gets about particular parameters. I think the FNN method sounds fine, and a reasonably large sample from the prior distribution should give a good estimate of the KL divergence. Alternatively, we could use the empirical distribution formed from the MCMC posterior samples (I'm guessing the FNN function does this for both samples anyway). Either approach probably yields more or less identical KL values, so do whatever is most convenient, I say.

One thing you might do is check that the computed KL values don't change when you provide different realizations from the prior or posterior distributions; that would tell you the sample sizes are large enough to get good estimates of the KL divergence. If that doesn't make sense, I can try to clarify further.
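As an illustration of that check (again with placeholder distributions rather than the project's actual priors and posteriors), one could recompute the estimate over several fresh prior realizations and look at the spread:

```r
## Sketch of the stability check: repeat the estimate with fresh prior
## draws; a small spread suggests the sample sizes are adequate.
library(FNN)

set.seed(2)
posterior_draws <- rnorm(1e4, mean = 1, sd = 0.5)  # stand-in for MCMC output

kl_reps <- replicate(20, {
  prior_draws <- rnorm(1e4, mean = 0, sd = 5)      # a fresh prior realization
  mean(KL.divergence(posterior_draws, prior_draws, k = 10))
})

## The standard deviation should be small relative to the mean.
c(mean = mean(kl_reps), sd = sd(kl_reps))
```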

Great work!

Henry


KaiZhuPhD commented 4 years ago

@henryrscharf I wrote a new script to calculate the KL divergence from the simulated priors and your posterior samples. Let me know if it makes sense: code/scharf_nimble/kl_divergence.R
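For readers without the repository handy, here is a hypothetical sketch of how such a per-parameter calculation might be structured. The parameter names, priors, and placeholder posterior matrix are illustrative only and are not the contents of code/scharf_nimble/kl_divergence.R:

```r
## Hypothetical sketch only; not the actual kl_divergence.R script.
library(FNN)

set.seed(3)

## Placeholder posterior samples (one column per parameter); in practice
## these come from the fitted MCMC output.
posterior <- cbind(r = rlnorm(1e4, 0.2, 0.3),
                   K = runif(1e4, 4, 6))

## Placeholder theoretical priors, one sampler per parameter.
prior_sampler <- list(
  r = function(n) rlnorm(n, 0, 1),
  K = function(n) runif(n, 0, 10)
)

## KL(posterior || prior) for each parameter, estimated from samples alone.
kl_by_parameter <- sapply(names(prior_sampler), function(p) {
  prior_draws <- prior_sampler[[p]](nrow(posterior))
  mean(KL.divergence(posterior[, p], prior_draws, k = 10))
})

kl_by_parameter
```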

KaiZhuPhD commented 4 years ago

@jrreimer I've calculated the KL divergence, as shown in the new Table 2 in Overleaf. The results seem to make sense: the KL divergence (prior-posterior difference) increases as more samples are used.

jrreimer commented 4 years ago

@KaiZhuPhD Excellent, thanks! Can you write us a few sentences in the relevant sections (Methods & Results) about what you did? And maybe toss a quick point-form note to Lizzie in the Discussion about what the results mean?

KaiZhuPhD commented 4 years ago

@jrreimer OK, done. See Overleaf. Happy New Year!