joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

Calculating differential entropy/mutual information #198

Closed · andyfaff closed this issue 4 years ago

andyfaff commented 4 years ago

First of all, many thanks for creating a wonderful package, very very useful.

I have a feature request (or perhaps this already exists in some form and I just don't know about it): would it be possible to use dynesty to calculate the differential entropy or mutual information of a system? I'm interested in this because it quantifies how informative the data are about the parameters, and it can be used in experimental design to maximise the expected amount of information obtained from a measurement.

There is a paper, https://arxiv.org/pdf/1707.03543.pdf, that uses nested sampling to achieve this. I'm therefore wondering whether this kind of functionality could be built into dynesty (ideally running alongside a normal sampling run)?

joshspeagle commented 4 years ago

dynesty currently computes an estimate of the KL divergence from the prior to the posterior (see here), which is saved in the results dictionary as `information`. I don't think this is quite the same thing as the differential entropy or mutual information here, but is it at least a step in the right direction? I can also look into whether this type of thing can be computed with some function that post-processes the samples.
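For reference, pulling that estimate out of a finished run looks something like the sketch below (the toy Gaussian likelihood and uniform prior are just placeholders, not anything specific to your problem):

```python
import numpy as np
import dynesty

ndim = 3

def loglike(x):
    # toy isotropic Gaussian likelihood centered at zero
    return -0.5 * np.sum((x / 0.1) ** 2)

def prior_transform(u):
    # map the unit cube onto a uniform prior over [-1, 1]^ndim
    return 2.0 * u - 1.0

sampler = dynesty.NestedSampler(loglike, prior_transform, ndim)
sampler.run_nested()
res = sampler.results

# `information` holds the running KL-divergence estimate at each iteration;
# the final entry is the converged estimate (in nats).
print(res['information'][-1])
```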

andyfaff commented 4 years ago

Unfortunately I'm not an expert in the area. I do my best to try and understand what's going on, but it's a constant uphill struggle with my maths background.

andyfaff commented 4 years ago

The background to my question is provided by this paper.

joshspeagle commented 4 years ago

"The information gain ΔH is defined as the difference between the entropy H(Θ) of the prior PDF p(θ), representing the knowledge before the experiment, and the entropy H(Θ | y) of the posterior PDF p(θ | y), obtained after the measurement yielded a particular experimental outcome y ∈ Y"

Is this what you're hoping to compute? If I'm understanding it correctly, it should technically be possible to compute this by post-processing dynesty's output results.
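As a rough sketch of what that post-processing could look like (continuing from the toy run above; the KDE entropy estimator here is just one simple choice I'm assuming, not anything built into dynesty): since p(θ|y)/p(θ) = L(θ)/Z, the KL divergence is E_post[ln L(θ)] − ln Z, while ΔH requires an estimate of the posterior's differential entropy H(Θ|y) = −E_post[ln p(θ|y)].

```python
import numpy as np
from scipy.stats import gaussian_kde
from dynesty import utils as dyfunc

# `res` is the results object from the toy run above
# (uniform prior over [-1, 1]^ndim).

# Normalized posterior importance weights
wt = np.exp(res['logwt'] - res['logz'][-1])
wt /= wt.sum()

# (1) KL divergence from prior to posterior:
#     D_KL(p(θ|y) || p(θ)) = E_post[ln L(θ)] - ln Z,
#     which reproduces the estimate dynesty stores as `information`.
kl = np.sum(wt * res['logl']) - res['logz'][-1]

# (2) Entropy-based information gain ΔH = H(prior) - H(posterior),
#     using a plug-in Gaussian-KDE estimate of the posterior density
#     built on equally weighted posterior samples (crude but simple).
samples = dyfunc.resample_equal(res['samples'], wt)
kde = gaussian_kde(samples.T)
h_post = -np.mean(kde.logpdf(samples.T))   # -E_post[ln p(θ|y)]
h_prior = samples.shape[1] * np.log(2.0)   # entropy of Uniform([-1, 1]^ndim)
delta_h = h_prior - h_post

print(f"KL(posterior || prior): {kl:.3f} nats")
print(f"ΔH = H(prior) - H(posterior): {delta_h:.3f} nats")
```

Note that for a uniform prior the two quantities coincide (the cross-entropy term in the KL divergence equals the prior entropy), so both numbers should roughly agree here; for non-uniform priors they generally differ.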