BAMresearch / bayem

Implementation and derivation of "Variational Bayesian inference for a nonlinear forward model." [Chappell et al. 2008] for arbitrary, user-defined model errors.
MIT License

Analytical example #25

Closed joergfunger closed 2 years ago

joergfunger commented 3 years ago

In order to test our methods (and maybe also to investigate the problem of the free energy being maximal in the first iteration), I think it might be advantageous to have an analytical test case. Instead of prescribing noise on the data and then identifying it, we would prescribe a linear model with a few data points and derive the analytical solution (the exact posterior). Using VB, we should then be able to exactly reproduce that result (even though it might not yield the exact mean), and we can even test the covariance.

TTitscher commented 3 years ago

Check the Chappell tutorial paper for an example. @aradermacher can you provide a link for that?

joergfunger commented 3 years ago

In sections 3.4 and 3.5 here, a derivation for the normal distribution can be found.

TTitscher commented 3 years ago

To start the process, I try to summarize: The example in section 3.4 has a data set of n=30 entries drawn from a given N(x, σ). Also, there is prior knowledge N(M, τ). The posterior mean (σ²M+τ²nx)/(nτ²+σ²) and variance (σ²τ²)/(nτ²+σ²) are derived analytically.
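That conjugate update can be sketched directly in numpy. The numbers below are illustrative placeholders, not the actual values from the book:

```python
import numpy as np

# Illustrative values (NOT the book's numbers)
n = 30        # number of data points
x_bar = 5.0   # sample mean of the data
sigma = 2.0   # known noise std
M = 4.0       # prior mean
tau = 1.0     # prior std

# Conjugate Gaussian update from section 3.4
post_mean = (sigma**2 * M + tau**2 * n * x_bar) / (n * tau**2 + sigma**2)
post_var = (sigma**2 * tau**2) / (n * tau**2 + sigma**2)

print(post_mean, post_var)
```

As a sanity check, the posterior mean lies between the prior mean and the sample mean, and the posterior variance is smaller than the prior variance.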

My idea would be to define a random dataset X with 29 entries according to N(x, σ) and pick the last entry such that the mean and variance are exactly met.

Then we define the model error `k(theta) = np.full(shape=30, fill_value=theta) - X` and run VB. The posterior should then match the analytic values.

Would that be correct?
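A minimal sketch of this setup. Note that a single free entry generally cannot match both the sample mean and the sample variance at once (two constraints, one unknown), so this sketch standardizes the whole sample instead, which is my assumption, not the thread's exact proposal:

```python
import numpy as np

rng = np.random.default_rng(0)
x_bar, sigma, n = 5.0, 2.0, 30  # illustrative target moments

# Draw 30 entries, then rescale/shift so the sample moments are exact
X = rng.normal(x_bar, sigma, n)
X = (X - X.mean()) / X.std() * sigma + x_bar

def k(theta):
    """Model error: constant model minus data, as proposed above."""
    return np.full(n, theta) - X

# Running VB on k(theta) should then reproduce the analytic posterior.
print(X.mean(), X.std())
```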

joergfunger commented 3 years ago

Why do you want to compute the last sample separately? Just sample all 30 from some distribution and then compute the sample mean and variance.

TTitscher commented 3 years ago

I thought that it would be nice to exactly match the values from the book. But, indeed, your idea would be much simpler -- and would almost match the book values.

TTitscher commented 3 years ago

I coded the example from 3.4 here and the parameter posterior matches the analytic solution. Section 3.4.1 in the book is, I think, about updating the noise. Who is going to look into that?

joergfunger commented 3 years ago

Was there anyone volunteering to look into the analytical example @ajafarihub or @ic-lima ?

ajafarihub commented 3 years ago

I think you @joergfunger mean an example where both the mean and the std are identified (section 3.4.1 of the book chapter you posted), since Thomas has already provided the simpler example (only the mean as unknown). To me it is not fully clear how to do that. I see, for example, that in the book chapter, eq. (3.4) gives us a joint distribution for our two parameters (mean and std), but the question is: how do we compare this with the VB results? I am not sure, but maybe we would need, after using eq. (3.4), to marginalize this joint distribution to obtain separate distributions for the mean and the std. Would that make sense?

Another way would be to forget about this formula (eq. 3.4) and compute the posteriors by means of a sampling method, which would directly give us individual distributions for the mean and the std.

TTitscher commented 3 years ago

On page 67 of the book above, they derive an analytic solution sigma² ~ InvGamma(a, b) with formulas for a and b. I'm sure that this could be related to the precision in VB, i.e. precision = 1 / sigma² ~ Gamma(shape, scale). That would be my starting point.
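That relation is the standard one: if sigma² ~ InvGamma(a, b), then the precision 1/sigma² ~ Gamma(shape=a, scale=1/b). A quick Monte Carlo check with illustrative a, b (not the book's values):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 3.0, 2.0  # illustrative shape and rate

# Sample the precision from Gamma(shape=a, scale=1/b) ...
precision = rng.gamma(shape=a, scale=1.0 / b, size=200_000)
# ... then sigma^2 = 1/precision is InvGamma(a, b) by construction
sigma2 = 1.0 / precision

# Moment check: E[sigma^2] = b / (a - 1) for a > 1
print(sigma2.mean(), b / (a - 1))
```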

ajafarihub commented 3 years ago

I am not quite sure. We know that the counterpart of sigma^2 (of the book) in the VB method is inv(Lambda). But the inverse gamma distribution mentioned in the book comes with an important assumption: "if μ is considered fixed". So IMO it is a conditional distribution and therefore not the counterpart of Lambda in VB, which should be a marginalized distribution, I think. This difference is also quite clear from another fact: the precision (of parameters) in VB has a normal distribution (not Gamma).

joergfunger commented 3 years ago

There is a subsequent chapter in the book where both mu and sigma are identified.

ajafarihub commented 3 years ago

I prepared this for the case where both the mean and sigma are unknown. It is based on the marginalization of eq. (3.4) in the document we referred to above.
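For comparison, the marginals can also be obtained by sampling, assuming the joint posterior in eq. (3.4) is the standard Normal-Inverse-Gamma conjugate form. The parameters `mu_n, kappa_n, a_n, b_n` below are hypothetical placeholders, not values computed from the book's data:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_n, kappa_n, a_n, b_n = 5.0, 30.0, 15.0, 20.0  # illustrative values

# sigma^2 | data ~ InvGamma(a_n, b_n), sampled via the Gamma precision
sigma2 = 1.0 / rng.gamma(shape=a_n, scale=1.0 / b_n, size=100_000)
# mu | sigma^2, data ~ N(mu_n, sigma^2 / kappa_n)
mu = rng.normal(mu_n, np.sqrt(sigma2 / kappa_n))

# mu and sigma2 are now draws from the *marginal* posteriors and can be
# compared (e.g. via their means and variances) with the VB result.
print(mu.mean(), sigma2.mean())
```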