Potential bug in our F formula

BAMresearch / bayem

Implementation and derivation of "Variational Bayesian inference for a nonlinear forward model." [Chappell et al. 2008] for arbitrary, user-defined model errors.

MIT License

2 stars 1 forks source link

Potential bug in our F formula #54

Closed TTitscher closed 2 years ago

TTitscher commented 3 years ago

In Chappell's paper, F contains the terms:

F = ... + (N/2 + c0 -1)stuff + ... - (N/2 + c -1)stuff

Our F formula contains the terms

F = ... + (N/2 + c0 -1)stuff + ... - (c -1)stuff

which, using the update equation c = N/2 + c0, would expand to

F = ... + (N/2 + c0 -1)stuff + ... - (N/2 + c0 -1)stuff

and cancel out. Which one is correct?! :worried:

TTitscher commented 3 years ago

Hint: Changing only this term in our F equation causes almost all tests to fail.

joergfunger commented 3 years ago

I think we had a long discussion on this difference and at that time agreed that the derivation in the paper was wrong.

TTitscher commented 3 years ago

For sure! But is ours correct...? In anyway, we should simplify the F formula and may have a look into the correctness of the ARD.

TTitscher commented 2 years ago

Apart from checking and double-checking the formulas, is there a way to test the free energy?

We already have an analytical test case, so we know that the KL divergence from true posterior to our approximated posterior is zero (right??). That means that evidence = free energy, right? @aradermacher ? :smile:

aradermacher commented 2 years ago

Yes, I would guess, that this should be correct as long as the KL divergence is really zero. Can we check that?

TTitscher commented 2 years ago

Update on the original question:

The terms only cancel out, if we actually perform the noise update. This may also relate to this comment of @joergfunger who suggests the double check or adapt the free energy formula for the case of a skipped noise update.

But IMO the case of no noise update quite nicely covered by having a super narrow gamma prior

g = vb.Gamma.FromMeanStd(1/fixed_sd**2, 1.e-8)