BAMresearch / bayem

Implementation and derivation of "Variational Bayesian inference for a nonlinear forward model" [Chappell et al. 2008] for arbitrary, user-defined model errors.
MIT License

Inconsistent assignments to "k" and "J" #19

Closed · ajafarihub closed this issue 3 years ago

ajafarihub commented 3 years ago

Here is a simple example to illustrate the issue:

Suppose a user has defined a model error completely independently of VB, as follows:

me(x) = 2*x² - 2  # here, the "data" is 2 and the forward model is g(x) = 2*x²
me.jacobian(x) = 4*x

If the user wants to use VB with this model error, then according to the paper the correct k and J must be:

k = data - g(x) = 2 - 2*x² = -me(x)
J = 4*x = me.jacobian(x)

This is not what we currently have in the implementation:

k, J = model_error(param0.mean), model_error.jacobian(param0.mean)

So I think either "model_error(param0.mean)" or "model_error.jacobian(param0.mean)" must receive a minus sign in front of it.
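
To make the sign convention concrete, here is a minimal, self-contained sketch (the ModelError class below is hypothetical, not the library's actual interface, and just reproduces the example above):

class ModelError:
    def __call__(self, x):
        return 2 * x**2 - 2      # me(x) = g(x) - data, with g(x) = 2*x² and data = 2

    def jacobian(self, x):
        return 4 * x             # d(me)/dx = dg/dx

me = ModelError()
x = 3.0

# The paper's quantities, expressed through the model error:
k = -me(x)          # k = data - g(x) = -me(x)
J = me.jacobian(x)  # J = dg/dx = me.jacobian(x)

# The current implementation, by contrast, effectively does:
k_impl, J_impl = me(x), me.jacobian(x)  # note k_impl == -k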

TTitscher commented 3 years ago

I see it as a matter of definition, and the paper defines J = -dk/dtheta. That may not be intuitive, but I could live with it; you can also change it.

Just note that while simply putting a minus in front of certain terms is the intuitive idea, k and J are likely dictionaries, and the unary minus operator is not defined for those.
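
For illustration, a minimal sketch of that point (the noise-group key is hypothetical): a plain unary minus fails on a dict, but the values can be negated per key.

import numpy as np

J = {"noise_A": np.array([[4.0], [8.0]])}

# -J  # would raise: TypeError: bad operand type for unary -: 'dict'
J_neg = {key: -value for key, value in J.items()}  # negate every Jacobian block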

ajafarihub commented 3 years ago

I would say the current implementation is unsafe, since a user of the VB only takes care of defining their model_error, namely the following two routines:

model_error(parameters)
model_error.jacobian(parameters)

Regarding the last point (dictionaries): we can keep k and J as they are (no changes) but put a minus in front of every J[noise_key]. Sometimes the sign plays no role, e.g. in terms like J[noise_key].T @ J[noise_key], but in other parts of the update equations it does matter.
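
As a sketch of why the sign sometimes cancels (keys and values here are hypothetical): in quadratic terms the minus drops out, in linear terms it does not.

import numpy as np

J = {"noise_A": np.array([[4.0], [8.0]])}
k = {"noise_A": np.array([2.0, 6.0])}
key = "noise_A"

# Quadratic term: the sign of J cancels ...
assert np.allclose(J[key].T @ J[key], (-J[key]).T @ (-J[key]))
# ... but in a linear term like J.T @ k it does not:
assert np.allclose((-J[key]).T @ k[key], -(J[key].T @ k[key]))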

aradermacher commented 3 years ago

In my examples, the minus is added in the definition of the model error Jacobian (for the forward model it is the original one). I guess that is the way we used it, and to me that makes sense.

ajafarihub commented 3 years ago

@aradermacher, this trick is also what I did. What concerns me now is that this negated Jacobian is no longer the true Jacobian of the model error. So it is simply error-prone and can cause issues if we want to pass the same model_error to different inference schemes that work with the Jacobian of the model error (e.g. VB or conjugate gradient). IMO, we need to base things on the true Jacobian of the model error, not on something that happens to suit the VB.

TTitscher commented 3 years ago

It seems like you really want to convince us that there is a problem.

If it bugs you, fix it ;)

joergfunger commented 3 years ago

So there are essentially two Jacobians: one for the forward model and one for the model error. If the model error is data - forward model, then this should be represented there (with J_me = -J_fw). I guess Annika's remark is similar to the way I would implement this.
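
A minimal sketch of this two-Jacobian view (class and method names are illustrative, not the repository's API): the forward model keeps its true Jacobian, and a model error defined as data - g(theta) gets J_me = -J_fw by construction.

class ForwardModel:
    def __call__(self, theta):
        return 2 * theta**2

    def jacobian(self, theta):
        return 4 * theta  # true forward-model Jacobian J_fw

class SubtractionModelError:
    def __init__(self, forward_model, data):
        self.fw = forward_model
        self.data = data

    def __call__(self, theta):
        return self.data - self.fw(theta)  # me(theta) = data - g(theta)

    def jacobian(self, theta):
        return -self.fw.jacobian(theta)    # J_me = -J_fw, the true Jacobian of me

me = SubtractionModelError(ForwardModel(), data=2.0)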

joergfunger commented 3 years ago

Just another remark: I would say the definition in the paper is not optimal for the implementation, because, as you said, k is the model error (with the minus) and J is defined as the derivative of the forward model. I would change this to the derivative of the model error, but we would have to check whether this is consistent, in particular when using the VB for the residual model, since the Jacobian is different there.