BAMresearch / bayem

Implementation and derivation of "Variational Bayesian inference for a nonlinear forward model." [Chappell et al. 2008] for arbitrary, user-defined model errors.
MIT License

Parameters as vector #8

Closed · joergfunger closed this issue 3 years ago

joergfunger commented 3 years ago

In some applications, the parameter vector consists of individual components that are themselves vectors, e.g. when interpreting the displacement vector of a FEM computation as an unknown. Any idea on how we are going to handle this?

joergfunger commented 3 years ago

We could handle the parameters as individual classes instead of doubles, i.e. we could store in the dictionary a BaseParameter class that is then either a scalar or a vector or whatever someone needs, and have methods such as "number of entries" and get/set from vector (to convert this into the parameter vector). In this class, we could then also store additional information such as the prior distribution.
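A rough sketch of what such a class hierarchy could look like (all class and method names here are just assumptions for illustration, not existing code):

```python
import numpy as np

class BaseParameter:
    """Hypothetical base class for one named latent parameter."""

    def __len__(self):
        raise NotImplementedError

    def to_vector(self):
        raise NotImplementedError

    def set_from_vector(self, values):
        raise NotImplementedError


class ScalarParameter(BaseParameter):
    def __init__(self, value=0.0, prior=None):
        self.value = value
        self.prior = prior  # e.g. a prior distribution object

    def __len__(self):
        return 1

    def to_vector(self):
        return np.array([self.value])

    def set_from_vector(self, values):
        self.value = float(values[0])


class VectorParameter(BaseParameter):
    def __init__(self, values, prior=None):
        self.values = np.asarray(values, dtype=float)
        self.prior = prior

    def __len__(self):
        return len(self.values)

    def to_vector(self):
        return self.values

    def set_from_vector(self, values):
        self.values = np.asarray(values, dtype=float)
```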

TTitscher commented 3 years ago

Followup design question: If my problem has two latent parameters, one scalar and one vector of length N, should the "length" of my latent parameters be 2 or N+1?

ajafarihub commented 3 years ago

I think, according to @joergfunger's suggestion, it should be 2. Or ...?

joergfunger commented 3 years ago

The latent parameter vector has length N+1; the parameter list has two entries.
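In terms of the (hypothetical) classes sketched above, the distinction would be:

```python
import numpy as np

# two entries in the parameter list ...
parameters = {
    "E": ScalarParameter(20000.0),            # one scalar
    "disp": VectorParameter(np.zeros(1000)),  # one vector of length N = 1000
}

# ... but the concatenated latent vector has length N + 1 = 1001
latent_vector = np.concatenate([p.to_vector() for p in parameters.values()])
assert len(parameters) == 2 and len(latent_vector) == 1001
```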

ajafarihub commented 3 years ago

I am almost concluding that the input parameters of the forward model's callable must be a single long N×1 vector, where N is the total number of latent parameters. I think it is not wise to pass this information (all latent parameters) to the forward model in any other form than an N×1 vector, since this would complicate the interpretation/construction of the Jacobian with respect to the latent parameters. If, for example, the input is a 2×1 vector, each entry of which is itself another vector/array, then we would have to establish our Jacobian as a higher-dimensional tensor, which IMO is not preferable to a full 2-D array. This would also be inconsistent with typical optimizer routines (like scipy.optimize), which most often cannot handle such a higher-dimensional Jacobian tensor.
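For instance, scipy.optimize.least_squares expects a flat parameter vector and, if a Jacobian callback is given, a single 2-D array of shape (number of residuals, number of parameters). A toy example (the model and the numbers are made up):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(x):
    # x is one flat vector: x[0] is a scalar parameter, x[1:] a "vector" parameter
    return np.concatenate([[x[0] - 1.0], x[1:] - 2.0])

def jacobian(x):
    # must be one 2-D array of shape (n_residuals, n_parameters)
    return np.eye(len(x))

x0 = np.zeros(4)  # scalar + vector of length 3, flattened into one vector
result = least_squares(residuals, x0, jac=jacobian)
print(result.x)  # approximately [1.0, 2.0, 2.0, 2.0]
```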

joergfunger commented 3 years ago

Why would this complicate the interpretation of the Jacobian? If you compute the Jacobian with respect to a single parameter (this could be a scalar or a vector) and then store it accordingly, it should be straightforward to assemble this into the full matrix (similar to the full latent vector being a function of the individual named parameters). This also makes it much clearer which terms you are actually computing when you compute the Jacobian. And if you add another parameter, you get rid of all the clumsy bookkeeping of where in the matrix your block starts. And for the computation, you would compute each block of the Jacobian separately anyway (the derivative is usually computed per parameter, since those are in most cases different computations).
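A rough sketch of the "one block per named parameter" idea, with finite differences only as a placeholder for whatever per-parameter derivative one actually has (the toy model and all names are made up):

```python
import numpy as np

def model_error(prm):
    # toy model error with a scalar "E" and a vector "disp" (names made up)
    return prm["E"] * prm["disp"] - 1.0

def jacobian_blocks(prm, eps=1e-6):
    """One forward-difference Jacobian block per named parameter."""
    base = model_error(prm)
    blocks = {}
    for name in prm:
        values = np.atleast_1d(np.asarray(prm[name], dtype=float))
        block = np.empty((len(base), len(values)))
        for i in range(len(values)):
            perturbed = dict(prm)
            v = values.copy()
            v[i] += eps
            perturbed[name] = v[0] if np.isscalar(prm[name]) else v
            block[:, i] = (model_error(perturbed) - base) / eps
        blocks[name] = block
    return blocks

prm = {"E": 2.0, "disp": np.array([0.1, 0.2, 0.3])}
blocks = jacobian_blocks(prm)
print(blocks["E"].shape, blocks["disp"].shape)  # (3, 1) (3, 3)
```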

ajafarihub commented 3 years ago

Imagine a forward model with:

- number of material parameters = 3
- number of random admissible displacements = 1000
- number of residuals (in the output of the forward model) = 1000
- number of admissible displacements comparable with measurements (in the output of the forward model) = 800

What shape would the Jacobian have? IMO, it must have the shape 1800×1003.

If we consider any other nested format, it might become unclear for the optimizer how to relate the components of that nested Jacobian to the individual components of the input latent parameters.

ajafarihub commented 3 years ago

Just to address the point maybe a bit more clearly: the issue is not related to the computation or storage of the Jacobian itself, nor to its interpretation by the user/developer. Rather, the issue IMO is that the optimizer must be able to handle/interpret this Jacobian in relation to the input latent parameters.

Furthermore, I think this issue even goes a bit beyond the scope of the Jacobian: even given an optimizer that computes the Jacobian numerically (e.g. using a central difference method), I am not sure it is possible to introduce latent parameters in any kind of nested structure to such an optimizer. For example, as far as I can see, scipy.optimize can only recognize a simple long vector as the latent parameters.

As always, please correct me if I am missing something :) .

TTitscher commented 3 years ago

There are two levels: "named" parameters as convenient input to the models and "vector" parameters as input to the optimization/inference algorithms. The "JointLatent" class currently handles the conversion. A similar thing is going on for the outputs. So in your case, the model input would be a dict like

"E" : 6174 
"nu": 0.2  
"k0" : 0.042
"disp" : [0.2, 0.1, ... ] # len 1000

and the output could be a dict like

"residual (sensor)" : [0.0, 0.1, ...] # len 1000
"disp error (sensor)" : [ ... ] # len 800

The "JointLatent" converts the optimization/inference algorithm input vector of length 1003 to the input dict of the model error and somewhere (currently also in the model error) the output dict of the model error is concatenated to a output vector of length 1800.

We have not thought much about the Jacobian and I am glad that you do. My first idea would be to have the model error return the individual parts of the Jacobian as

me.Jacobian("residual", "E") #  1000x1
me.Jacobian("residual, "disp") # 1000x1000
me.Jacobian("disp error", "k0") # 800x1

that are combined as needed.
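A sketch of how those blocks could be combined into the full 1800×1003 matrix, e.g. with np.block (the me.Jacobian interface is the hypothetical one from above, mocked here with zero blocks):

```python
import numpy as np

class MockModelError:
    """Stand-in with the hypothetical me.Jacobian(output, parameter) interface."""
    rows = {"residual": 1000, "disp error": 800}
    cols = {"E": 1, "nu": 1, "k0": 1, "disp": 1000}

    def Jacobian(self, output, parameter):
        return np.zeros((self.rows[output], self.cols[parameter]))

me = MockModelError()
J = np.block([[me.Jacobian(out, prm) for prm in ("E", "nu", "k0", "disp")]
              for out in ("residual", "disp error")])
print(J.shape)  # (1800, 1003)
```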

ajafarihub commented 3 years ago

Following some private chat with @TTitscher, I prepared a design for the base classes of the forward model and the model error. IMO, we can keep such a general workflow/design for any forward model. Note that this is only for ONE model/error.

base_classes_fw_me.txt

joergfunger commented 3 years ago

Related to the Jacobian: I would agree with the above individual contributions (which is what you compute anyway, all individually). Assembling them if needed can, from my point of view, be done on some higher level in the multi-model error (or inference problem). So I'm not sure what the advantage of the assembler class given with the forward model would be, because e.g. when combining different experiments you would again have to reassemble everything and decompose your matrix into the individual components (e.g. a shared parameter in experiment 1 is at another position in the local Jacobian than the same parameter in experiment 2). So I'm still very much in favor of the approach with the individual components. Assembling them at some point should be handled by the MultiModelError class, and then extracting a full matrix instead of individual components should be possible, so you can directly use this in any optimizer that just works with arrays.
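A small sketch of why that level is the natural place for assembly: a shared parameter sits at different column positions in each experiment's local Jacobian, and only the multi-model error knows the global column layout (all names and shapes below are made up):

```python
import numpy as np

# hypothetical per-experiment Jacobian blocks, keyed by parameter name
jac_exp1 = {"E": np.ones((10, 1)), "disp1": np.ones((10, 50))}
jac_exp2 = {"E": np.ones((20, 1)), "disp2": np.ones((20, 30))}

# global column layout, decided at the multi-model-error level
global_cols = {"E": slice(0, 1), "disp1": slice(1, 51), "disp2": slice(51, 81)}

J = np.zeros((10 + 20, 81))
for row0, blocks in [(0, jac_exp1), (10, jac_exp2)]:
    for name, block in blocks.items():
        J[row0:row0 + block.shape[0], global_cols[name]] = block
```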

ajafarihub commented 3 years ago

You are right, and I see the point of having multi-model errors featuring shared latent parameters. In such cases, it is wise to postpone the assembly of the Jacobian to the higher level. In this respect, I adjusted the class design (in the attachment) so that the user can decide which Jacobian to use (un-assembled "jac" vs. assembled "JAC"). For the case of a single model error, one would be able to directly use self.JAC. For a multi-model-error scenario, however, one would postpone the assembly into the multi-model error, as you also pointed out. But it still seems to me that we would need to take the assembled Jacobians of the individual model errors and try to re-assemble them again; I am not 100% sure of it, though.

On the other hand, generally, the logic behind any possible user-computed Jacobian (e.g. an analytical one) must, IMO, be provided for each individual forward model, and this "logic" is now structured in two parts, each provided by a routine:

fw_me_base_2.txt
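Since the attachment is not reproduced here, only a very rough sketch of how the two routines could be split (the names jac and JAC are taken from the comment above; everything else is assumed):

```python
import numpy as np

class ModelErrorBase:
    """Hypothetical base class: per-parameter blocks plus optional assembly."""

    def jac(self, prm):
        """Un-assembled Jacobian: one 2-D block per named latent parameter,
        e.g. {"E": (1800, 1) array, "disp": (1800, 1000) array}."""
        raise NotImplementedError

    def JAC(self, prm, parameter_order):
        """Assembled Jacobian: blocks concatenated column-wise into one matrix,
        convenient for the single-model-error case."""
        blocks = self.jac(prm)
        return np.concatenate([blocks[name] for name in parameter_order], axis=1)
```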

TTitscher commented 3 years ago

Added in #15 or #14.