BAMresearch / bayem

Implementation and derivation of "Variational Bayesian inference for a nonlinear forward model" [Chappell et al. 2008] for arbitrary, user-defined model errors.
MIT License

Integrating Jacobians from a black-box forward solver with grad-based UQ schemes #35

Closed atulag0711 closed 2 years ago

atulag0711 commented 3 years ago

Hello, my first issue here, yay! I was thinking about ideas to inform the codebase about the availability of Jacobians from the forward solver. Some kind of flag? Also, the size of the Jacobians is directly proportional to the grid size. Could that pose a problem later on?

I am trying to build this interface in bayes-->calibration. It would be nice to have a very basic FEniCS-based test forward solve. Maybe a simple FE-based load-displacement simulation would also do, one that returns the solver output and the Jacobians for given input parameters (e.g. Young's modulus E).

I envision treating this thread as an evolving log while we get this implemented.

joergfunger commented 3 years ago

The model error is provided by the user. It could implement a Jacobian routine (and if that routine is not present, there simply is none). However, I see some challenges.

TTitscher commented 3 years ago

Implementing our bayes.inference.ModelErrorInterface will automatically provide a .jacobian method that calculates the derivative of the model error vector (model(theta) - data) with respect to theta via central differences. As demonstrated here, you can manually override .jacobian to provide it fully analytically, or calculate parts of it via central differences and provide the others.

Transforming the model error into a log-likelihood function L(theta) is already implemented; calculating dL/dtheta is missing, but would be nice to have!
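For a Gaussian noise model, that chain rule would look like this (a sketch only; the Gaussian assumption and the covariance Sigma are mine, not something fixed by the interface):

```latex
L(\theta) = -\tfrac{1}{2}\, k(\theta)^{\top} \Sigma^{-1} k(\theta) + \mathrm{const},
\qquad
\frac{\partial L}{\partial \theta} = -\, k(\theta)^{\top} \Sigma^{-1} \frac{\partial k}{\partial \theta}
```

where dk/dtheta is exactly the Jacobian that .jacobian already provides.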

In any case, I prefer to keep the test cases -- at least in this repository -- very simple and without any FEniCS. You could, however, have a dummy FEM model like F(E) = E * A / L * u_measured to illustrate a FEM use case.
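A minimal sketch of such a dummy model error, combining the .jacobian override mentioned above with the F(E) = E * A / L * u_measured model (the base class name is taken from this thread; the exact method names, signatures, and theta layout are my assumptions):

```python
import numpy as np

from bayes.inference import ModelErrorInterface  # name taken from the comment above


class DummyTrussModelError(ModelErrorInterface):
    """Hypothetical 1D truss: F(E) = E * A / L * u_measured, compared to measured forces."""

    def __init__(self, u_measured, F_measured, A=1.0, L=1.0):
        self.u_measured = np.asarray(u_measured)
        self.F_measured = np.asarray(F_measured)
        self.A, self.L = A, L

    def __call__(self, theta):
        # model error vector k(theta) = model(theta) - data, with theta = [E]
        E = theta[0]
        return E * self.A / self.L * self.u_measured - self.F_measured

    def jacobian(self, theta):
        # analytic override of the central-difference default: dk/dE = A / L * u_measured
        return (self.A / self.L * self.u_measured)[:, np.newaxis]
```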

joergfunger commented 3 years ago

On the one hand, I agree that cluttering the dependencies of the Bayes inference package with FEniCS modules is not good. On the other hand, it would be nice to document an example where adjoint solvers are used to efficiently couple FEniCS with gradient-based inference. As far as I know, there are ways to define different "configurations" for the package in setup.py (similar to building a conda package), so it would be possible to include FEniCS in the tests but not when installing the package. Where would you add such a FEniCS example?
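One possible way to do this is with setuptools extras (a sketch; the package metadata and the exact FEniCS requirement below are placeholders):

```python
# setup.py (sketch)
from setuptools import setup, find_packages

setup(
    name="bayem",
    packages=find_packages(),
    install_requires=["numpy", "scipy"],  # core dependencies only
    extras_require={
        # optional extra: only needed for the FEniCS-based examples/tests
        "fenics": ["fenics"],
    },
)
```

Installing with `pip install .[fenics]` (e.g. in CI for the tests) would then pull in FEniCS, while a plain `pip install .` would not.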

TTitscher commented 3 years ago

For me, there are two separate issues: 1) As stated above, our current interface can calculate the log-likelihood L(k(theta)) with model error k, and we could now implement dL/dtheta via the chain rule from the already available dk/dtheta. 2) There is the option to directly calculate L(theta) and dL/dtheta via the FEniCS adjoint, i.e. without chain-ruling through a model error. Did I understand that correctly? That would require some changes to the interfaces, which should also be no problem, but it is more effort and requires some planning.

So what is the scope of this issue? If it is both, we should maybe split it.

atulag0711 commented 3 years ago

Thanks for the replies above. As I understand it, the adjoints need to come from the forward solve; I don't see how they could be computed at a later stage in the inference package. Maybe I am missing something. Yes, FEniCS has an adjoint package, or alternatively the adjoints can be computed manually in FEniCS given the Jacobians and the gradients of the functional (L2 error, log-normal, etc.). So in my opinion, a forward solve that outputs the solver values and the adjoints, together with the observed values, is sufficient for gradient-based sampling or VI.
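To spell out the manual adjoint route for a discrete residual R(u, E) = 0 and a functional L(u, E) (standard adjoint calculus, nothing specific to this codebase):

```latex
\left(\frac{\partial R}{\partial u}\right)^{\top} \lambda = \left(\frac{\partial L}{\partial u}\right)^{\top},
\qquad
\frac{\mathrm{d}L}{\mathrm{d}E} = \frac{\partial L}{\partial E} - \lambda^{\top} \frac{\partial R}{\partial E}
```

One extra linear solve for lambda per functional gives the full gradient, independent of the number of parameters.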

I think the FEniCS solver that Alex is developing can be decoupled from the Bayesian inference package. We can use a Jupyter notebook to demonstrate it. The solver can be a black box that takes in parameters and returns values and Jacobians.

atulag0711 commented 3 years ago

Just to add a few points about CD-based Jacobians, IMO:

  1. Choose dx too large and the finite difference will not approximate the limit value; choose dx too small and numerical precision will destroy the accuracy of the approximation.
  2. I think the forward solvers we have here are expensive, and FD-based schemes need extra functional evaluations for each degree of freedom in the parameter space (two per parameter for central differences; see the sketch below). That makes them extremely expensive and eventually impractical.
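To make both points concrete, here is a plain numpy sketch of a central-difference Jacobian of a generic model error k(theta); the function `k` and the step size `dx` are placeholders:

```python
import numpy as np

def cd_jacobian(k, theta, dx=1e-6):
    """Central-difference Jacobian dk/dtheta: costs 2 * len(theta) evaluations of k."""
    theta = np.asarray(theta, dtype=float)
    k0 = np.atleast_1d(k(theta))
    J = np.empty((len(k0), len(theta)))
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = dx  # too large: truncation error; too small: floating-point cancellation
        J[:, i] = (k(theta + e) - k(theta - e)) / (2 * dx)
    return J
```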
atulag0711 commented 3 years ago

If Alex is working on the FEM simulation, then I think the optimal setup would be that his adjoint solver takes in the functional (the log-likelihood in our case) and returns us dL/dE. With this information, in the inference package I will override the PyTorch autograd to include dL/dE, which will be passed to the grad-based sampler or VI algorithms. If we then want to further parametrize the material parameters with a neural net, the gradients will easily be coupled in this scheme to give us dL/d\theta, letting us optimize \theta. This way we can, for example, learn a process map E = f(z; \theta).
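A sketch of that PyTorch coupling, assuming the external solver hands back both the log-likelihood value and dL/dE; all names here (in particular `solve_and_adjoint`) are hypothetical:

```python
import torch

def solve_and_adjoint(E_numpy):
    """Placeholder for the black-box FEM/adjoint call: returns L(E) and dL/dE."""
    raise NotImplementedError

class ExternalLogLike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, E):
        L_value, dL_dE = solve_and_adjoint(E.detach().cpu().numpy())
        ctx.save_for_backward(torch.as_tensor(dL_dE, dtype=E.dtype))
        return E.new_tensor(L_value)

    @staticmethod
    def backward(ctx, grad_output):
        (dL_dE,) = ctx.saved_tensors
        # autograd chain-rules this further back, so with E = f(z; theta)
        # (e.g. a neural net) we get dL/dtheta for free
        return grad_output * dL_dE
```

With this, `ExternalLogLike.apply(net(z)).backward()` would populate the gradients of the network parameters, which is what the grad-based samplers / VI need.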

joergfunger commented 3 years ago

As for FD you are perfectly right, but in order to test the methods it is always convenient (and it works for simple test examples) to use FD. As for the functional with the adjoint solver, the problem is not that easy. What exactly is the interface, and how do you expect the forward solve to return the gradient? Note that the input is a dictionary; would that mean we return a dict with DD['E'] = dL/dE for each parameter (either scalar or vector)? We would also have to adjust that to account for multiple model errors.
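One possible shape for that interface, as a sketch only (all names and the dummy return values are placeholders; whether the gradients live next to the model response or in a second return value is exactly what needs deciding):

```python
import numpy as np

def forward_solve(prm: dict):
    """Hypothetical interface: return the model response plus a gradient dict,
    one entry per parameter key (scalar parameter -> gradient vector,
    vector parameter -> one gradient column per component)."""
    E = prm["E"]
    u = np.full(3, E)    # placeholder for the black-box FEM response
    dL_dE = np.ones(3)   # placeholder for the adjoint-based gradient
    return u, {"E": dL_dE}

response, grad = forward_solve({"E": 42.0})
```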

atulag0711 commented 3 years ago

@joergfunger Yes, I agree that the adjoint solver is not easy, but IMO it will be crucial for real-world solvers later on if we want to use grad-based samplers or VI (the ELBO needs gradients, if I am not mistaken). Yes, the E in dL/dE can be a vector too if we are optimizing with respect to several parameters. At the moment I can't think of a way to handle multiple model errors, but once this is implemented we can surely find one.

joergfunger commented 3 years ago

I would suggest first concentrating on a simple example that we all agree on regarding the interfaces, and only then moving to more complex problems (such as those with gradients).

TTitscher commented 2 years ago

Solved in https://github.com/BAMresearch/probeye/pull/28, I guess.