gridap / Gridap.jl

Grid-based approximation of partial differential equations in Julia

Adjoints of FEOperators #337

Open santiagobadia opened 4 years ago

santiagobadia commented 4 years ago

@fverdugo @oriolcg We want to create adjoints of FEOperators. To this end, I propose a method like

adj_op = Adjoint(op,uh)

where op is a FEOperator and uh is the solution. We have discussed including the RHS there, but that would be misleading, since it has nothing to do with the adjoint. Instead, I would return an AffineFEOperator with a zero right-hand side and create a method that takes adj_op and the rhs separately.

The internal implementation of this adj_op can be:

  1. If op is an AffineFEOperator, the most general approach is to take the corresponding sparse matrix and perform a lazy Julia transpose. However, I think this is probably not what we want for some particular implementations of the matrix (e.g., when using PETSc arrays). In those cases, more concrete, non-lazy versions of the transpose will be needed (a sketch follows this list).

  2. If op is a FEOperatorFromTerms, define a transposed version of this operator that does exactly the same as FEOperatorFromTerms but transposes the local element matrices before assembly. Since it is a minor change in the implementation of FEOperatorFromTerms, I would probably use a trait to implement this.
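
For illustration, a minimal sketch of case 1 under the proposal above (zero right-hand side, trial and test spaces swapped). It assumes Gridap's get_matrix, get_trial and get_test accessors and a low-level AffineFEOperator(trial, test, matrix, vector) constructor; the helper name adjoint_fe_operator and the exact argument order are illustrative, not a confirmed API.

using Gridap

# Hypothetical helper playing the role of the proposed Adjoint(op, uh)
# for the affine case (uh is not needed here).
function adjoint_fe_operator(op::AffineFEOperator)
  A = get_matrix(op)                  # assembled forward matrix
  U = get_trial(op)
  V = get_test(op)
  At = transpose(A)                   # lazy Julia transpose (case 1)
  b = zeros(eltype(A), size(At, 1))   # zero right-hand side, as proposed above
  AffineFEOperator(V, U, At, b)       # trial and test spaces swap roles
end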

oriolcg commented 3 years ago

I went ahead and started the implementation of AdjointFEOperator. I pushed a first version in cdbebc173b7267860256ad9bfd148a49e8320793. You can find the new definition of an AffineFEOperator for cases 1 and 2 in https://github.com/gridap/Gridap.jl/blob/cdbebc173b7267860256ad9bfd148a49e8320793/src/FESpaces/AdjointFEOperator.jl and some tests in https://github.com/gridap/Gridap.jl/blob/cdbebc173b7267860256ad9bfd148a49e8320793/test/FESpacesTests/AdjointFEOperatorTests.jl

I'm not sure if this is the most correct/efficient way of defining the adjoints. Could you take a look and let me know what you think, @santiagobadia @fverdugo?

fverdugo commented 3 years ago

I would follow a quite different approach, which I think opens the door to more optimizations. From the user perspective:

a(u,v) = ∫( ∇(v)⋅∇(u) )*dΩ # Some bilinear form
l(v) = ∫( v*f )*dΩ + ∫( v*(n_Γ⋅∇u) )*dΓ # Some linear form (external loads)
ld(u) = ∫( abs2(u-u_target) )*dΩ # Some linear form (objective function)

op_opd = AffineFEOperatorWithAdjoint(a,l,ld,U,V;selfadjoint=true) # Better name?
uh, udh = solve(op_opd) # Forward and adjoint solutions.
# If `selfadjoint=true` one can reuse the numerical setup of the solver.

op_opd stores 2 matrices and 2 vectors. When selfadjoint=true the two matrices are the same Julia object. Otherwise, one needs to assemble a matrix with the new bilinear form ad(v,u)=a(u,v) (or transpose the direct matrix). In parallel computations, it is potentially better to assemble the matrix than to transpose it.
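
A rough sketch of what such a container could store, just to fix ideas (the type and field names below are hypothetical, not an existing Gridap type):

struct AffineFEOperatorWithAdjoint   # hypothetical name, not in Gridap
  trial
  test
  matrix        # forward matrix, assembled from a(u,v)
  vector        # forward right-hand side, assembled from l(v)
  matrix_adj    # adjoint matrix; the same object as matrix when selfadjoint=true
  vector_adj    # adjoint right-hand side, assembled from ld
end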

In the non-linear case, the idea is the same

res(u,v) = # the residual as now
jac(u,du,v) =  # the jacobian as now
ld(u) = ∫( abs2(u-u_target) )*dΩ # Some linear form (objective function)

op_opd = FEOperatorWithAdjoint(res,jac,ld,U,V;selfadjoint=true)
uh, udh = solve(op_opd) 

uh is computed with non-linear iterations; udh is obtained with a linear solve using the last Jacobian (or its transpose) from the non-linear iterations. For selfadjoint=false, instead of transposing the Jacobian I would assemble this bilinear form: ad(v,u)=jac(uh,v,u).
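
To make the adjoint step explicit, a small self-contained sketch with dummy data standing in for the last assembled Jacobian and the adjoint right-hand side coming from ld (in practice both would be produced by the FE assembly):

using LinearAlgebra

# Dummy stand-ins for the last Jacobian at the converged uh and for the
# adjoint right-hand side vector.
J = [4.0 -1.0 0.0; -1.0 4.0 -1.0; 0.0 -1.0 4.0]
g = [1.0, 0.0, 1.0]

# Discrete adjoint: solve transpose(J) * udh_dofs = g. For a self-adjoint
# problem transpose(J) == J and the forward numerical setup can be reused.
udh_dofs = transpose(J) \ g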

fverdugo commented 3 years ago

BTW, you can always implement these getters if you want to consider the forward and adjoint problems separately (at the cost of losing the optimizations that take place when they are solved together):

op = get_forward(op_opd)
opd = get_adjoint(op_opd) # This will always be an AffineFEOperator
# or by using iteration
op, opd = op_opd
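
A minimal sketch of these getters and of the iteration protocol, assuming a container with the (hypothetical) fields sketched earlier and the low-level AffineFEOperator(trial, test, matrix, vector) constructor:

get_forward(op::AffineFEOperatorWithAdjoint) =
  AffineFEOperator(op.trial, op.test, op.matrix, op.vector)

get_adjoint(op::AffineFEOperatorWithAdjoint) =
  AffineFEOperator(op.test, op.trial, op.matrix_adj, op.vector_adj)

# Make `op, opd = op_opd` work via the iteration protocol
Base.iterate(op::AffineFEOperatorWithAdjoint) = (get_forward(op), :adjoint)
Base.iterate(op::AffineFEOperatorWithAdjoint, state) =
  state === :adjoint ? (get_adjoint(op), :done) : nothing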

fverdugo commented 3 years ago

Transposing lazily or assembling the transpose from the bilinear form / Jacobian will depend on the linear solver used for the adjoint. If the linear solver can use a lazy transpose efficiently, this would be the best option. Since it depends on the linear solver, at some point we need to dispatch on it.
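
One way to express that dispatch is a small trait on the solver, for instance (LinearSolver and BackslashSolver are existing Gridap types; the trait itself and the default choices below are assumptions):

using Gridap.Algebra: LinearSolver, BackslashSolver

abstract type AdjointMatrixStrategy end
struct LazyTranspose      <: AdjointMatrixStrategy end  # wrap the forward matrix
struct AssembledTranspose <: AdjointMatrixStrategy end  # assemble ad(v,u) = a(u,v)

# Conservative default: assemble the adjoint matrix explicitly. Solvers that
# handle a lazy transpose efficiently can opt in by overloading the trait.
adjoint_matrix_strategy(::LinearSolver)    = AssembledTranspose()
adjoint_matrix_strategy(::BackslashSolver) = LazyTranspose()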

oriolcg commented 3 years ago

> op_opd stores 2 matrices and 2 vectors. When selfadjoint=true the two matrices are the same Julia object. Otherwise, one needs to assemble a matrix with the new bilinear form ad(v,u)=a(u,v) (or transpose the direct matrix). In parallel computations, it is potentially better to assemble the matrix than to transpose it.

@fverdugo are you thinking of two AffineOperators, or of defining a new struct, e.g. AffineOperatorWithAdjoint, with two matrices and two vectors that has its own implementation of residual, jacobian, setup, solve, ...?

oriolcg commented 3 years ago

For serial AffineOperators the numerical setup can also be reused: lu(A') = transpose(lu(A)), right?
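
A quick self-contained check of that reuse in plain Julia (dense matrix for simplicity, outside Gridap):

using LinearAlgebra

A = [2.0 1.0 0.0; 0.0 3.0 1.0; 1.0 0.0 4.0]
b = [1.0, 2.0, 3.0]

F = lu(A)              # numerical setup of the forward problem, done once
x = transpose(F) \ b   # reuses the same factorization to solve transpose(A) * x = b

@assert transpose(A) * x ≈ b   # same solution as factorizing transpose(A) directly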