gridap / GridapODEs.jl

Time stepping for Gridap
MIT License
33 stars 7 forks source link

Add Autodiff machinery for transient FE operators #29

Closed santiagobadia closed 3 years ago

santiagobadia commented 3 years ago

@fverdugo @oriolcg following the developments for automatic computation of jacobians from residuals (or even residuals from energy functionals) in Gridap, discussed in this issue, we should extend it in GridapODEs for transient problems.

I would say that we need (at least) to create a transient version of FETermsWithAutodiff, e.g., TransientFETermsWithAutodiff, following what is done in this file.

In any case, it is probably better to wait till we have these dev's in the main branch in Gridap

oriolcg commented 3 years ago

Hi @santiagobadia @fverdugo. I started the implementation of TransientFETermsWithAutoDiff in the transient_autodiff branch. It's already working for the heat equation, see the test. However, there are a couple of issues: 1) I had to rename the FETerm function to avoid confusion with the function that generates a NonlinearFETermWithAutoDiff and I called FETerm_t https://github.com/gridap/GridapODEs.jl/blob/e413b3d6a631651134f6a4c499f0b55e4dd7d391/src/TransientFETools/TransientFETermsWithAutoDiff.jl#L7 I don't know if this is a good solution, what do you think? 2) I tried with the Stokes problem and it doesn't work. Is there anything that I need to take into account for multiField jacobians?

santiagobadia commented 3 years ago

I don't like the name FETerm_t. I guess the same problem is in Gridap, isn't it? How is it solved there?

I would probably consider keeping the auto diff term with FETerm and change the name of the other one. Or call it TransientFETerm and call both the hand-written and auto diff versions the same way, now there is no interface collision, is it?

@fverdugo can answer the Stokes part.

fverdugo commented 3 years ago

Hi @oriolcg

The navier stokes tutorial works with autodiff by simply replacing FETerm(res,jac,...) by FETerm(res,...). Thus, it should work for (steady-state) stokes as well.

fverdugo commented 3 years ago

I would compute jacobian and jacobian_t with and without autodiff and see if they match. Perhaps there is some problem with the scaling of jacobian_t, just guessing.

oriolcg commented 3 years ago

It's already working for the transient Heat equation, so I assume that the jacobian_t should be fine at least for scalar problems. The error that I get for the transient Stokes test is the following:

LoadError: MethodError: no method matching Gridap.Arrays.BlockArrayCoo(::Array{Array{T,2} where T,1}, ::Array{Tuple{Int64,Int64},1}, ::Tuple{BlockArrays.BlockedUnitRange{Array{Int64,1}},BlockArrays.BlockedUnitRange{Array{Int64,1}}}, ::Array{Int64,2}, ::Array{Array{T,2} where T,1})

Since it's related to a BlockArray issue, I was wandering if we need to do something special for multifields. Did you face something similar when implementing the autodiff for the steady version?

fverdugo commented 3 years ago

Yes, this looks specific to multi-field or to skeleton faces. But AD seems to work for the steady case.

oriolcg commented 3 years ago

@fverdugo, I've been investigating the issue, but I still don't have a clear idea of what's happening. If I don't include the term q*(div(u)) into the residual, it computes the jacobian and jacobian_t . Of course, the problem is singular, but at least I get the jacobian_t. However, as soon as I add this term it throws me the error reported on the earlier message.

This is the residual I'm using:

function res(t,x,xt,y)
  u,p = x
  ut,pt = xt
  v,q = y
  #inner(∇(u),∇(v)) + inner(ut,v) - (∇⋅v)*p + q*(∇⋅u)
  inner(∇(u),∇(v)) + inner(ut,v) - (∇⋅v)*p 
end

Do you have any idea of what could be happening?

I attach a small reproducer in case you want to play with:

using Gridap
using ForwardDiff
using LinearAlgebra
using Test
using GridapODEs.ODETools
using GridapODEs.TransientFETools
using Gridap.FESpaces: get_algebraic_operator

# using GridapODEs.ODETools: ThetaMethodLinear
import Gridap: ∇
import GridapODEs.TransientFETools: ∂t

θ = 0.5

u(x,t) = VectorValue(x[1],x[2])*t
u(t::Real) = x -> u(x,t)
p(x,t) = (x[1]-x[2])*t
p(t::Real) = x -> p(x,t)

domain = (0,1,0,1)
partition = (2,2)
model = CartesianDiscreteModel(domain,partition)
order = 2

V0 = FESpace(
  reffe=:Lagrangian, order=order, valuetype=VectorValue{2,Float64},
  conformity=:H1, model=model, dirichlet_tags="boundary")
Q = TestFESpace(
  model=model,
  order=order-1,
  reffe=:Lagrangian,
  valuetype=Float64,
  conformity=:H1,
  constraint=:zeromean)

U = TransientTrialFESpace(V0,u)
P = TrialFESpace(Q)
X = TransientMultiFieldFESpace([U,P])
Y = MultiFieldFESpace([V0,Q])

trian = Triangulation(model)
degree = 2*order
quad = CellQuadrature(trian,degree)

function res(t,x,xt,y)
  u,p = x
  ut,pt = xt
  v,q = y
  #inner(∇(u),∇(v)) + inner(ut,v) - (∇⋅v)*p + q*(∇⋅u)
  inner(∇(u),∇(v)) + inner(ut,v) - (∇⋅v)*p 
end

U0 = U(0.0)
P0 = P(0.0)
X0 = X(0.0)
uh0 = interpolate_everywhere(u(0.0),U0)
ph0 = interpolate_everywhere(p(0.0),P0)
xh0 = interpolate_everywhere([uh0,ph0],X0)

t_Ω_ad = FETerm_t(res,trian,quad)
#t_Ω_ad = FETerm(res,jac,jac_t,trian,quad)
op = TransientFEOperator(X,Y,t_Ω_ad)

t0 = 0.0
tF = 1.0
dt = 0.1

ls = LUSolver()
odes = ThetaMethod(ls,dt,θ)
solver = TransientFESolver(odes)

sol_t = solve(solver,op,xh0,t0,tF)

result = Base.iterate(sol_t)
fverdugo commented 3 years ago

@oriolcg Does it work for the steady state version of this problem?

oriolcg commented 3 years ago

Yes, it does. Steady scalar, steady multi-field and transient scalar are working fine

fverdugo commented 3 years ago

Fails in this function ?

https://github.com/gridap/GridapODEs.jl/blob/e413b3d6a631651134f6a4c499f0b55e4dd7d391/src/TransientFETools/TransientFETermsWithAutoDiff.jl#L37

In which line?

oriolcg commented 3 years ago

Yes, this is the call stack:

\src\Arrays\VectorsOfBlockArrayCoo.jl:31
\src\Arrays\VectorsOfBlockArrayCoo.jl:15 (repeats 2 times)
\src\Fields\FieldOperations.jl:370`
\src\Fields\FieldArrays.jl:275
...
\Arrays\Apply.jl:61
\src\Arrays\Autodiff.jl:80
\src\Arrays\Autodiff.jl:45
\src\FESpaces\FEAutodiff.jl:31
\src\TransientFETools\TransientFETermsWithAutoDiff.jl:49
oriolcg commented 3 years ago

For some reason, when going into this line: https://github.com/gridap/Gridap.jl/blob/1dfc66234202cd70a39f07cf5ffbf9ad192fdd5d/src/Arrays/VectorsOfBlockArrayCoo.jl#L31

the jacobian of the steady part has blockids = [(1,1)] with blocks an Array of size 1 filled with ForwardDiff duals, and the transient jacobian has blockids = [(1,1),(1,2)] with blocks an Array of size 2 filled with ForwardDiff duals and an array of floats.

Does it look good to you @fverdugo ?

fverdugo commented 3 years ago

It seems it is trying to differentiate wrt pt and it fails

oriolcg commented 3 years ago

Yes, definetively the problem seems to be there. I tried to split the residual res = res_x + res_t and construct the jacobians with respect to res_x and res_t, respectively. If I do that, I don't have any contribution from the pressure block (neither test nor trial) in the temporal residual and it works. However, this is not a good solution because you won't always have residuals that can be splitted. I think I will need help to fix this issue, because I'm not familiar with the new structure of the blocks and I'm getting lost.

oriolcg commented 3 years ago

Hi @fverdugo,

I've been digging a bit more into this issue and I think I have an idea of what could be happening, but, if this is the case, I'm not completely sure how to fix it.

The problem appears when we have a binary operation, let's say +, between two terms corresponding to two different blocks and only one of the terms contains a variable with respect to which we differentiate. To put a simple example, this happens when we have a residual of the form:

function res(t,x,xt,y)
  u,p = x
  ut,pt = xt
  v,q = y
  (ut ⋅ v) + q*(∇⋅u)
end

In that case, when applying the operator + the two bocks are of different type. One will be an array of ForwardDiff.Dual and the other an array of Float64. Then, when it reaches line https://github.com/gridap/Gridap.jl/blob/c792576180e0bf1be0c6685d1d693b31f11d4f71/src/Arrays/VectorsOfBlockArrayCoo.jl#L31, after performing the collect operation, the type of blocks_i is Array{Array{T,2} where T,1}, which cannot be casted as a Vector{A} required in the BlockArrayCoo constructor. When working with the static case (Stokes or NS) all blocks are of the same type and typeof(blocks_i) = Array{Array{ForwardDiff.Dual{Nothing,Float64,22},2},1} which fits into Vector{A}. This is why we don't see this error in the static case nor in the transient with a single block.

Does it make sense? If this is the reason it should be easy to fix, but I don't know what's the best strategy to solve this issue. Any idea?

oriolcg commented 3 years ago

@santiagobadia @fverdugo, I got stuck with this issue. If the blocks cannot support different types (see issue https://github.com/gridap/Gridap.jl/issues/430) I don't see a way forward other than sending two different residuals to the TransientFETerm, one for the unknown and the other for the time derivative.

I would like to avoid it because it doesn't allow coupling between unknown and time derivative, but I don't see an alternative. Can you think on a different approach?

santiagobadia commented 3 years ago

@oriolcg @fverdugo now that we do not have FETerm anymore, do you think that we can update this discussion? Is the showstopper you @oriolcg mention :point_up: still there? Probably yes, but manifested in a different way

santiagobadia commented 3 years ago

@oriolcg is there anything still open here? can we close the issue?

oriolcg commented 3 years ago

With the newer version I didn't face any of the issues reported in this issue. I'm closing it.