firedrakeproject / firedrake

Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM)
https://firedrakeproject.org

InterpolateBlock doesn't stash interpolators #1962

Open ReubenHill opened 3 years ago

ReubenHill commented 3 years ago

The adjoint InterpolateBlock creates a new interpolator every time it is evaluated, e.g. when computing the Hessian.

Extra interpolators should probably be stashed on the Interpolator as necessary, though this is perhaps not the only solution. I believe @nbouziani has worked on a similar fix elsewhere; comments welcome.
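To illustrate the intended behaviour, here is a minimal sketch in plain Python. `FakeInterpolator` and `Block` are hypothetical stand-ins (not the real Firedrake `Interpolator`/`InterpolateBlock` API): the point is only that repeated adjoint evaluations should reuse one stashed object rather than rebuilding it each time.

```python
class FakeInterpolator:
    """Hypothetical stand-in: constructing one is expensive (kernel compilation)."""
    build_count = 0  # track how many expensive builds happen

    def __init__(self, expr, space):
        FakeInterpolator.build_count += 1
        self.expr = expr
        self.space = space

    def interpolate(self, x):
        return x  # cheap once built


class Block:
    """Stand-in for InterpolateBlock: stashes the interpolator on first use."""
    def __init__(self, expr, space):
        self.expr = expr
        self.space = space
        self._interpolator = None  # stash slot

    def evaluate(self, x):
        # build lazily, reuse on every subsequent evaluation
        if self._interpolator is None:
            self._interpolator = FakeInterpolator(self.expr, self.space)
        return self._interpolator.interpolate(x)


block = Block("expr", "V")
for _ in range(5):  # e.g. repeated Hessian actions
    block.evaluate(1.0)
assert FakeInterpolator.build_count == 1  # built once, reused thereafter
```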

nbouziani commented 3 years ago

Sorry for the delay! I have commented what I think is the same issue, cf. #1981.

ReubenHill commented 3 years ago

(Closed the duplicate issue)

nbouziani commented 3 years ago

I have not dealt with similar issues in #1674, since there the interpolation is part of the external operator evaluation process (it happens inside the external operator), so no additional InterpolateBlock is induced.

Just to be sure I understand correctly: you want to stash Interpolator(expr, space), right? Isn't the computational cost only incurred when you actually interpolate via Interpolator(expr, space).interpolate(...)? If so, I don't understand the problem with defining a new Interpolator every time something needs interpolating.

ReubenHill commented 3 years ago

If expr and space are the same multiple times (as I think they typically will be when doing Hessian evaluations as part of a firedrake_adjoint.minimize(Ĵ, method='Newton-CG')), you really ought to stash and reuse the Interpolator to save compile-time costs.

nbouziani commented 3 years ago

Ok, I see. If this is just for adjoint purposes, then the information about which interpolator has already been made could also be stashed directly on the InterpolateBlock class, although I am not sure which option is the most judicious.

dham commented 3 years ago

That doesn't quite work, because you get a new InterpolateBlock for every interpolate operation.

nbouziani commented 3 years ago

I meant having a kind of cache mechanism via shared attributes, as is done for forms with TSFCKernel.
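A minimal sketch of what "shared attributes" means here, loosely modelled on the TSFCKernel-style class-level cache. `CachedThing` and its key are hypothetical; the real cache would key on a stable signature of the expression and space.

```python
class CachedThing:
    # Class attribute: one dictionary shared by all instances,
    # so every construction site sees the same cache.
    _cache = {}

    def __init__(self, key):
        self.key = key

    @classmethod
    def get(cls, key):
        # Reuse an existing instance when this key has been seen before.
        if key not in cls._cache:
            cls._cache[key] = cls(key)
        return cls._cache[key]


a = CachedThing.get(("expr", "V"))
b = CachedThing.get(("expr", "V"))
assert a is b  # same cached instance, no second construction
```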

dham commented 3 years ago

Global caches for interpolation matrices are going to get very expensive very fast.

nbouziani commented 3 years ago

Ok I see.

wence- commented 3 years ago

But kernel caching should be implemented for interpolation, as we do for form assembly.

dham commented 3 years ago

We should do that. However we can also avoid reassembling the matrices associated with a single interpolator object.

wence- commented 3 years ago

If we tie the lifetime of these things to the lifetime of the object in the forward model then that should DTRT (do the right thing).

wence- commented 3 years ago

Actually, that doesn't work

ReubenHill commented 3 years ago

Why doesn't it work?

wence- commented 3 years ago

Imagine you have the following setup:

```python
def forward(mesh):
    # set up forward model
    interpolator = Interpolator(...)
    while ...:
        interpolator.interpolate()
    return something_that_is_not_the_interpolator

J = forward(mesh)

compute_gradient(J)
```

So the interpolator doesn't live past the end of the forward model run, but we need it in the adjoint if that's where we're stashing the other thing.
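The lifetime problem can be demonstrated in plain Python with a weak reference; `Interp` is a hypothetical stand-in for an Interpolator, not Firedrake code.

```python
import weakref


class Interp:
    """Hypothetical stand-in for an Interpolator."""
    pass


def forward():
    interpolator = Interp()
    ref = weakref.ref(interpolator)  # observe the interpolator's lifetime
    # ... run the forward model using the interpolator ...
    return 42, ref  # return something that is not the interpolator


J, ref = forward()
# On CPython the interpolator is collected as soon as forward() returns,
# so anything stashed on it is gone by the time the adjoint runs.
assert ref() is None
```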

ReubenHill commented 3 years ago

Is this an edge case where stashing on the Interpolator would at least help?

Another idea: how about a global cache, but only for the Interpolators in the InterpolateBlock, to avoid the expense of caching all interpolators all the time?

wence- commented 3 years ago

Is this an edge case where stashing on the Interpolator would at least help?

No? The interpolator is the thing we're trying to keep alive.

Another idea: how about a global cache, but only for the Interpolators in the InterpolateBlock, to avoid the expense of caching all interpolators all the time?

How do you know this has happened?

ReubenHill commented 3 years ago

How do you know this has happened?

By doing the caching in the InterpolateBlock: when a new InterpolateBlock is made, it looks in the cache of existing Interpolate objects for InterpolateBlocks to see if there's one it can reuse.

wence- commented 3 years ago

How do those get collected?

ReubenHill commented 3 years ago

Well, I don't know the details of how caching is done at the moment, but I'm imagining some runtime-only global dictionary that's filled via an annotation of an Interpolate.

wence- commented 3 years ago

What I mean is: how do you decide when to throw those interpolate objects away from the InterpolateBlock?

ReubenHill commented 3 years ago

I don't think I understand your question. Do you mean getting rid of the cached objects? Presumably, if they lived in a dictionary in some suitably global scope, they would only exist for as long as a script took to run?

wence- commented 3 years ago

That sounds like too long. Each Interpolator, if it is reusable, carries a sparse matrix around with it.

ReubenHill commented 3 years ago

Solution: cache all PyOP2 interpolation kernels created in firedrake/interpolation.py, keyed on a hash of the expression and function space (@wence- please correct me if wrong).

wence- commented 3 years ago

Yes, I think that is right. ufl.core.Expr needs to gain a .signature() method similar to that of ufl.Form (which is stable with respect to coefficient numbering). Then we can do something similar to the TSFCKernelCache stuff.

ReubenHill commented 3 years ago

Do function spaces have a suitable hash?

wence- commented 3 years ago

No, but their UFL element does (or should).