firedrakeproject / tsfc

Two-stage form compiler
Other
15 stars 24 forks source link

Cache Error causes entire code to break #236

Closed medinaeder closed 2 years ago

medinaeder commented 3 years ago

The error is extremely reproducible with in my code (combination of fireshape and periodic-bcs) but the MFEs that I have tried coming up don't reproduce the error.

tryexcept_adjoint_bug.txt

I think the cache error is linked to another error I often get. When I try to run the same script producing the error above twice I get the standard

ValueError: All coefficients must be defined on the same space

I worked around this by calling firedrake-clean everytime I want to rerun the script.

I have a feeling all this is caused by the periodic mesh hack.

wence- commented 3 years ago

This exception that you posted

    a, _ = argument_multiindices
ValueError: not enough values to unpack (expected 2, got 0)

Indicates that you (or something in the loop) is attempting to do assemble(a, diagonal=True) but a is not a bilinear form.

Looking at the traceback, this comes from the adjoint evaluation

    adj_sol, adj_sol_bdy = self._assemble_and_solve_adj_eq(dFdu_form, dJdu, compute_bdy)
  File "/home/e/Software/firedrake/firedrake-dev-20210111-mpich/src/pyadjoint/dolfin_adjoint_common/blocks/solving.py",
line 171, in _assemble_and_solve_adj_eq
    self.compat.linalg_solve(dFdu, adj_sol.vector(), dJdu, *self.adj_args, **self.adj_kwargs)

So we're assembling some form dFdu and then assembling it as a matfree matrix, but somehow the arguments are being lost.

Can you drop into a debugger (run with ipython --pdb script.py) and walk up the stack frames until you can inspect these things?

ValueError: All coefficients must be defined on the same space

This error is raised by the pointwise assignment code when you're attempting to do an assignment f.assign(expr) and expr contains coefficients (other than Constant objects) that are on different function spaces from f.

If there's no MFE, is it possible to make the actual failing code available (with a recipe for reproduction)?

medinaeder commented 3 years ago

@wence- I think your comment cleared up why my MFE's were not failing. Check out the code below. It does not yield the same error but I think it is the root of the issue.

from firedrake import *
from firedrake_adjoint import *
import numpy as np

mesh = UnitSquareMesh(4,4)

sf = {"mat_type": "aij",
      "snes_type": "nrichardson",
      "snes_max_it": 1}
sp = {"mat_type": "matfree",
      "snes_type": "ngmres",
      "snes_max_it": 2,
      "npc_snes_side": "right",
      "npc":"newtonls",
      "npc_ksp_type": "cg",
      "npc_pc_type": "lu"}

# Forward problem that fails with a try except
def forward(mu):
    V = FunctionSpace(mesh, "CG", 1)
    u = Function(V)
    v = TestFunction(V)
    u_holder = Function(V)

    a = inner(mu*grad(u),grad(v))*dx
    l = inner(Constant(1),v)*dx
    R = a-l
    bcs = DirichletBC(V, Constant(0), "on_boundary")
    J=0;
    u_holder.assign(u)
    try:
        print("Trying")
        solve(R==0, u, bcs, solver_parameters=sf)
    except ConvergenceError:
        try:
            print("Failing")
            u.assign(u_holder)
            solve(R==0,u,bcs, solver_parameters=sp) # Fails
            #solve(R==0,u,bcs) # Default works
            J+=assemble(u*u*dx)
        except ConvergenceError:
            J+= assemble(np.nan*dx(domain=mesh))

    return J

C = FunctionSpace(mesh, "DG",0)
mu = Function(C).interpolate(Constant(1.))
J = forward(mu) 
control = Control(mu)
Jhat = ReducedFunctional(J, control)
dJdctrl = Jhat.derivative()

File("test.pvd").write(dJdctrl)

If I am not mistaken I think I need a way of passing linear solver parameters to the adjoint problem. Adjoint problems are linear and as a result the solve that takes place behind the scenes is expecting that structure.

wence- commented 3 years ago

Sorry, I completely dropped the ball here. I think you are right that this is a pyadjoint issue. Do you mind reporting over there? https://github.com/dolfin-adjoint/pyadjoint/

wence- commented 2 years ago

I think this is resolved?