firedrakeproject / firedrake

Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM)
https://firedrakeproject.org

Slate tensor assembly fails for high order Nedelec elements #1560

Closed: marybarker closed this issue 4 years ago

marybarker commented 4 years ago

Using Slate's Tensor(form) functionality results in a compilation error for N1curl elements of degree > 4. The error is thrown during assemble(Tensor(form)) but does not occur when using the usual assemble(form). The output from the example below is attached. (Note: it runs perfectly for degrees 1, 2, 3 and 4.)

from firedrake import *  # import added so the reproducer is self-contained

def solve_curl_curl(mesh, f, degree, with_tensor=False):
    V_element = FiniteElement("N1curl", mesh.ufl_cell(), degree)
    V = FunctionSpace(mesh, V_element)
    u = TrialFunction(V)
    v = TestFunction(V)

    a = inner(curl(u), curl(v))*dx
    L = inner(f, v)*dx

    if with_tensor:
        # Slate path: assemble via the local tensor algebra
        A = Tensor(a)
        B = Tensor(L)
        w = A.inv * B
        u_h = assemble(w)  # throws an error if degree > 4
    else:
        # Ordinary Firedrake path: global assembly works at any degree
        A = assemble(a)
        B = assemble(L)
        w = Function(V)
        solve(a == L, w, bcs=[DirichletBC(V, 0, "on_boundary")])
    print("FINISHED with degree %d" % degree)

mesh = UnitTetrahedronMesh()
solve_curl_curl(mesh, Constant((1, 1, 1)), 5)
solve_curl_curl(mesh, Constant((1, 1, 1)), 5, True)

output.txt

dham commented 4 years ago

That's an interesting bug! When we get compiler errors like this, it's useful to see the compiler error log. The error message you posted tells us that in this case it is: /Users/marvin/firedrake/.cache/pyop2/b9/2e8ab04bb76df1a5a87f9223799e0e_p1228.err

marybarker commented 4 years ago

Oh yes! Here it is. It looks like it might be a problem with an Eigen macro.

2e8ab04bb76df1a5a87f9223799e0e_p1228.err.txt

wence- commented 4 years ago

Sorry, fat-fingered the button. The issue here is that Eigen is complaining that we're asking to stack-allocate a tensor with 19600 entries, which is larger than the internal Eigen limit EIGEN_STACK_ALLOCATION_LIMIT. I do not know how easy it is to change that. Since we know the size of the tensor up front, we could heap-allocate ones that are too big (although I do not know exactly how to do this with Eigen).
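
For a rough sense of the numbers involved, here is a back-of-the-envelope check (assuming double-precision entries and Eigen's default value of 128 KiB for EIGEN_STACK_ALLOCATION_LIMIT; the 19600 entries come straight from the error log, i.e. a 140 x 140 local matrix for the degree-5 N1curl element on a tet):

entries = 140 * 140            # the 19600-entry local tensor reported in the error log
bytes_needed = entries * 8     # assuming double-precision entries
stack_limit = 128 * 1024       # Eigen's default EIGEN_STACK_ALLOCATION_LIMIT, in bytes
print(bytes_needed > stack_limit)   # True: 156800 > 131072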

dham commented 4 years ago

OK, so the local tensor is getting too big for Eigen. This is not a huge surprise, as the local tensor size in 3D scales like (p+1)**6, where p is the polynomial degree. Using any matrix-explicit algorithm on high-order elements is very expensive for this reason. @sv2518 is currently working on refactoring the insides of Slate to remove the Eigen dependency and introduce a matrix-free capability for high order.
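
A quick illustration of that scaling (assuming roughly (p+1)**3 degrees of freedom per cell for a scalar element in 3D and 8-byte entries):

for p in range(1, 9):
    ndof = (p + 1)**3        # rough per-cell DoF count for a scalar 3D element
    entries = ndof**2        # size of the dense local tensor
    print(p, entries, entries * 8 // 1024, "KiB")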

I should also note that the polynomial basis for all of our elements on tets becomes very badly conditioned at high order too.

dham commented 4 years ago

Perhaps you might like to start a conversation on slack about what you're trying to achieve and we can see if we can point you at a way to do that in Firedrake.

marybarker commented 4 years ago

Thanks! I was actually just trying to use Slate for the speed enhancement it gives my code, but I can go back to using a full assemble and solve with Firedrake.

dham commented 4 years ago

OK. You should be aware that the high cost in memory and operations of assembling matrices is a universal feature of high order discretisations. It's the reason that matrix-free approaches are often preferred at high order. Firedrake does have matrix-free capabilities so if that might be useful to you, get in contact and we'll try to help.
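
As a rough sketch of what that looks like in solver options for the problem in the original post (illustrative only: the option names follow Firedrake's matrix-free interface, but the pure curl-curl form is singular, so a real run would need a well-posed problem and a sensible preconditioner):

solve(a == L, w,
      bcs=[DirichletBC(V, 0, "on_boundary")],
      solver_parameters={
          "mat_type": "matfree",  # apply the operator without ever assembling a matrix
          "ksp_type": "cg",
          "pc_type": "none",      # no preconditioning here, purely illustrative
      })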

rckirby commented 4 years ago

One possibility for higher-order Nedelec would be to use a Schwarz-type preconditioner. We'd need to generalize the low-order PC (on the eventual to-do list anyway, if I ever find time...), but PatchPC should be up to the task of doing the low-order problems. One could hit the low-order part of the operator with LU, or you could use the hypre AMS hookup that hasn't been committed.

So there are options but...it's a bit complicated.

marybarker commented 4 years ago

Awesome! Thanks. At the moment I'm just working with some toy examples at high order to check convergence rates. I was trying to use Slate instead of the PETSc preconditioners, as it's so much quicker, but it looks like I'll have to take that hit.

rckirby commented 4 years ago

Yes, the built-in PETSc preconditioners all fail badly on H(curl) problems -- they are designed for other kinds of problems. Development will be needed for trying out Schwarz-type preconditioners.

In my paper with Marie & Anders (10 years ago!), we just used direct solvers on these matrices and waited. Or at least Marie did -- she conducted the experiments.

wence- commented 4 years ago

@marybarker If you only need to do single-element things, you can get the equivalent of the Slate approach by asking for:

solve(a == L, u, bcs=bcs, solver_parameters={"ksp_type": "preonly", "pc_type": "lu"})

If you want to do things on a bigger mesh and have a regular geometry (such that you can use a MeshHierarchy), then we can do Arnold-Falk-Winther HCurl multigrid out of the box with PCPatch.
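
A hedged sketch of that setup (the mesh, degree, and added mass term are illustrative, and the PatchPC options below follow the usual vertex-star conventions rather than a tuned recipe):

from firedrake import *

base = UnitCubeMesh(2, 2, 2)
hierarchy = MeshHierarchy(base, 3)   # regularly refined hierarchy, needed for geometric MG
mesh = hierarchy[-1]

degree = 3
V = FunctionSpace(mesh, "N1curl", degree)
u, v = TrialFunction(V), TestFunction(V)
f = Constant((1, 1, 1))
a = inner(curl(u), curl(v))*dx + inner(u, v)*dx   # mass term added so the problem is well posed
L = inner(f, v)*dx

w = Function(V)
solve(a == L, w, bcs=[DirichletBC(V, 0, "on_boundary")],
      solver_parameters={
          "ksp_type": "cg",
          "pc_type": "mg",
          "mg_levels_ksp_type": "chebyshev",
          "mg_levels_pc_type": "python",
          "mg_levels_pc_python_type": "firedrake.PatchPC",
          "mg_levels_patch_pc_patch_construct_type": "star",   # vertex-star patches
          "mg_levels_patch_pc_patch_construct_dim": 0,
          "mg_levels_patch_sub_ksp_type": "preonly",
          "mg_levels_patch_sub_pc_type": "lu",
      })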

rckirby commented 4 years ago

How does the high-order MG work vis-a-vis doing Schwarz and multilevel only on the low-order subspace (besides taking considerably less code and development)? My knowledge of MG details is rather sketchy, but IIRC Bank/Dupont discuss higher-order methods and show you need different scaling in smoothers. But that may just be parameter fiddling.

wence- commented 4 years ago

> How does the high-order MG work vis-a-vis doing Schwarz and multilevel only on the low-order subspace (besides taking considerably less code and development)? My knowledge of MG details is rather sketchy, but IIRC Bank/Dupont discuss higher-order methods and show you need different scaling in smoothers. But that may just be parameter fiddling.

Our experiments (we have not gone to very high order) suggest that the overlapping patches are h- and k-independent when used as smoothers. I think this aligns with the abstract theory.

sv2518 commented 4 years ago

@marybarker Just to give you an update: I have made a bit of progress on this. My Slate compiler now uses loopy rather than Eigen. I have tested your example, and it is possible to assemble your forms at higher order as well. This is still work in progress, however. I will let you know as soon as you can solve your problem matrix-free.

wence- commented 4 years ago

The work of @sv2518 is now merged, so at least the example provided here now runs fine.