CEED / libCEED

CEED Library: Code for Efficient Extensible Discretizations
https://libceed.org
BSD 2-Clause "Simplified" License
197 stars 46 forks source link

MFEM ex1 broken for CPU non-tensor #377

Closed jeremylt closed 4 years ago

jeremylt commented 4 years ago

On @YohannDudouit's branch https://github.com/mfem/mfem/tree/yohann/nontensor, CPU backends appear to be having an issue with applying the Laplaican in MFEM ex1.

image

I have no theories on what the issue could be, as the CPU basis code is only applying a mat-vec in that case.

nbeams commented 4 years ago

Sorry to butt in @jeremylt , but I'd been puzzling over this since @YohannDudouit described the error behavior he was seeing while he was here at ICL last week. I was also getting his non-tensor branch of MFEM set up so we can use it to test non-tensor developments in the MAGMA backend, and I decided to test something.

It looks like the tensor contraction for the gradient in ceed-ref-basis.c is using the component as the outermost index, but the qfunction is expecting only Q points between the components. That would make the dimension the outermost component, instead. It wasn't obvious to me how to easily test this within the current tensor contraction routine without messing anything up for interp, so I just took the factor of dim out of Q/P, put in a loop for the dimension and offset u or v as necessary, plus the gradient matrix for the element, like

     case CEED_EVAL_GRAD: {
      CeedInt P = nnodes, Q = nqpt;
      CeedInt dimstride = nqpt * ncomp * nelem;
      CeedInt gradstride = nqpt * nnodes;
      CeedScalar *grad;
      ierr = CeedBasisGetGrad(basis, &grad); CeedChk(ierr);
      if (tmode == CEED_TRANSPOSE) {
        P = nqpt; Q = nnodes;
        for (int d = 0; d < dim; d++){
          ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
                                         grad + d * gradstride, tmode, add, u + d * dimstride, v);
        }
      }
      else{
        for (int d = 0; d < dim; d++){
          ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
                                       grad + d * gradstride, tmode, add, u, v + d * dimstride);
        }
      }
      CeedChk(ierr);
    }

This theory also might explain why the mass matrix worked, since the gradient involved there is for the mesh/Jacobian and ncomp = dim, while doing diffusion in ex1 we have ncomp=1.

When I made this change and tried mfem ex1 on the "beam-tri" and "escher-p3" meshes (triangles and tets), I get the same answer for "ceed-cpu" and "ceed-cuda:/gpu/cuda/ref" device options.

jeremylt commented 4 years ago

@nbeams, thank you! I think you are correct. Do you want to put together a PR for this? We'd also need a test to catch this. I only have a few small style changes I'd suggest:

     case CEED_EVAL_GRAD: {
      CeedInt P = nnodes, Q = nqpt;
      CeedInt dimstride = nqpt * ncomp * nelem;
      CeedInt gradstride = nqpt * nnodes;
      CeedScalar *grad;
      ierr = CeedBasisGetGrad(basis, &grad); CeedChk(ierr);
      if (tmode == CEED_TRANSPOSE) {
        P = nqpt; Q = nnodes;
        for (CeedInt d = 0; d < dim; d++) {
          ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
                                         grad + d * gradstride, tmode, add,
                                         u + d * dimstride, v); CeedChk(ierr);
        }
      } else {
        for (CeedInt d = 0; d < dim; d++) {
          ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
                                         grad + d * gradstride, tmode, add,
                                         u, v + d * dimstride); CeedChk(ierr);
        }
      }
    }
nbeams commented 4 years ago

Sure, I'll open a PR for this change, and we'll worry about adding a new test later? Or did you want to work on a test for this in the same PR?

jeremylt commented 4 years ago

I think a modification of my proposed test t324 to add multiple components would work. I can check. https://github.com/CEED/libCEED/pull/376

nbeams commented 4 years ago

Oh wait, I've never opened a PR here before. Are there rules/guidelines about which people I should request as reviewers/how many reviewers need to approve? (I went ahead and added the new branch natalie/non-tensor-test with the changes to ceed-ref-basis.c)

jedbrown commented 4 years ago

You can add whoever you think may be interested. At least Jeremy in this case. We only require one approval to merge, but get more for bigger changes.

jeremylt commented 4 years ago

No particular rules for who to request. I usually tag jedbrown, valeriabarra, and YohannDudouit, as the 4 of us are the most active contributors.

jeremylt commented 4 years ago

Fixed in #378