Closed jeremylt closed 4 years ago
Sorry to butt in @jeremylt , but I'd been puzzling over this since @YohannDudouit described the error behavior he was seeing while he was here at ICL last week. I was also getting his non-tensor branch of MFEM set up so we can use it to test non-tensor developments in the MAGMA backend, and I decided to test something.
It looks like the tensor contraction for the gradient in ceed-ref-basis.c is using the component as the outermost index, but the qfunction is expecting only Q points between the components. That would make the dimension the outermost component, instead. It wasn't obvious to me how to easily test this within the current tensor contraction routine without messing anything up for interp, so I just took the factor of dim out of Q/P, put in a loop for the dimension and offset u or v as necessary, plus the gradient matrix for the element, like
case CEED_EVAL_GRAD: {
CeedInt P = nnodes, Q = nqpt;
CeedInt dimstride = nqpt * ncomp * nelem;
CeedInt gradstride = nqpt * nnodes;
CeedScalar *grad;
ierr = CeedBasisGetGrad(basis, &grad); CeedChk(ierr);
if (tmode == CEED_TRANSPOSE) {
P = nqpt; Q = nnodes;
for (int d = 0; d < dim; d++){
ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
grad + d * gradstride, tmode, add, u + d * dimstride, v);
}
}
else{
for (int d = 0; d < dim; d++){
ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
grad + d * gradstride, tmode, add, u, v + d * dimstride);
}
}
CeedChk(ierr);
}
This theory also might explain why the mass matrix worked, since the gradient involved there is for the mesh/Jacobian and ncomp = dim, while doing diffusion in ex1 we have ncomp=1.
When I made this change and tried mfem ex1 on the "beam-tri" and "escher-p3" meshes (triangles and tets), I get the same answer for "ceed-cpu" and "ceed-cuda:/gpu/cuda/ref" device options.
@nbeams, thank you! I think you are correct. Do you want to put together a PR for this? We'd also need a test to catch this. I only have a few small style changes I'd suggest:
case CEED_EVAL_GRAD: {
CeedInt P = nnodes, Q = nqpt;
CeedInt dimstride = nqpt * ncomp * nelem;
CeedInt gradstride = nqpt * nnodes;
CeedScalar *grad;
ierr = CeedBasisGetGrad(basis, &grad); CeedChk(ierr);
if (tmode == CEED_TRANSPOSE) {
P = nqpt; Q = nnodes;
for (CeedInt d = 0; d < dim; d++) {
ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
grad + d * gradstride, tmode, add,
u + d * dimstride, v); CeedChk(ierr);
}
} else {
for (CeedInt d = 0; d < dim; d++) {
ierr = CeedTensorContractApply(contract, ncomp, P, nelem, Q,
grad + d * gradstride, tmode, add,
u, v + d * dimstride); CeedChk(ierr);
}
}
}
Sure, I'll open a PR for this change, and we'll worry about adding a new test later? Or did you want to work on a test for this in the same PR?
I think a modification of my proposed test t324 to add multiple components would work. I can check. https://github.com/CEED/libCEED/pull/376
Oh wait, I've never opened a PR here before. Are there rules/guidelines about which people I should request as reviewers/how many reviewers need to approve? (I went ahead and added the new branch natalie/non-tensor-test with the changes to ceed-ref-basis.c)
You can add whoever you think may be interested. At least Jeremy in this case. We only require one approval to merge, but get more for bigger changes.
No particular rules for who to request. I usually tag jedbrown, valeriabarra, and YohannDudouit, as the 4 of us are the most active contributors.
Fixed in #378
On @YohannDudouit's branch https://github.com/mfem/mfem/tree/yohann/nontensor, CPU backends appear to be having an issue with applying the Laplaican in MFEM ex1.
I have no theories on what the issue could be, as the CPU basis code is only applying a mat-vec in that case.