Closed jorgensd closed 2 months ago
Consider the following MWE:
import ufl import basix.ufl c_el = basix.ufl.element("Lagrange", "triangle", 1, shape=(2, )) mesh = ufl.Mesh(c_el) el = basix.ufl.element("Lagrange", "triangle", 2) V = ufl.FunctionSpace(mesh, el) u, v = ufl.TrialFunction(V), ufl.TestFunction(V) a = ufl.inner(u, v) * ufl.dx forms = [a]
which generates
// ------------------------ // Section: Jacobian // Inputs: coordinate_dofs, FE1_C0_D10_Q39d // Outputs: J_c0 double J_c0 = 0.0; { for (int ic = 0; ic < 3; ++ic) { J_c0 += coordinate_dofs[(ic) * 3] * FE1_C0_D10_Q39d[0][0][0][ic]; } } // ------------------------ // ------------------------ // Section: Jacobian // Inputs: coordinate_dofs, FE1_C1_D01_Q39d // Outputs: J_c3 double J_c3 = 0.0; { for (int ic = 0; ic < 3; ++ic) { J_c3 += coordinate_dofs[(ic) * 3 + 1] * FE1_C1_D01_Q39d[0][0][0][ic]; } } // ------------------------ // ------------------------ // Section: Jacobian // Inputs: coordinate_dofs, FE1_C1_D01_Q39d // Outputs: J_c1 double J_c1 = 0.0; { for (int ic = 0; ic < 3; ++ic) { J_c1 += coordinate_dofs[(ic) * 3] * FE1_C1_D01_Q39d[0][0][0][ic]; } } // ------------------------ // ------------------------ // Section: Jacobian // Inputs: coordinate_dofs, FE1_C0_D10_Q39d // Outputs: J_c2 double J_c2 = 0.0; { for (int ic = 0; ic < 3; ++ic) { J_c2 += coordinate_dofs[(ic) * 3 + 1] * FE1_C0_D10_Q39d[0][0][0][ic]; } }
Some of these loops should probably be merged (even if then get unrolled by the compiler).
I guess the issue is that the scopes are not optimized for constant prepart sections.
prepart
Consider the following MWE:
which generates
Some of these loops should probably be merged (even if then get unrolled by the compiler).