CeedOperatorLinearAssembleQFunction[PointBlock]Diagonal

OK, it turns out that for diagonal assembly we only use the point block diagonal of the assembled QFunction, so we should add CeedOperatorLinearAssembleQFunctionPointBlockDiagonal.

Need to investigate a bit, but a point block diagonal assembly by component should be sufficient and would cut memory usage for our big projections (the Ratel diagnostic), leaving QF coupling aside for now. It would still need to be swapped for the full assembled QFunction when doing full operator assembly, such as for multigrid, so it might not actually save us much in practice.
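To get a feel for the possible saving, here is a rough back-of-the-envelope sketch. It assumes the full assembled QFunction stores one value per (input, output) pair at every quadrature point, and that a by-component variant would keep only the matching input/output pairs. That layout is an assumption for illustration, not a statement of how libCEED stores things. The function names are hypothetical and are not libCEED API, and the problem sizes are made up.

```python
def full_qfunction_entries(num_qpts, num_active_in, num_active_out):
    """Values stored if every input/output pair is kept at each quadrature point."""
    return num_qpts * num_active_in * num_active_out


def diagonal_qfunction_entries(num_qpts, num_active):
    """Values stored if only matching input/output pairs are kept at each point."""
    return num_qpts * num_active


if __name__ == "__main__":
    num_qpts = 1_000_000   # hypothetical total number of quadrature points
    num_comp = 15          # hypothetical number of projected components
    bytes_per_value = 8    # assuming double precision

    full = full_qfunction_entries(num_qpts, num_comp, num_comp)
    diag = diagonal_qfunction_entries(num_qpts, num_comp)

    print(f"full assembled QFunction: {full * bytes_per_value / 1e9:.2f} GB")
    print(f"by-component diagonal:    {diag * bytes_per_value / 1e9:.2f} GB")
    print(f"reduction factor:         {full // diag}x")
```

Under these assumptions the saving scales with the number of components, which is why it would matter most for projections with many components. It disappears as soon as the full assembled QFunction is also needed.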

CEED / libCEED

CeedOperatorLinearAssembleQFunction[PointBlock]Diagonal #1671