Closed kynan closed 12 years ago
Here's the proposal for fixing it:
The partitioner in the generate-transforms branch doesn't produce the correct output because it was never correctly designed in the first place - a heuristic that simply worked for all the expressions we had thrown at it so far was in use. This is a proposal for a partitioner that actually works.
The old partitioner works by traversing the AST from the top down. When a Sum node is encountered, it checks to see if both operands have the same set of indices used by their IndexSum nodes. If they do have the same indices, it considers the entire sum to be one partition, since the entire expression fits inside exactly one loop nest. This behaviour needs to be generalised so that Products are treated in this way as well - for an example of why this is required, consider the multiplication by detJ in the expression tree that takes place in the generate-transforms branch. In this example, a Product node is at the top of the AST with one operand which will correspond to what the user wrote the form as, which potentially has multiple indices, and the other operand is detJ, which has no indices.
This change to the partitioner will cause it to return a list of individual sub-expressions of the whole integrand that will all fit inside exactly one loop. We need to be able to produce an expression that combines all of the results of these individual expressions for summing into the local tensor correctly. This requires keeping track of where the subexpressions are in the AST, and providing a way to uniquely identify them. This can be done by creating a new AST node, called SubTree, that inherits from Expression and Counted, so that it can be fitted into a UFL AST, and each SubTree will have a unique ID. When the partitioner searches for partitions in an AST, it will insert SubTree nodes at each point where a partition is created.
In order to actually generate the correct code to implement the expression, we proceed as follows:
Argh, I think I just found an issue with this - user-defined classes that derive from Expr don't work with Transformers, because they don't get added to all_ufl_classes when ufl.init is called.
This was fixed in the generate-transforms-experimental branch before it was merged (dfc0aa469f). The Transformer issue was overcome by creating our own Transformer that is a subclass of the UFL Transformer.
In its current form, the
Partitioner
only acts on sums if they're at the root of the expression tree and splits the expression into partitions. When multiplying the entire form expression byDeterminant(J)
, the root of the tree is always a product and hence thePartitioner
will always only find a single partition.When
detJ
was evaluated as a common subexpression before the loop, thePartitioner
could sensibly act on the remaining expression and multiplying the common subexpressions to each partition.