The CDE sub-pass detects lowered index derivatives and can operate across Clusters (using a clusters.Queue), whereas CSE is applied to individual Clusters.
We should enhance CSE to be as smart as CDE, then tweak things so that we can get rid of CDE and use CSE everywhere instead.
Why is this needed? Consider:
// Cluster 0
B = ... A ...
// Cluster 1
for i = 0 to N
  r += ... B ...
// Cluster 2
C = ... r ... A ...
First, observe that the data dependencies here prevent any sort of topological reordering: Cluster 1 reads B, which is written in Cluster 0, and Cluster 2 reads r, which is written in Cluster 1. Then, note that A appears twice, in both Cluster 0 and Cluster 2. We want to improve CSE so that it can catch A across these Clusters.
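To make the goal concrete, here is a minimal sketch of what "catching A across Clusters" could look like. It is a toy model, not the actual compiler code: Clusters are plain lists of (lhs, rhs) pairs, expressions are nested tuples, and the names `subexprs` and `cross_cluster_candidates` are hypothetical. It also ignores the legality check a real pass would need, namely that nothing A depends on is written between the two uses.

```python
from collections import defaultdict

# Toy model (assumption): a Cluster is a list of (lhs, rhs) pairs and an
# expression is a nested tuple ("op", arg0, arg1, ...) with string leaves.


def subexprs(expr):
    """Yield every compound (non-leaf) sub-expression of expr, itself included."""
    if isinstance(expr, tuple):
        yield expr
        for arg in expr[1:]:
            yield from subexprs(arg)


def cross_cluster_candidates(clusters):
    """Return the compound sub-expressions that occur in more than one Cluster,
    mapped to the sorted indices of the Clusters they occur in."""
    seen = defaultdict(set)
    for idx, cluster in enumerate(clusters):
        for _lhs, rhs in cluster:
            for sub in subexprs(rhs):
                seen[sub].add(idx)
    return {sub: sorted(ids) for sub, ids in seen.items() if len(ids) > 1}


# Stand-in for the redundant expression A; its contents are made up.
A = ("mul", "u", ("add", "v", "w"))

clusters = [
    [("B", ("add", A, "c0"))],   # Cluster 0: B = ... A ...
    [("r", ("add", "r", "B"))],  # Cluster 1: r += ... B ...
    [("C", ("mul", "r", A))],    # Cluster 2: C = ... r ... A ...
]

shared = cross_cluster_candidates(clusters)
for sub, ids in shared.items():
    print(ids, sub)
```

A per-Cluster CSE would find nothing here, since A occurs only once inside each Cluster. Scanning the whole sequence of Clusters reports A (and its inner sub-expression) as shared between Cluster 0 and Cluster 2.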