passlab / rexompiler

REX OpenMP Compiler
https://passlab.github.io/rexompiler/

Preorder and postorder traversal for OpenMP transformation #118

Open ouankou opened 2 years ago

ouankou commented 2 years ago

The current OpenMP transformation in REX uses a post-order traversal. In the following example, REX lowers simd first, then for, and finally parallel.

#pragma omp parallel
#pragma omp for
#pragma omp simd
  for (int i = 0; i < 10; i++)
    ...

However, in this case, for must be lowered before simd: the SIMD transformation should be applied only after worksharing and collapsing have been handled and the loop has been divided among the threads. Thus, these two directives need a pre-order traversal.

This is not always the case. The outliner in REX requires that the code region being outlined contain no OpenMP directives. Therefore, we have to lower for and simd first, and then parallel, which is still a post-order traversal.

So far, only simd should be traversed in pre-order.
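To make the two orders concrete, here is a minimal sketch that walks the parallel / for / simd nest from the example both ways. `Directive`, `postOrder`, `preOrder`, and `makeNest` are hypothetical names used only for illustration; they are not REX or ROSE classes, and the sketch records visiting order rather than performing any lowering.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an OpenMP directive node (not a REX/ROSE class).
struct Directive {
    std::string name;
    std::vector<std::unique_ptr<Directive>> children;
};

// Post-order: nested directives are visited before the enclosing one.
void postOrder(const Directive& d, std::vector<std::string>& out) {
    for (const auto& c : d.children) postOrder(*c, out);
    out.push_back(d.name);
}

// Pre-order: the enclosing directive is visited before the nested ones.
void preOrder(const Directive& d, std::vector<std::string>& out) {
    out.push_back(d.name);
    for (const auto& c : d.children) preOrder(*c, out);
}

// Builds the nest parallel -> for -> simd from the example above.
std::unique_ptr<Directive> makeNest() {
    auto simd = std::make_unique<Directive>();
    simd->name = "simd";
    auto loop = std::make_unique<Directive>();
    loop->name = "for";
    loop->children.push_back(std::move(simd));
    auto par = std::make_unique<Directive>();
    par->name = "parallel";
    par->children.push_back(std::move(loop));
    assert(par->children.size() == 1);  // parallel encloses only the for directive
    return par;
}
```

On this nest, `postOrder` yields simd, for, parallel (the current REX order), while `preOrder` yields parallel, for, simd.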

Solution:

upir_parent and upir_children have been implemented for UPIR statements. We still use the original post-order traversal since it is better at avoiding iterator invalidation. If simd's upir_parent is for, REX skips the simd transformation, and for handles the simd directive in its own transformation.
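The skip rule described above could look roughly like the following. This is a sketch under assumptions, not the REX implementation: `Node`, `link`, and `lowerPostOrder` are hypothetical, only the field names `upir_parent` and `upir_children` come from the text, and "lowering" is reduced to appending a label to a log.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type; only the two field names are taken from the text.
struct Node {
    std::string kind;
    Node* upir_parent = nullptr;
    std::vector<Node*> upir_children;
};

// Attaches child under parent and records the back link.
void link(Node& parent, Node& child) {
    assert(child.upir_parent == nullptr);  // each node has at most one parent
    child.upir_parent = &parent;
    parent.upir_children.push_back(&child);
}

// Post-order walk: children first, then the node itself.
void lowerPostOrder(const Node& n, std::vector<std::string>& log) {
    for (const Node* c : n.upir_children) lowerPostOrder(*c, log);

    // A simd directly under a for is skipped; the for handles it later.
    if (n.kind == "simd" && n.upir_parent != nullptr &&
        n.upir_parent->kind == "for") {
        return;
    }

    if (n.kind == "for") {
        bool hasSimd = false;
        for (const Node* c : n.upir_children) {
            if (c->kind == "simd") hasSimd = true;
        }
        // The for lowering takes care of the nested simd itself.
        log.push_back(hasSimd ? "for+simd" : "for");
        return;
    }

    log.push_back(n.kind);
}
```

For the parallel / for / simd nest, the log contains "for+simd" followed by "parallel"; a simd that is not nested in a for is still lowered on its own.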

We decided to use a pre-order traversal instead of the original post-order traversal because it is natural for the transformation to follow the execution order. The top-down approach avoids the simd transformation issue mentioned above. However, outlining will run into some issues during a pre-order traversal; to resolve them, we may need to modify the outliner in REX. With the pre-order traversal, the implementation always lowers the outermost directives and leaves the rest as-is, then repeats this process until all directives are processed. In the example above, there will be three passes of OpenMP transformation. Pass 1 outlines the parallel construct, and the enclosed code region, including the for and simd directives, is moved to a new function in a new source file. Pass 2 lowers the for construct but leaves the simd directive unchanged. Finally, pass 3 updates the loop by adding the required SIMD operations and completes the whole transformation.
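The repeated "lower the outermost directives, leave the rest" loop can be sketched as follows. All names here (`Dir`, `lowerOutermost`, `runPasses`, `makeExample`) are hypothetical, and lowering is modeled as setting a flag; the real pass 1 would also move the outlined region into a new function, which this sketch does not attempt.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical directive node; "lowered" marks directives already processed.
struct Dir {
    std::string name;
    bool lowered = false;
    std::vector<Dir> nested;
};

// Lowers the outermost directives that are still unlowered and records
// their names in `pass`. Returns true if at least one directive was lowered.
bool lowerOutermost(Dir& d, std::vector<std::string>& pass) {
    if (!d.lowered) {
        d.lowered = true;
        pass.push_back(d.name);
        return true;  // do not descend: nested directives stay as-is this pass
    }
    bool changed = false;
    for (Dir& c : d.nested) {
        if (lowerOutermost(c, pass)) changed = true;
    }
    return changed;
}

// Repeats the pass until no unlowered directive remains.
std::vector<std::vector<std::string>> runPasses(Dir& root) {
    std::vector<std::vector<std::string>> passes;
    while (true) {
        std::vector<std::string> pass;
        if (!lowerOutermost(root, pass)) break;
        assert(!pass.empty());  // a productive pass lowers at least one directive
        passes.push_back(pass);
    }
    return passes;
}

// Builds the nest parallel -> for -> simd from the example above.
Dir makeExample() {
    Dir par{"parallel"};
    par.nested.push_back(Dir{"for"});
    par.nested[0].nested.push_back(Dir{"simd"});
    return par;
}
```

Running `runPasses` on `makeExample()` produces three passes that lower parallel, for, and simd in that order, matching the three passes described above.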

ouankou commented 1 year ago

We decided to remove UPIR from the development branch. However, the OpenMP nodes still need a base class similar to SgUpirBaseStatement. This base class maintains the relationship between OpenMP nodes through omp_parent and omp_children, which lets the transformation get the context information quickly. Otherwise, we have to call the ROSE API get_parent() over and over and check the parent node type manually.

For now, I have only implemented a base class, SgOmpExecStatement, for all OpenMP executable directives. All of these directives inherit from SgStatement, so we can insert a layer between SgStatement and the actual directive classes. OpenMP declarative directives, however, inherit from SgDeclarationStatement, which is a subclass of SgStatement, so we can't add them to this base class: otherwise, every SgDeclarationStatement would belong to the OpenMP base class. A solution could be to implement a similar base class, SgOmpDeclStatement, using the same interface.
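As a rough illustration of the "same interface" idea, the sketch below puts the parent/child links in one shared class and mixes it into both an executable and a declarative base. Every class name here is a hypothetical stand-in (`Statement` and `DeclarationStatement` are not the ROSE classes, and ROSE node classes are not written by hand like this), so it shows only the shape of the hierarchy, not the code in the commit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the generic statement classes.
struct Statement { virtual ~Statement() = default; };
struct DeclarationStatement : Statement {};

// Shared interface holding the OpenMP parent/child links.
class OmpNode {
public:
    virtual ~OmpNode() = default;
    virtual std::string directive() const = 0;

    OmpNode* get_omp_parent() const { return omp_parent_; }
    const std::vector<OmpNode*>& get_omp_children() const { return omp_children_; }

    void add_omp_child(OmpNode* child) {
        assert(child != nullptr && child->omp_parent_ == nullptr);
        child->omp_parent_ = this;
        omp_children_.push_back(child);
    }

private:
    OmpNode* omp_parent_ = nullptr;
    std::vector<OmpNode*> omp_children_;
};

// Layer between the statement class and executable directives.
struct OmpExecStatement : Statement, OmpNode {};
// Separate layer for declarative directives with the same interface.
struct OmpDeclStatement : DeclarationStatement, OmpNode {};

struct OmpForStatement : OmpExecStatement {
    std::string directive() const override { return "for"; }
};
struct OmpSimdStatement : OmpExecStatement {
    std::string directive() const override { return "simd"; }
};

// Context query that needs no repeated walk over generic parent nodes.
bool isNestedInFor(const OmpNode& n) {
    const OmpNode* p = n.get_omp_parent();
    return p != nullptr && p->directive() == "for";
}
```

With this layout, a transformation can ask `isNestedInFor(simd)` directly, and ordinary declaration statements stay outside the OpenMP hierarchy because only `OmpDeclStatement` mixes in the shared interface.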

The implementation of SgOmpExecStatement is in commit https://github.com/passlab/rexompiler/commit/429a65fcf8d589e19348eda65b9337430dfd61b6.