stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
104 stars 27 forks source link

(closes #2684) Fix the failure case when we have a WHERE construct with no array index notation. #2687

Closed LonelyCat124 closed 2 days ago

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.86%. Comparing base (70f9793) to head (9340fd6). Report is 35 commits behind head on master.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #2687 +/- ## ========================================== - Coverage 99.86% 99.86% -0.01% ========================================== Files 354 354 Lines 49082 49046 -36 ========================================== - Hits 49018 48982 -36 Misses 64 64 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

LonelyCat124 commented 1 month ago

@sergisiso ready for a first look, I'm not super sure on the implementation details so its possible there's some things we're not testing for that i've missed.

LonelyCat124 commented 1 month ago

I think this is ready for another look @sergisiso . The outstanding thing I've not addressed is intrinsics (which were already "forbidden" in fparser2 before this change) - I'm not sure if/when it makes sense for them to exist.

sergisiso commented 1 month ago

@LonelyCat124 See the responses above, it is a bit difficult to pinpoint exactly where the issues come from, but I provided examples of where statements that psyclone produces invalid fortran when parsing them.

sergisiso commented 1 month ago

Note that the problems that are causing issues, and the frcv(jpr_ievp)%z3(:,:,1) / picefr(:,:) style restriction that @arporter is mentioning in the other PR are already solved (or validation-refused) in the ArrayAssignment2LoopsTrans. WHERE's are just masked array assignments, so I am wondering if we could lean on that transformation to create a simpler implementation here that reuses that logic. It can not be applied directly but I was thinking we could convert:

WHERE(expr1 LOGICAL_OP expr2)
  expr3 = expr4

to:

expr1 = expr2 BINARY_OP expr3 BINARY_OP expr4 (... more BINARY_OP exprN if there are more statements)

This will create a

do bounds_expr1
   expr1 = expr2 BINARY_OP expr3 BINARY_OP expr4
end do

which then we can convert back to:

do bounds_expr1
   if expr1 LOGICAL_OP expr2 then
      expr3 = expr4
end do

I am aware that this is a totally different implementation, so maybe this PR should just identify and codeblock the issues for now. But I think this is a path to a possible solution.

LonelyCat124 commented 3 weeks ago

TODO:

LonelyCat124 commented 2 weeks ago

I think this is ready for another look - if there are other things we should test for safety let me know - or if I've still missed some cases I didn't understand.

LonelyCat124 commented 1 week ago

@sergisiso I reviewed how we handle intrinsics, and now I think it mirrors the original and correct behaviour. Arrays inside intrinsic are only expanded if they are elemental, inquiry ones we ignore (since we don't care) and others are acceptable (e.g. DOT_PRODUCT(a, b)) without being touched. The only disallowed ones are the ones that were previously disallowed (reductions with dimension arguments).

sergisiso commented 1 week ago

@LonelyCat124 wrt the failing integration test, as we discussed last week, we need to do the following change:

         # For performance in lib_fortran, mark serial routines as GPU-enabled
         if psyir.name == "lib_fortran.f90":
-            if not subroutine.walk(Loop):
-                try:
-                    # We need the 'force' option.
-                    # SIGN_ARRAY_1D has a CodeBlock because of a WHERE without
-                    # array notation. (TODO #717)
-                    OMPDeclareTargetTrans().apply(subroutine,
-                                                  options={"force": True})
-                except TransformationError as err:
-                    print(err)
+            if subroutine.name.lower().startswith("sign_"):
+                OMPDeclareTargetTrans().apply(subroutine)

In omp_gpu_trans.py, acc_loops_trans.py, and acc_kernels_trans.py. I checked that it works and doesn't degrade the performance. The changes are:

LonelyCat124 commented 6 days ago

I've set the integration tests running again now with those changes.