stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License

Support recursive application of kernel transformations. #342

Open arporter opened 5 years ago

arporter commented 5 years ago

When transforming kernels for e.g. OpenACC, we currently only transform a single kernel at a time. However, a kernel may call other routines (accessed via module use statements) and, in the current implementation, these other routines are not transformed (and thus are not marked up for OpenACC compilation). In this issue we will identify such routines and ensure that they are also transformed as necessary.

arporter commented 11 months ago

I'm hitting this in the LFRic GPU work now. As a basic first problem, ACCRoutineTrans.validate is failing to spot situations where a kernel calls other routines.

              # Procedures called in a compute region must have acc routine
              # information - interpolate_to_regular_grid
              # (polyv_wtheta_koren_kernel_0_mod.f90: 102)
              "wt_advective_update_alg_mod_psy",

              # Mod variables in acc routine
              "galerkin_projection_algorithm_mod_psy",
              # Mod variables in acc routine
              "physics_mappings_alg_mod_psy",

              # Module variables used in acc routine need to be in acc
              # declare create() - chi2llr$sd
              # Procedures called in a compute region must have acc
              # routine information - coordinate_jacobian_evaluator_r_double
              "physical_op_constants_mod_psy",
              # Procedures called in a compute region must have acc
              # routine information - interpolate_to_regular_grid
              "reconstruct_w3_field_alg_mod_psy",

arporter commented 11 months ago

NVFORTRAN-W-1054-Module variables used in acc routine need to be in acc declare create() - geometry (compute_geopotential_kernel_0_mod.f90: 59)
compute_geopotential_0_code:
 75, FMA (fused multiply-add) instruction(s) generated
 76, FMA (fused multiply-add) instruction(s) generated
 77, FMA (fused multiply-add) instruction(s) generated
 79, Accelerator restriction: call to 'xyz2llr' with no acc routine information

arporter commented 11 months ago

if (coord_system == coord_system_xyz) then
  do k = 0, nlayers-1
    do dfc = 1, ndf_chi
      chi_1_e(dfc) = chi_1( map_chi(dfc) + k)
      chi_2_e(dfc) = chi_2( map_chi(dfc) + k)
      chi_3_e(dfc) = chi_3( map_chi(dfc) + k)
    end do

    do df = 1, ndf_w3
      coord(:) = 0.0_r_def
      do dfc = 1, ndf_chi
        coord(1) = coord(1) + chi_1_e(dfc)*chi_basis(1,dfc,df)
        coord(2) = coord(2) + chi_2_e(dfc)*chi_basis(1,dfc,df)
        coord(3) = coord(3) + chi_3_e(dfc)*chi_basis(1,dfc,df)
      end do
      call xyz2llr(coord(1), coord(2), coord(3), lon, lat, radius)

Here, xyz2llr is brought into scope by "use coord_transform_mod, only : xyz2llr" at the module level rather than inside the kernel subroutine.

arporter commented 11 months ago

Currently, validate only checks the symbol table of the kernel subroutine and thus misses the import.
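To illustrate the difference between checking only the kernel subroutine's symbol table and also searching the enclosing module scope, here is a minimal, self-contained sketch. It does not use the PSyclone API: the Scope class and its lookup method are hypothetical stand-ins, and only the module, kernel and routine names are taken from the compiler output above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """Toy stand-in (hypothetical) for a symbol table on a routine or module."""
    name: str
    # Maps a symbol name to the module it is imported from (None if local).
    symbols: dict = field(default_factory=dict)
    parent: Optional["Scope"] = None

    def lookup(self, symbol, search_parents=True):
        """Return the scope that declares `symbol`, or None if not found."""
        scope = self
        while scope is not None:
            if symbol in scope.symbols:
                return scope
            # Stop after the first scope unless asked to walk outwards.
            scope = scope.parent if search_parents else None
        return None


# The import of xyz2llr lives at module level, not in the kernel subroutine.
module = Scope("compute_geopotential_kernel_0_mod",
               {"xyz2llr": "coord_transform_mod"})
kernel = Scope("compute_geopotential_0_code", {"coord": None}, parent=module)

# A check restricted to the kernel's own table misses the import ...
local_only = kernel.lookup("xyz2llr", search_parents=False)
# ... whereas walking up to the module scope finds it.
with_parents = kernel.lookup("xyz2llr")
```

In this toy model, local_only is None while with_parents is the module scope, which is the situation described in the comment above.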

arporter commented 10 months ago

Validation is now working OK and we can compile and run Gung Ho :-) However, we then of course have a lot of white space because of the kernels we can't transform because of this issue. This issue itself has two different strands: the first where we know that another Routine is being called from a Kernel and the second where we have an imported Symbol that then turns out to be a Function call. We can't currently deal with the latter because of #2416.

arporter commented 10 months ago

As is so often the way, it turns out that, in practice, we can't currently deal with the first strand either because the important cases all correspond to polymorphic routines - i.e. the routines are implemented for different precisions and are put behind an interface.
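A rough sketch of why generic interfaces complicate matters: a call names the interface, but the routines that need transforming are the specific implementations behind it. The code below is plain Python rather than PSyclone code. The mapping is hypothetical data, and the "_r_single" routine name is an assumption made by analogy with the "coordinate_jacobian_evaluator_r_double" name reported by the compiler.

```python
# Hypothetical table: generic interface name -> specific routines behind it.
# The "_r_single" entry is assumed for illustration only.
INTERFACES = {
    "coordinate_jacobian_evaluator": [
        "coordinate_jacobian_evaluator_r_single",
        "coordinate_jacobian_evaluator_r_double",
    ],
}


def routines_to_transform(called_name, interfaces):
    """Return every concrete routine that a call to `called_name` may reach.

    If the name is a generic interface, all of its specific routines are
    returned; otherwise the name is assumed to be a routine in its own right.
    """
    return list(interfaces.get(called_name, [called_name]))


generic_targets = routines_to_transform("coordinate_jacobian_evaluator",
                                        INTERFACES)
plain_targets = routines_to_transform("xyz2llr", INTERFACES)
```

Without knowledge of the argument precisions at the call site, the conservative choice in this sketch is to treat every specific routine behind the interface as a candidate for transformation.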

arporter commented 9 months ago

I'm implementing basic support for a new 'GenericInterfaceSymbol' in #2422 and want to check that what I've done will actually be useful in practice.

We apply ACCRoutineTrans to every kernel that appears in an invoke to which we are adding OpenACC directives. This results in a modified (and renamed) version of the kernel being written to file. The LFRic build system then discovers this kernel, builds it and links it in. The next step is to identify Calls and, rather than raise an Exception, transform the target routines in exactly the same way.
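The "identify Calls and transform the target routines" step amounts to a traversal of the call graph starting from the kernel. The following is a minimal sketch under stated assumptions, not PSyclone code: the call graph is a hand-written dictionary, the "helper" routine is hypothetical, and "transforming" a routine is reduced to listing it once.

```python
from collections import deque


def reachable_routines(kernel, calls):
    """Return the kernel and every routine reachable from it.

    Routines are listed in breadth-first order and each appears only once,
    so that mutually recursive routines do not cause an infinite loop.
    """
    order = []
    seen = {kernel}
    queue = deque([kernel])
    while queue:
        name = queue.popleft()
        order.append(name)
        for callee in calls.get(name, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return order


# Hypothetical call graph; only the first entry reflects the thread above.
CALLS = {
    "compute_geopotential_0_code": ["xyz2llr"],
    "xyz2llr": ["helper"],   # assumed for illustration
    "helper": ["xyz2llr"],   # assumed cycle, to show it is handled
}

to_transform = reachable_routines("compute_geopotential_0_code", CALLS)
```

Each name in to_transform would then be passed through the same transformation as the kernel itself. A worklist is used rather than literal recursion so that deep or cyclic call chains are handled uniformly.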

arporter commented 9 months ago

I've realised there's a problem with my proposed solution above: we end up transforming source files other than the immediate PSy-layer and associated Kernel source, and we currently have no way of managing that cleanly. For example, we would have to create a new, renamed source file containing a renamed module, and the Kernel source would have to be updated to use that new module.
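To make the bookkeeping concrete, here is a deliberately naive, text-based sketch of the two updates involved: renaming the module in its own source and updating the use statement in the Kernel source. It is not how PSyclone performs renaming (that operates on the PSyIR, not on text). The rename_module helper and the "_0" suffix are assumptions made for illustration.

```python
import re


def rename_module(source, old, new):
    """Rename module `old` to `new` in Fortran source text.

    Handles 'module old', 'end module old' and 'use old' (case-insensitive).
    This is a toy: it ignores comments, strings and continuation lines.
    """
    pattern = re.compile(
        r"\b((?:end\s+)?module|use)(\s+)" + re.escape(old) + r"\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: m.group(1) + m.group(2) + new, source)


callee_src = (
    "module coord_transform_mod\n"
    "contains\n"
    "end module coord_transform_mod\n"
)
kernel_src = "use coord_transform_mod, only : xyz2llr\n"

# Hypothetical new name for the transformed copy of the module.
new_name = "coord_transform_mod_0"
new_callee_src = rename_module(callee_src, "coord_transform_mod", new_name)
new_kernel_src = rename_module(kernel_src, "coord_transform_mod", new_name)
```

Both the new module source and the updated Kernel source then have to be written out under new file names so that the build system picks up the transformed versions instead of the originals.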

arporter commented 9 months ago

CodedKern has the rename_and_write() method which does this renaming via a private _rename_psyir() method. This is already fairly general but won't cope with routines accessed via generic interfaces. Also, it is obviously currently part of CodedKern and therefore isn't available in language-level PSyIR which is what a Kernel body (and any called routines) use. This problem is already the subject of #1013.

arporter commented 9 months ago

We can't (yet) use KernelModuleInlineTrans as this doesn't work for language-level PSyIR (this is the subject of #924 and #2413).