Open arporter opened 5 years ago
I'm hitting this in the LFRic GPU work now. As a basic first problem, ACCRoutineTrans.validate
is failing to spot situations where a kernel calls other routines.
# Procedures called in a compute region must have acc routine
# information - interpolate_to_regular_grid
# (polyv_wtheta_koren_kernel_0_mod.f90: 102)
"wt_advective_update_alg_mod_psy",
# Mod variables in acc routine
"galerkin_projection_algorithm_mod_psy",
# Mod variables in acc routine
"physics_mappings_alg_mod_psy",
# Module variables used in acc routine need to be in acc
# declare create() - chi2llr$sd
# Procedures called in a compute region must have acc
# routine information - coordinate_jacobian_evaluator_r_double
"physical_op_constants_mod_psy",
# Procedures called in a compute region must have acc
# routine information - interpolate_to_regular_grid
"reconstruct_w3_field_alg_mod_psy",
NVFORTRAN-W-1054-Module variables used in acc routine need to be in acc declare create() - geometry (compute_geopotential_kernel_0_mod.f90: 59)
compute_geopotential_0_code:
75, FMA (fused multiply-add) instruction(s) generated
76, FMA (fused multiply-add) instruction(s) generated
77, FMA (fused multiply-add) instruction(s) generated
79, Accelerator restriction: call to 'xyz2llr' with no acc routine information
if (coord_system == coord_system_xyz) then
  do k = 0, nlayers-1
    do dfc = 1, ndf_chi
      chi_1_e(dfc) = chi_1( map_chi(dfc) + k)
      chi_2_e(dfc) = chi_2( map_chi(dfc) + k)
      chi_3_e(dfc) = chi_3( map_chi(dfc) + k)
    end do
    do df = 1, ndf_w3
      coord(:) = 0.0_r_def
      do dfc = 1, ndf_chi
        coord(1) = coord(1) + chi_1_e(dfc)*chi_basis(1,dfc,df)
        coord(2) = coord(2) + chi_2_e(dfc)*chi_basis(1,dfc,df)
        coord(3) = coord(3) + chi_3_e(dfc)*chi_basis(1,dfc,df)
      end do
      call xyz2llr(coord(1), coord(2), coord(3), lon, lat, radius)
with use coord_transform_mod, only : xyz2llr at the module level. Currently, validate only checks the symbol table of the kernel subroutine and thus misses the import.
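To make the failure mode concrete, here is a minimal sketch of the difference between looking only in the subroutine's own symbol table and walking up through the enclosing scopes. The Scope class and its lookup method are toy stand-ins invented for illustration; they are not the PSyclone symbol-table API. Only the module, subroutine and routine names are taken from the compiler output above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """Toy stand-in for a symbol table (hypothetical, not a PSyclone class)."""
    name: str
    symbols: dict = field(default_factory=dict)  # symbol name -> origin
    parent: Optional["Scope"] = None

    def lookup(self, name, local_only=False):
        """Return (scope name, origin) for `name`, or None if not found."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.name, scope.symbols[name]
            if local_only:
                # Stop after the first (innermost) scope.
                return None
            scope = scope.parent
        return None


# The import lives at module level, not in the kernel subroutine.
module = Scope("compute_geopotential_kernel_0_mod",
               {"xyz2llr": "imported from coord_transform_mod"})
kernel = Scope("compute_geopotential_0_code",
               {"coord": "local", "lon": "local", "lat": "local"},
               parent=module)

local_hit = kernel.lookup("xyz2llr", local_only=True)  # misses the import
full_hit = kernel.lookup("xyz2llr")                    # finds it in the module
print(local_hit)
print(full_hit)
```

A check that behaves like the local_only lookup cannot tell that xyz2llr is an imported routine, which is the situation described above.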
Validation is now working OK and we can compile and run Gung Ho :-) However, we then of course have a lot of white space, caused by the kernels that this issue prevents us from transforming.
This issue itself has two different strands: the first where we know that another Routine is being called from a Kernel and the second where we have an imported Symbol that then turns out to be a Function call. We can't currently deal with the latter because of #2416.
As is so often the way, it turns out that, in practice, we can't currently deal with the first strand either because the important cases all correspond to polymorphic routines - i.e. the routines are implemented for different precisions and are put behind an interface.
I'm implementing basic support for a new 'GenericInterfaceSymbol' in #2422 and want to check that what I've done will actually be useful in practice.
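As a rough illustration of why generic interfaces matter here: a call to an interface name has to be expanded into the specific routines behind it before any of them can be transformed. The sketch below is not how GenericInterfaceSymbol is implemented in #2422; the GenericInterface class, the routines_to_transform function, the interface name coordinate_jacobian and the r_single variant are all assumptions made for the example. Only coordinate_jacobian_evaluator_r_double and xyz2llr appear in the compiler output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericInterface:
    """Toy model of a Fortran generic interface (hypothetical class)."""
    name: str
    specifics: tuple  # names of the routines behind the interface


def routines_to_transform(call_target, interfaces):
    """Expand a call target into the concrete routines it may resolve to."""
    iface = interfaces.get(call_target)
    if iface is None:
        # Not an interface: the target is itself a concrete routine.
        return [call_target]
    return list(iface.specifics)


# Assumed interface name and r_single variant, for illustration only.
interfaces = {
    "coordinate_jacobian": GenericInterface(
        "coordinate_jacobian",
        ("coordinate_jacobian_evaluator_r_single",
         "coordinate_jacobian_evaluator_r_double")),
}

resolved = routines_to_transform("coordinate_jacobian", interfaces)
plain = routines_to_transform("xyz2llr", interfaces)
print(resolved)
print(plain)
```

Transforming every specific routine is the conservative choice in this sketch; selecting only the one matching the argument precision would need type information that the toy model does not carry.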
We apply ACCRoutineTrans to every kernel that appears in an invoke to which we are adding OpenACC directives. This results in a modified (and renamed) version of the kernel being written to file. The LFRic build system then discovers this kernel, builds it and links it in. The next step is to identify Calls and, rather than raise an Exception, transform the target routines in exactly the same way.
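The "identify Calls and transform the targets in the same way" step amounts to a traversal of the call graph starting at the kernel. A minimal sketch, assuming the call information has already been extracted into a plain dictionary (the function name, the dictionary and hypothetical_helper are made up for the example; no PSyclone API is used):

```python
from collections import deque


def routines_needing_acc_routine(kernel, calls):
    """Return the kernel plus every routine reachable from it via calls.

    `calls` maps a routine name to the names of the routines it calls.
    """
    seen = [kernel]
    queue = deque([kernel])
    while queue:
        current = queue.popleft()
        for callee in calls.get(current, []):
            if callee not in seen:  # also guards against call cycles
                seen.append(callee)
                queue.append(callee)
    return seen


# Only the first entry reflects the compiler output above; the rest is
# invented to exercise nesting and a cycle.
calls = {
    "compute_geopotential_0_code": ["xyz2llr"],
    "xyz2llr": ["hypothetical_helper"],
    "hypothetical_helper": ["xyz2llr"],
}

todo = routines_needing_acc_routine("compute_geopotential_0_code", calls)
print(todo)
```

Each routine in the returned list would then be given the same treatment as the kernel itself, instead of validate raising an exception at the first call it finds.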
I've realised there's a problem with my proposed solution above - we end up transforming source files other than the immediate PSy-layer and associated Kernel source. We currently have no way of managing that cleanly. For example, we would have to create a new, renamed source file containing a renamed module, and the Kernel source would have to be updated to use that new module.
CodedKern has the rename_and_write() method which does this renaming via a private _rename_psyir() method. This is already fairly general but won't cope with routines accessed via generic interfaces. Also, it is obviously currently part of CodedKern and therefore isn't available in language-level PSyIR which is what a Kernel body (and any called routines) use.
This problem is already the subject of #1013.
We can't (yet) use KernelModuleInlineTrans as this doesn't work for language-level PSyIR (this is the subject of #924 and #2413).
When transforming kernels for e.g. OpenACC, we currently only transform a single kernel at a time. However, it is possible for a kernel to call other routines (accessed via module use statements) and, in the current implementation, these other routines will not be transformed (and thus won't be marked up for OpenACC compilation). In this issue we will identify such routines and ensure that they are also transformed as necessary.