stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
103 stars 25 forks source link

Incomplete variable access information for LFRic kernels, e.g. in LoopFusion #2498

Open hiker opened 5 months ago

hiker commented 5 months ago

The reference_accesses methods for an LFRic kernel adds the arguments to the access list, but (esp. in the case of fields) does not provide any index information. As a result of this, the dependency_analysis will not work properly. For example, in (the new #2488) loop_fusion additional checks for validity will not work with LFRic (and have de-facto been disabled for any PSyLoop, i.e. for any LFRic or gocean kernel):

FAILED psyGen_test.py::test_args_filter - psyclone.psyir.transformations.transformation_error.TransformationError: Transformation Error: Variable 'f1_data' does not depend on loop variable 'cell', but is read and written

Which is basically because the access to f1_data does not have any index information. This would also potentially affect the DependencyAnalysis tools.

I am adding a #TODO to the new loop_fusion for this issue. The test code seems to work with gocean once #2496 is merged, so we either need a way for lfric loop fusion to disable this test, or add index information. The latter would be the preferred option, but it's very hard to given the indirect accesses used in LFRic to handle this properly - the loop index is typically used to select the mapping values to use:

      DO cell=loop0_start,loop0_stop
        CALL summation_w0_to_w3_kernel_code(nlayers, field_3_data..., map_w3(:,cell), ndf_w0, undf_w0, map_w0(:,cell))

And in the kernel:

      do k=0, nlayers-1
        do i=1, ndf_w0
          field_w3(map_w3(1)+k) = field_w3(map_w3(1)+k) + field_w0(map_w0(i)+k)

So even if we would look into the kernel, the dependency of the indices used for (say) field_w3 on cell is not obvious. Maybe we could add artificial accesses (like we do for gocean kernels, though in this case there is no indirect addressing), i.e. add field_w3(cell) just as an artificial access??

arporter commented 5 months ago

I'm probably missing something here but can't we use the kernel metadata - that's what it's there for?

hiker commented 5 months ago

Maybe - that's what we do in gocean: I take the stencil information, and add artificial accesses to the variable access information depending on stencil (e.g. a(i-1, j+1)).

But in LFRic - everything is 1d. How do I add the information that a kernel writes a column, and reads the column to the north? Internally, these are all just different indices in the map used :(

I think we can just ignore (from a D.A. point of view) the map used, which indicates where the elements of column are, and not handle individual finite elements, but handle columns as one element. So instead of specifying: this kernel reads a(map(1:n,cell)) for n columns, we just think of a(cell) (and that already ignores the degrees of freedom :) ). But then how do I express that a kernel uses an element to the left/right/top/bottom?? a(cell-1/+1/-N/+N)?? Esp. since then the validity might depend on the direction in which the cells are looped over, which I am not sure we even know (can we guarantee that cells are always looped in a certain direction)?

Example for 1d:

do i=2, n
    a(i) = a(i-1) + 1

and

do i=n, 2, -1
    a(i) = a(i-1) + 1

The first loop accesses the newly computed a(i-1) from the previous iteration. The second loop uses the old a(i-1).

How does the direction even work with LFRic, do we know if the cells are looped over left to right, or right to left?

Some of the headache is more theoretical I'd guess - these kind of corner cases are probably not used in 'real LFRic' code, only in the nightmare of computer scientists?? :)

It's probably not that bad, since e.g. the current DA does not actually look at loop boundaries (e.g. it can't reason about an access like a(i+n) = a(i) - if we know that the loop is do i=1,n, then we could deduce that the left and right side never overlaps. At this stage the DA does all the maths so that it could compare the distance between the accesses with loop boundaries). Maybe all we need is to distinguish a(i) and a(i+X), and we just assume that anything there will overlap.

So it might not matter at all, but I think we need to have a a good think about this.

hiker commented 5 months ago

Note that LFRic still adds strings and even integers as indices in some places, e.g. 1 and ":".

Otherwise #2496 fixes #845 for gocean, but this ticket is required to completely finish it. Note (to avoid someone else being surprised :) ) that goloop_test.test_independent_iterations starts with an invalid loop (not enough children), so it actually tests error handling :) Removing the try for #845 in independent_iterations in gocean1p0 will raise an exception then :)

arporter commented 5 months ago

In short, the STENCIL and function-space information in the metadata will tell us about horizontal dependencies. Certainly, there's no concept of 'North'/'South' etc.