Scalarization transformation - Githubissues

stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran

BSD 3-Clause "New" or "Revised" License


Scalarization transformation #2499

Open LonelyCat124 opened 9 months ago

LonelyCat124 commented 9 months ago

In parts of the physics codes for LFRic we come across loop patterns such as this:

do i = ...
  do l = ...
    temp_in(l) = array(l,i) * array2(l,i)
  end do
  call exp_v(n, temp, temp_in)
  do l = ...
    ! do something based on temp(l)
  end do
end do

Once we inline and fuse this loop structure, we get loops like this:

do i = ...
  do l = ...
    temp_in(l) = array(l,i) * array2(l,i)
    temp(l) = exp(temp_in(l))
    ! do something based on temp(l)
  end do
end do

For cases such as this, temp_in and temp can be scalarised, provided that nothing outside the loop depends on their values (which would already be a strange implementation choice, since only the values from the final iteration of i would survive). This would help us remove some false dependencies: there is a write-write dependency on temp(l) if we use collapse on this loop, but it is unnecessary, since temp can just be a local scalar instead.

The goal of this transformation would be to take code like the above (after all the other inlining and loop fusion transformations) and generate:

do i = ...
  do l = ...
    temp_in_scalar = array(l,i) * array2(l,i)
    temp_scalar = exp(temp_in_scalar)
    ! do something based on temp_scalar
  end do
end do

At this point, we can apply target + loop with collapse, which will lead to fewer kernel launches and less synchronization, and probably better performance on GPU.
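
The scalarization condition described in this comment can be sketched as a small, self-contained Python check. This is only a toy model under stated assumptions, not PSyclone code: the `Access` record, its fields, and `can_scalarize` are hypothetical names, and the access list is assumed to hold the loop's accesses in program order followed by any accesses after the loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One array access in program order (hypothetical toy record)."""
    array: str      # array name, e.g. "temp"
    index: str      # index expression as text, e.g. "l"
    is_write: bool  # True for a write, False for a read
    in_loop: bool   # True if inside the loop being scalarized


def can_scalarize(array, accesses):
    """Return True if `array` could become a scalar inside the loop."""
    inside = [a for a in accesses if a.array == array and a.in_loop]
    # Assumption: accesses with in_loop=False all come after the loop.
    after = [a for a in accesses if a.array == array and not a.in_loop]
    if not inside:
        return False
    # Every access in the loop must use the same index expression.
    if len({a.index for a in inside}) != 1:
        return False
    # The first access in an iteration must be a write; otherwise the
    # iteration consumes a value produced somewhere else.
    if not inside[0].is_write:
        return False
    # Nothing after the loop may read the array before overwriting it.
    if after and not after[0].is_write:
        return False
    return True


# Accesses of the fused loop body shown earlier, in program order.
fused = [
    Access("temp_in", "l", True, True),   # temp_in(l) = ...
    Access("temp_in", "l", False, True),  # ... = exp(temp_in(l))
    Access("temp", "l", True, True),      # temp(l) = ...
    Access("temp", "l", False, True),     # use of temp(l)
]
```

With this model, both `temp_in` and `temp` pass the check for the fused loop, while appending a read of `temp` after the loop makes `temp` fail it.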

LonelyCat124 commented 6 months ago

First step - on Reference: Find next (and previous?) Reference to this symbol.

hiker commented 6 months ago

The VariableAccess information already contains all accesses in order.

LonelyCat124 commented 6 months ago

Yeah - the plan is to use that data with this transformation.

LonelyCat124 commented 6 months ago

@hiker I'm a bit confused about how to use the VariablesAccessInfo - is there a linkage between VariablesAccessInfo and the node for a given read/write access? E.g. For a given routine if I wanted to find (in order) all the accesses/dependencies on a given symbol (slash signature) can I do that with the VariablesAccessInfo? I can find the sequence of reads/writes but if I wanted to refer back to the relevant Reference is that currently possible?

Ah, I guess it's .node in AccessInfo?

hiker commented 6 months ago

Yes :) I saw the comments in the wrong order, and commented elsewhere :)
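
The idea discussed in this exchange, walking an ordered list of accesses and following each entry back to its node, can be illustrated with a toy model. The classes below are hypothetical stand-ins written for this sketch; they are not the PSyclone `VariablesAccessInfo` or `AccessInfo` classes, and only the `.node` attribute name is taken from the discussion.

```python
from dataclasses import dataclass


class ToyReference:
    """Stand-in for a tree node that reads or writes a symbol."""

    def __init__(self, text):
        self.text = text


@dataclass
class ToyAccess:
    """One entry of an ordered access list (hypothetical)."""
    kind: str           # "READ" or "WRITE"
    node: ToyReference  # link back to the node performing the access


def next_access(ordered, node):
    """Return the access following the one attached to `node`, or None."""
    for pos, access in enumerate(ordered):
        if access.node is node:
            if pos + 1 < len(ordered):
                return ordered[pos + 1]
            return None
    return None


# Ordered accesses to "temp" in the fused loop body.
write_ref = ToyReference("temp(l) = exp(temp_in(l))")
read_ref = ToyReference("use of temp(l)")
ordered = [ToyAccess("WRITE", write_ref), ToyAccess("READ", read_ref)]
```

Here `next_access(ordered, write_ref)` yields the read entry, whose `.node` is `read_ref`, and asking for the access after `read_ref` yields `None`.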

LonelyCat124 commented 6 months ago

@sergisiso If the next access to an array reference (that is otherwise a potential target for scalarization) is contained within an IfBlock that isn't also an ancestor of the Loop I'm "scalarizing", I will just ignore it rather than dealing with the if condition, unless you think we should specifically try to handle if blocks here?

LonelyCat124 commented 6 months ago

Also I realise that I probably need to be careful with next_access, since I think next_access for something like:

a(i) = a(i) + 1

will point to the LHS of the assignment, so I should also check the RHS of the assignment in this case for scalarization.
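
One way to read this concern is that a check for "the first access is a write" must look at the right-hand side of an assignment before its left-hand side, because the right-hand side is evaluated first. The sketch below models that rule only; the statement representation and the name `written_before_read` are assumptions made for illustration, not part of PSyclone.

```python
def written_before_read(array, statements):
    """Return True if `array` is written before it is ever read.

    `statements` is a list of (lhs_array, rhs_arrays) pairs in program
    order (hypothetical representation). For each assignment the RHS is
    inspected first, since it is evaluated before the LHS is stored.
    """
    for lhs, rhs in statements:
        if array in rhs:
            # A read comes first, possibly on the RHS of the very
            # assignment that writes the array, as in a(i) = a(i) + 1.
            return False
        if lhs == array:
            return True
    return False


# a(i) = a(i) + 1
self_update = [("a", ["a"])]

# temp_in(l) = array(l,i) * array2(l,i); temp(l) = exp(temp_in(l))
fused_body = [("temp_in", ["array", "array2"]), ("temp", ["temp_in"])]
```

Under this rule `a` in `self_update` is not written before it is read, whereas `temp_in` and `temp` in `fused_body` both are.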