stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License

Improved handling of dependence analysis for arrays in OpenMP #2065

Open nmnobre opened 1 year ago

nmnobre commented 1 year ago

I realise this might be hard to solve, but in cases such as:

do ji = 1, npti, 1
  do jk = 1, nlay_i, 1
    ztmelts = -rTmlt * sz_i_1d(ji,jk)
    ztmp(jk) = ztmelts / MIN(ztmelts, t_i_1d(ji,jk) - rt0)
  enddo
  zperm = MAX(0._wp, 3.e-08_wp * MINVAL(ztmp) ** 3)
end do

the array ztmp prevents parallelisation of the outer loop because of a write-after-write (WaW) dependence.

However, we can see that ztmp is overwritten over exactly the same range in every iteration of the outer loop, and that its only use (the MINVAL call) comes after those writes in the program text. So, at most, only the last iteration's writes might matter outside the loop. Furthermore, if I force parallelisation by ignoring the dependence analysis results, ztmp isn't marked private and remains shared. I know I'm asking for trouble here, but do we think this might be supported one day?
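
To illustrate the reasoning above, here is a minimal plain-Python sketch (not PSyclone code, and not using any PSyclone API) of why giving each outer iteration its own copy of ztmp would make the iterations independent. Storing zperm per ji, the function names, and the input values are all assumptions made only so the two variants can be compared; the loop body mirrors the Fortran snippet.

```python
import random


def zperm_shared(sz, t, r_tmlt, rt0):
    """Sequential reference: one ztmp buffer reused by every outer iteration."""
    npti, nlay = len(sz), len(sz[0])
    ztmp = [0.0] * nlay
    zperm = [0.0] * npti  # assumption: keep one result per ji for comparison
    for ji in range(npti):
        for jk in range(nlay):
            ztmelts = -r_tmlt * sz[ji][jk]
            ztmp[jk] = ztmelts / min(ztmelts, t[ji][jk] - rt0)
        zperm[ji] = max(0.0, 3.0e-08 * min(ztmp) ** 3)
    return zperm


def zperm_private(sz, t, r_tmlt, rt0, order):
    """Outer iterations visited in an arbitrary order, each with a private ztmp."""
    npti, nlay = len(sz), len(sz[0])
    zperm = [0.0] * npti
    for ji in order:
        ztmp = [0.0] * nlay  # private copy, fully rewritten before it is read
        for jk in range(nlay):
            ztmelts = -r_tmlt * sz[ji][jk]
            ztmp[jk] = ztmelts / min(ztmelts, t[ji][jk] - rt0)
        zperm[ji] = max(0.0, 3.0e-08 * min(ztmp) ** 3)
    return zperm


# Made-up inputs (hypothetical values, chosen only to avoid division by zero).
sz = [[1.0 + ji + 0.5 * jk for jk in range(4)] for ji in range(6)]
t = [[265.0 + 0.3 * ji + jk for jk in range(4)] for ji in range(6)]

# A shuffled order stands in for the unspecified scheduling of a parallel loop.
order = list(range(6))
random.Random(0).shuffle(order)

same = zperm_private(sz, t, 0.054, 273.15, order) == zperm_shared(sz, t, 0.054, 273.15)
print(same)
```

Because every element of ztmp that is read has been written earlier in the same iteration, no value flows between iterations through it, which is the property a privatisation check would need to establish. In OpenMP terms this would correspond to listing ztmp (and the scalar ztmelts) in a private clause on the outer loop.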

hiker commented 1 year ago

I can have a look, but I am rather busy, so it will probably be a week or two before I can get to it. I also haven't touched the dependency handling code for a few months, so it will take me some time to get back into it.

sergisiso commented 1 year ago

This pattern is found in several more places in NEMO, such as zdfsh2.f90 and dynvor.f90::vor_een.

sergisiso commented 1 year ago

It needs: