In v0.2.1 and v0.2.2, the dummy arguments HST and DEV in the COPY_DIM_CONTIGUOUS subroutine had the POINTER attribute. This was accidentally removed (apologies!) in the CUDA PR.
Without the POINTER attribute, the dummy args are treated as assumed-shape arrays, so their LBOUNDS are always 1 rather than the true LBOUNDS of SELF%PTR and SELF%DEVPTR. This doesn't cause any problems in most cases, which explains why it went unnoticed. The only case where it fails is when CHILD%DEVPTR is a discontiguous slice of GANG%DEVPTR and we try to do per-child copies.
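For reference, here is a minimal standalone sketch of the bounds behaviour described above. It is not taken from the library: the program and the subroutine names WITH_POINTER and WITHOUT_POINTER are hypothetical and only illustrate how the same actual argument reports different bounds depending on whether the dummy has the POINTER attribute.

```fortran
program lbound_demo
  implicit none
  real, target  :: buf(0:9)
  real, pointer :: p(:)

  buf = 0.0
  ! Point at a slice of buf and keep a lower bound of 3
  ! (pointer bounds specification, Fortran 2003).
  p(3:) => buf(3:7)

  call with_pointer(p)     ! dummy keeps the bounds of p: 3 and 7
  call without_pointer(p)  ! dummy is rebased: 1 and 5

contains

  ! Hypothetical stand-in for the v0.2.1 / v0.2.2 signature:
  ! a POINTER dummy inherits the bounds of the actual pointer.
  subroutine with_pointer(a)
    real, pointer, intent(in) :: a(:)
    print *, 'POINTER dummy bounds:', lbound(a, 1), ubound(a, 1)
  end subroutine with_pointer

  ! Hypothetical stand-in for the current signature:
  ! an assumed-shape dummy always has a lower bound of 1.
  subroutine without_pointer(a)
    real, intent(in) :: a(:)
    print *, 'plain dummy bounds:  ', lbound(a, 1), ubound(a, 1)
  end subroutine without_pointer

end program lbound_demo
```

Any index arithmetic inside the subroutine that relies on the caller's bounds only lines up in the first variant.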
Even though we don't plan on introducing per-child copies to the master branch in the near future, I still think this change should be reverted because:
It was introduced by accident
Without the POINTER attribute, the explicit loops from LBOUNDS(DEV) to UBOUNDS(DEV) are a bit pointless