stfc / PSycloneBench

Various benchmarks used to inform PSyclone optimisations
BSD 3-Clause "New" or "Revised" License

NemoLite2D OpenCL differences between Fortran and C versions #61

Open sergisiso opened 3 years ago

sergisiso commented 3 years ago

The manual Fortran OpenCL implementation of NemoLite2D sometimes produces checksum values of 0. (This may be related to invalid memory accesses caused by occasionally reading out-of-bounds values.)

The OpenCL device also makes a difference: this has been observed with POCL and with AMD GPUs.

sergisiso commented 3 years ago

When the checksum is 0, it could be because the run is using OpenCL 1.2 and the global_size is not exactly divisible by the local work-group size. That divisibility requirement no longer applies in OpenCL 2.0 and later.

Since OpenCL 1.2 is quite old, and the issue can be easily resolved by running the application with DL_ESM_ALIGNMENT=X to add enough elements to meet the requirement, I propose to leave the code as it is. It would still be good to mention the issue and its workaround in the README of the relevant folder.
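
To illustrate the divisibility requirement, here is a minimal C sketch of the arithmetic involved. It is not the PSycloneBench or dl_esm_inf code: the function names `round_up_to_multiple` and `is_valid_ndrange_1_2` are hypothetical, and the sketch only shows how a per-dimension global size could be padded to a multiple of the local work-group size, which is roughly the effect that choosing a suitable DL_ESM_ALIGNMENT value is meant to achieve.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `global` up to the next multiple of `local`.
 * `local` is the work-group size in one dimension and must be non-zero. */
size_t round_up_to_multiple(size_t global, size_t local)
{
    assert(local > 0);
    size_t remainder = global % local;
    return remainder == 0 ? global : global + (local - remainder);
}

/* Hypothetical check: returns 1 if an OpenCL 1.2 runtime would accept this
 * global/local pair in one dimension, i.e. the global size is an exact
 * multiple of the local size; returns 0 otherwise. */
int is_valid_ndrange_1_2(size_t global, size_t local)
{
    return local > 0 && global % local == 0;
}
```

For example, a dimension of 100 elements with a work-group size of 64 fails the check, while the padded size of 128 passes it. The padded elements must exist in the buffers, which is why the workaround adds elements to the arrays rather than only changing the launch size.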