ecmwf-ifs / loki

Freely programmable source-to-source translation for Fortran
https://sites.ecmwf.int/docs/loki/
Apache License 2.0

Minimal padding in pool allocator #298

Open awnawab opened 5 months ago

awnawab commented 5 months ago

Currently, to protect against misaligned addresses on device, every allocation in the pool allocator is padded to 8 bytes. This negates much of the device memory bandwidth benefit of running in single precision. A potential fix would be to pad only those allocations whose size is not a multiple of nproma, since allocations that are multiples of nproma are, for all intents and purposes, guaranteed to be multiples of 8.
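A minimal sketch of the proposed rule, in Python for illustration only. It does not reproduce Loki's actual pool allocator code: the function names, the `ALIGNMENT_BYTES` constant, and the modelling of padding as "round the byte size up to the next multiple of 8" are all assumptions made for this example.

```python
# Hypothetical illustration of the "pad only when needed" rule from this issue.
# None of these names exist in Loki; they are made up for the sketch.

ALIGNMENT_BYTES = 8  # alignment granularity named in the issue


def round_up(nbytes, alignment=ALIGNMENT_BYTES):
    """Round nbytes up to the next multiple of alignment."""
    return -(-nbytes // alignment) * alignment


def allocation_bytes(num_elements, kind_bytes, nproma, alignment=ALIGNMENT_BYTES):
    """Return the byte size to reserve for one temporary array.

    Assumption taken from the issue: if the element count is a multiple of
    nproma, the byte size is already a multiple of 8 and needs no padding.
    """
    nbytes = num_elements * kind_bytes
    if num_elements % nproma == 0:
        return nbytes  # assumed aligned, reserve exactly what is needed
    return round_up(nbytes, alignment)  # pad only the odd-sized allocations


if __name__ == "__main__":
    nproma = 128
    # Single-precision (4-byte) array with nproma * 137 elements: no padding.
    print(allocation_bytes(nproma * 137, 4, nproma))
    # Small 5-element single-precision array: rounded up to a multiple of 8.
    print(allocation_bytes(5, 4, nproma))
```

Under these assumptions, the `nproma`-sized array keeps its exact single-precision footprint, and only the small array, whose size is not a multiple of `nproma`, is padded. The shortcut is only safe when `nproma * kind_bytes` really is a multiple of 8, which this sketch takes on trust and does not check.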