hjabird / cvortex

GPU accelerated vortex filament and vortex particle methods
MIT License

GPU kernels return NaN on Nvidia Quadro K620 #8

Closed: hjabird closed this issue 5 years ago

hjabird commented 5 years ago

As per title.

Reproduce as:

    ppos = rand(1000, 3)
    pvort = rand(1000, 3)
    particle_induced_velocity(ppos, pvort, ppos, winckelmans_regularisation(), 0.1)
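To confirm the symptom on a given device, the returned velocities can be scanned for non-finite entries. The following is a minimal sketch in C, assuming the results have been copied into a flat array of floats; `count_non_finite` is a hypothetical helper and not part of cvortex.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical diagnostic helper (not part of cvortex):
 * count entries that are NaN or +/-Inf in a flat array of n floats. */
size_t count_non_finite(const float *values, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        /* isfinite is false for NaN and for both infinities. */
        if (!isfinite(values[i])) {
            ++bad;
        }
    }
    return bad;
}
```

A non-zero count for the 1000 x 3 result would reproduce the report. The host code doing this check should itself be compiled without fast-math options, since a compiler that assumes finite maths may fold `isfinite` to a constant.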

Code works fine with AMD & Intel GPUs/iGPUs.

hjabird commented 5 years ago

This may be caused by building the OpenCL kernels with the -cl-fast-relaxed-math flag, which implies both -cl-finite-math-only and -cl-unsafe-math-optimizations. Producing NaN/Inf violates the finite-maths assumption, although in practice this is not an issue on AMD/Intel GPUs.

On AMD GCN5 / Windows (Vega 56), building with only -cl-unsafe-math-optimizations is no slower than building with -cl-fast-relaxed-math.
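A sketch of how the build options could be selected so that the finite-maths assumption is dropped while the other relaxed optimisations are kept. `kernel_build_options` is a hypothetical helper written for illustration and is not necessarily what the fixing commit does; the returned string is what would be passed as the `options` argument of `clBuildProgram`.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from cvortex): choose the OpenCL
 * compiler options string for building the kernels.
 *
 * assume_finite_math != 0: keep -cl-fast-relaxed-math, which also
 *   implies -cl-finite-math-only (NaN/Inf assumed never to occur).
 * assume_finite_math == 0: keep only the unsafe optimisations, so
 *   NaN/Inf retain their defined behaviour in the kernels. */
const char *kernel_build_options(int assume_finite_math)
{
    if (assume_finite_math) {
        return "-cl-fast-relaxed-math";
    }
    return "-cl-unsafe-math-optimizations";
}
```

Calling `kernel_build_options(0)` on devices where NaNs appear would avoid -cl-finite-math-only without giving up the remaining relaxed-maths optimisations.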

hjabird commented 5 years ago

Fixed in commit 5b204179d4639d0c996edf6919f20ffa720a8268.