hjabird / cvortex

GPU accelerated vortex filament and vortex particle methods
MIT License
9 stars, 3 forks

Optimise OpenCL kernels #7

Open. hjabird opened this issue 5 years ago

hjabird commented 5 years ago

Surely the code can be faster?

hjabird commented 5 years ago

OpenCL particle induced velocity:

See commit ae683f3f51ec4d2f70556ae6d08d4a6dc0561543

Approximately 7.5% faster:

Not used:

Verified good:

To do:

NB:

hjabird commented 5 years ago

commit d1a77dcc13d4869931b6ec830c926f492fb796ca

ind_dvort is approximately 27% faster.

As above, the Julia script used for benchmarking adds some overhead: 3% in the worst case.

hjabird commented 5 years ago

commit 6d962b76d15e95af67105c30e920fa670d2d24f1

ind_dvort is 1.5% faster.

hjabird commented 5 years ago

commit 4db32f2e79e5eb11dea8f06539bd7f20bdaa5337

Changing to the transpose method for the dvort calculations allows a cross product to be reused. This is 7% faster.