CharacteristicMappingMethod / cmm-turbulence

CMM Turbulence code
GNU General Public License v3.0

Check for modulo and integer divisions in device code and replace them #27

Closed Arcadia197 closed 2 years ago

Arcadia197 commented 2 years ago

Taken from the CUDA C++ Best Practices Guide (https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#instruction-optimization, section 11.1.1):

Note: Low Priority: Use shift operations to avoid expensive division and modulo calculations.

Integer division and modulo operations are particularly costly and should be avoided or replaced with bitwise operations whenever possible: If n is a power of 2, (i / n) is equivalent to (i >> log2(n)) and (i % n) is equivalent to (i & (n - 1)).

The compiler will perform these conversions if n is literal. (For further information, refer to Performance Guidelines in the CUDA C++ Programming Guide).
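For reference, a minimal sketch of the two replacements the guide describes. The helper names `div_pow2` and `mod_pow2` are hypothetical and not taken from the cmm-turbulence code; the sketch assumes an unsigned index and that n = 2^log2n with log2n < 32. It is written as plain C++ so it can be checked on the host, but the same expressions would apply inside a kernel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of the cmm-turbulence code base.
// Assumption: i is unsigned and n = 1u << log2n is a power of two (log2n < 32).

// Same result as i / n when n is a power of two.
inline std::uint32_t div_pow2(std::uint32_t i, std::uint32_t log2n) {
    return i >> log2n;
}

// Same result as i % n when n is a power of two.
inline std::uint32_t mod_pow2(std::uint32_t i, std::uint32_t log2n) {
    return i & ((1u << log2n) - 1u);
}
```

For example, with n = 8 (log2n = 3), `div_pow2(37, 3)` gives the same value as `37 / 8` and `mod_pow2(37, 3)` the same value as `37 % 8`. For signed, negative i the shift and mask forms do not match `/` and `%`, which is why the sketch sticks to unsigned types.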

Arcadia197 commented 2 years ago

I'm not sure how important this is. Normally we assume NX and NY to be powers of two, but I would like to keep this flexible.

Arcadia197 commented 2 years ago

After quite some time has passed, I would say that this does not fit our program. NX and NY should be completely flexible, so this would basically mean we would have to distinguish between cases, and that's quite some micro-tuning.
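To illustrate the case distinction mentioned above, here is a rough sketch of what it might look like for splitting a linear index into grid coordinates. All names (`is_pow2`, `log2_pow2`, `Index2D`, `split_index`) are hypothetical and only for illustration; the actual index handling in the code may look different.

```cpp
#include <cstdint>

// Hypothetical helper: true if n is a non-zero power of two.
inline bool is_pow2(std::uint32_t n) {
    return n != 0u && (n & (n - 1u)) == 0u;
}

// Hypothetical helper: log2 of n, assuming n is a power of two.
inline std::uint32_t log2_pow2(std::uint32_t n) {
    std::uint32_t shift = 0u;
    while ((n >> shift) > 1u) ++shift;
    return shift;
}

// Hypothetical 2D index pair (column ix, row iy).
struct Index2D {
    std::uint32_t ix;
    std::uint32_t iy;
};

// Split a linear index for a grid with row width nx.
// nx_is_pow2 and log2_nx would be computed once on the host and passed in.
inline Index2D split_index(std::uint32_t idx, std::uint32_t nx,
                           bool nx_is_pow2, std::uint32_t log2_nx) {
    if (nx_is_pow2) {
        // Fast path: mask and shift instead of modulo and division.
        return { idx & (nx - 1u), idx >> log2_nx };
    }
    // General path: works for any nx > 0.
    return { idx % nx, idx / nx };
}
```

Every index computation would need both paths plus the extra parameters, which is the kind of micro-tuning the comment above refers to.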