Fix compilation warnings on Vega architecture

csbnw commented 7 years ago

When using clFFT on Fedora 26 using the 2515.0 ROCm driver and a Vega Frontier Edition GPU, I get two runtime errors:

/tmp/AMD_37850_19/t_37850_21.cl:324:49: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
void fft_fwd(__constant cb_t *cb __attribute__((max_constant_size(32))), __global const float2 * restrict gbIn...
                                                ^

And:

/tmp/AMD_37850_19/t_37850_21.cl:344:8: warning: assigning to '__global float2 *' from 'const __global float2 *'
      discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
      lwbIn = gbIn + iOffset;
              ^ ~~~~~~~~~~~~~~

These warnings are easily resolved by minor code changes.

tingxingdong commented 7 years ago

Can you confirm what did you obtain? an error a warning? or both?

csbnw commented 7 years ago

These are runtime warnings using the 2.12.2 release, when running the fft2d example:

$ /opt/clFFT/2.12.2/share/clFFT/examples/fft2d 
Platform found: AMD Accelerated Parallel Processing
Device found on the above platform: gfx900

Performing fft on an two dimensional array of size N0 x N1 : 8 x 8
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
/tmp/AMD_20013_19/t_20013_21.cl:308:49: warning: unknown attribute
      'max_constant_size' ignored [-Wunknown-attributes]
void fft_fwd(__constant cb_t *cb __attribute__((max_constant_size(32))), __gl...
                                                ^
/tmp/AMD_20013_19/t_20013_21.cl:328:8: warning: assigning to '__global float2 *' from
      'const __global float2 *' discards qualifiers
      [-Wincompatible-pointer-types-discards-qualifiers]
        lwbIn = gbIn + iOffset;
              ^ ~~~~~~~~~~~~~~
/tmp/AMD_20013_19/t_20013_21.cl:336:50: warning: unknown attribute
      'max_constant_size' ignored [-Wunknown-attributes]
void fft_back(__constant cb_t *cb __attribute__((max_constant_size(32))), __g...
                                                 ^
/tmp/AMD_20013_19/t_20013_21.cl:356:8: warning: assigning to '__global float2 *' from
      'const __global float2 *' discards qualifiers
      [-Wincompatible-pointer-types-discards-qualifiers]
        lwbIn = gbIn + iOffset;
              ^ ~~~~~~~~~~~~~~
4 warnings generated.
/tmp/AMD_20013_37/t_20013_39.cl:308:49: warning: unknown attribute
      'max_constant_size' ignored [-Wunknown-attributes]
void fft_fwd(__constant cb_t *cb __attribute__((max_constant_size(32))), __gl...
                                                ^
/tmp/AMD_20013_37/t_20013_39.cl:328:8: warning: assigning to '__global float2 *' from
      'const __global float2 *' discards qualifiers
      [-Wincompatible-pointer-types-discards-qualifiers]
        lwbIn = gbIn + iOffset;
              ^ ~~~~~~~~~~~~~~
/tmp/AMD_20013_37/t_20013_39.cl:336:50: warning: unknown attribute
      'max_constant_size' ignored [-Wunknown-attributes]
void fft_back(__constant cb_t *cb __attribute__((max_constant_size(32))), __g...
                                                 ^
/tmp/AMD_20013_37/t_20013_39.cl:356:8: warning: assigning to '__global float2 *' from
      'const __global float2 *' discards qualifiers
      [-Wincompatible-pointer-types-discards-qualifiers]
        lwbIn = gbIn + iOffset;
              ^ ~~~~~~~~~~~~~~
4 warnings generated.

fft result: 
(32.000000, 32.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)

These warnings are gone after applying my patches:

$ ~/opt/clFFT/2.12.2/share/clFFT/examples/fft2d
Platform found: AMD Accelerated Parallel Processing
Device found on the above platform: gfx900

Performing fft on an two dimensional array of size N0 x N1 : 8 x 8
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) 

fft result: 
(32.000000, 32.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) 
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)

bragadeesh commented 6 years ago

f33ad58c3827f314f7a143e3f022f5b1afc1bd15 fixes this

csbnw commented 6 years ago

Looking at the changes that I made, I spotted an issue:

+        if ((strncmp(nameVendor, "NVIDIA",6)!=0) &&
+            (strncmp(nameVendor, "Advanced Micro Devices, Inc.", 28) != 0 &&
+             strncmp(nameDevice, "gfx900", 6) != 0))

The first &&, should be ||:

+        if ((strncmp(nameVendor, "NVIDIA",6)!=0) ||
+            (strncmp(nameVendor, "Advanced Micro Devices, Inc.", 28) != 0 &&
+             strncmp(nameDevice, "gfx900", 6) != 0))

Otherwise, the attribute is not set on NVIDIA GPUs. I never tested the code changes on NVIDIA GPUs, just on the AMD Vega GPU where not setting this attribute in any case was a functional workaround.

I don't see how casting the pointer could introduce the NaNs as that change is just about resolving a compilation warning: warning: assigning to '__global float2 *' from 'const __global float2 *'

This change is just telling the compiler to forget about the const.

stmuxa commented 6 years ago

@bramveenboer

This change is just telling the compiler to forget about the const.

It also casts gbIn to global float2 pointer, after that wrong offset is applied. Notice, gbIn might be one of the following types: __global float*/float2*/double*/double2* Here is a gtest filter to check: --gtest_filter="*single.normal_2D_out_of_place_real_to_hermitian_interleaved"

csbnw commented 6 years ago

@stmuxa, I see now that gbIn can have different types than just float2, this was not clear to me at the time I made the change. I suggest to just revert the commit to resolve this.

bragadeesh commented 6 years ago

I have reverted the casts

clMathLibraries / clFFT

Fix compilation warnings on Vega architecture #207