Closed csbnw closed 6 years ago
Can you confirm what did you obtain? an error a warning? or both?
These are runtime warnings using the 2.12.2 release, when running the fft2d
example:
$ /opt/clFFT/2.12.2/share/clFFT/examples/fft2d
Platform found: AMD Accelerated Parallel Processing
Device found on the above platform: gfx900
Performing fft on an two dimensional array of size N0 x N1 : 8 x 8
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
/tmp/AMD_20013_19/t_20013_21.cl:308:49: warning: unknown attribute
'max_constant_size' ignored [-Wunknown-attributes]
void fft_fwd(__constant cb_t *cb __attribute__((max_constant_size(32))), __gl...
^
/tmp/AMD_20013_19/t_20013_21.cl:328:8: warning: assigning to '__global float2 *' from
'const __global float2 *' discards qualifiers
[-Wincompatible-pointer-types-discards-qualifiers]
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
/tmp/AMD_20013_19/t_20013_21.cl:336:50: warning: unknown attribute
'max_constant_size' ignored [-Wunknown-attributes]
void fft_back(__constant cb_t *cb __attribute__((max_constant_size(32))), __g...
^
/tmp/AMD_20013_19/t_20013_21.cl:356:8: warning: assigning to '__global float2 *' from
'const __global float2 *' discards qualifiers
[-Wincompatible-pointer-types-discards-qualifiers]
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
4 warnings generated.
/tmp/AMD_20013_37/t_20013_39.cl:308:49: warning: unknown attribute
'max_constant_size' ignored [-Wunknown-attributes]
void fft_fwd(__constant cb_t *cb __attribute__((max_constant_size(32))), __gl...
^
/tmp/AMD_20013_37/t_20013_39.cl:328:8: warning: assigning to '__global float2 *' from
'const __global float2 *' discards qualifiers
[-Wincompatible-pointer-types-discards-qualifiers]
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
/tmp/AMD_20013_37/t_20013_39.cl:336:50: warning: unknown attribute
'max_constant_size' ignored [-Wunknown-attributes]
void fft_back(__constant cb_t *cb __attribute__((max_constant_size(32))), __g...
^
/tmp/AMD_20013_37/t_20013_39.cl:356:8: warning: assigning to '__global float2 *' from
'const __global float2 *' discards qualifiers
[-Wincompatible-pointer-types-discards-qualifiers]
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
4 warnings generated.
fft result:
(32.000000, 32.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
These warnings are gone after applying my patches:
$ ~/opt/clFFT/2.12.2/share/clFFT/examples/fft2d
Platform found: AMD Accelerated Parallel Processing
Device found on the above platform: gfx900
Performing fft on an two dimensional array of size N0 x N1 : 8 x 8
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
(0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000) (0.500000, 0.500000)
fft result:
(32.000000, 32.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
(0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000) (0.000000, 0.000000)
f33ad58c3827f314f7a143e3f022f5b1afc1bd15 fixes this
Looking at the changes that I made, I spotted an issue:
+ if ((strncmp(nameVendor, "NVIDIA",6)!=0) &&
+ (strncmp(nameVendor, "Advanced Micro Devices, Inc.", 28) != 0 &&
+ strncmp(nameDevice, "gfx900", 6) != 0))
The first &&
, should be ||
:
+ if ((strncmp(nameVendor, "NVIDIA",6)!=0) ||
+ (strncmp(nameVendor, "Advanced Micro Devices, Inc.", 28) != 0 &&
+ strncmp(nameDevice, "gfx900", 6) != 0))
Otherwise, the attribute is not set on NVIDIA GPUs. I never tested the code changes on NVIDIA GPUs, just on the AMD Vega GPU where not setting this attribute in any case was a functional workaround.
I don't see how casting the pointer could introduce the NaNs as that change is just about resolving a compilation warning:
warning: assigning to '__global float2 *' from 'const __global float2 *'
This change is just telling the compiler to forget about the const
.
@bramveenboer
This change is just telling the compiler to forget about the const.
It also casts gbIn
to global float2 pointer, after that wrong offset is applied.
Notice, gbIn
might be one of the following types: __global float*/float2*/double*/double2*
Here is a gtest filter to check:
--gtest_filter="*single.normal_2D_out_of_place_real_to_hermitian_interleaved"
@stmuxa,
I see now that gbIn
can have different types than just float2
, this was not clear to me at the time I made the change. I suggest to just revert the commit to resolve this.
I have reverted the casts
When using clFFT on Fedora 26 using the 2515.0 ROCm driver and a Vega Frontier Edition GPU, I get two runtime errors:
And:
These warnings are easily resolved by minor code changes.