GOFAI closed 8 years ago
Is it possible to do something like _Pragma("unroll") here, given that STRINGIFY(...) is producing an OpenCL kernel? Just replacing #pragma with _Pragma() doesn't seem to expand into #pragma unroll before STRINGIFY gets to it. I honestly don't know if OpenCL treats _Pragma like C99. Will OpenCL accept something like:
```
__kernel void diag_dtrtri_lower_128_16_src(
    int isDiagUnit,
    __global double const * restrict A,
    uint offA,
    __global double *d_dinvA,
    uint lda,
    uint na)
{
    int i, j;
    double Ystx = 0;
    __local double *Bw = 0, *x = 0, *y = 0;
    double switcher;
    double neg_switcher;
    int tx = get_local_id(0);
    int txw;
    int gx = get_global_id(0);
    int bx = get_group_id(0);
    A = A + offA;
    __global const double *Aoff = A + bx*lda*16 + bx*16;
    int NumBLperNB = 128 / 16;
    d_dinvA += bx / NumBLperNB*128*128 + (bx % NumBLperNB)*(128*16 + 16);
    __local double Bs[16*16];
    __local double workspace[16];
    _Pragma("unroll")
    . . .
```
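For what it's worth, here is a small host-side sketch of what I'd expect to happen to _Pragma("unroll") inside STRINGIFY. It assumes STRINGIFY is the usual one-level `#__VA_ARGS__` macro (the thread doesn't show its definition), and the kernel `scale4` and the helper `print_toy_kernel` are made up for illustration. The `#` operator stringizes its argument without macro-expanding it first, so the `_Pragma("unroll")` text should end up verbatim in the string that is later handed to the OpenCL compiler, rather than being turned into `#pragma unroll` on the host.

```c
#include <stdio.h>

/* Assumed definition: one-level stringification of the whole argument list. */
#define STRINGIFY(...) #__VA_ARGS__

/* Hypothetical toy kernel, used only to show what the host compiler emits. */
static const char *toy_kernel_src = STRINGIFY(
    __kernel void scale4(__global float *a)
    {
        _Pragma("unroll")
        for (int i = 0; i < 4; ++i)
            a[i] *= 2.0f;
    }
);

/* Print the string exactly as it would be passed on to the OpenCL compiler. */
void print_toy_kernel(void)
{
    puts(toy_kernel_src);
}
```

If that holds, whether the unroll hint takes effect depends entirely on the OpenCL compiler accepting _Pragma in kernel source, which is the open question here.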
Probably best to check the OpenCL specs rather than just trying it; otherwise it will work on some platforms and not on others. From the OpenCL 1.1 spec, section 6.9, I only see:
6.9 Preprocessor Directives and Macros
The preprocessing directives defined by the C99 specification are supported.
The # pragma directive is described as:
# pragma pp-tokens opt new-line
A # pragma directive where the preprocessing token OPENCL (used instead of STDC ) does not
immediately follow pragma in the directive (prior to any macro replacement) causes the
implementation to behave in an implementation-defined manner. The behavior might cause
translation to fail or cause the translator or the resulting program to behave in a non-conforming
manner. Any such pragma that is not recognized by the implementation is ignored. If the
preprocessing token OPENCL does immediately follow pragma in the directive (prior to any
macro replacement), then no macro replacement is performed on the directive, and the directive
shall have one of the following forms whose meanings are described elsewhere:
#pragma OPENCL FP_CONTRACT on-off-switch
on-off-switch: one of ON OFF DEFAULT
#pragma OPENCL EXTENSION extensionname : behavior
#pragma OPENCL EXTENSION all : behavior
The following predefined macro names are available.
__FILE__ The presumed name of the current source file (a character string literal).
__LINE__ The presumed line number (within the current source file) of the current
source line (an integer constant).
__OPENCL_VERSION__ substitutes an integer reflecting the version number of the OpenCL
supported by the OpenCL device. The version of OpenCL described in this document will have
__OPENCL_VERSION__ substitute the integer 110.
CL_VERSION_1_0 substitutes the integer 100 reflecting the OpenCL 1.0 version.
CL_VERSION_1_1 substitutes the integer 110 reflecting the OpenCL 1.1 version.
__ENDIAN_LITTLE__ is used to determine if the OpenCL device is a little endian architecture
or a big endian architecture (an integer constant of 1 if device is little endian and is undefined
otherwise). Also refer to CL_DEVICE_ENDIAN_LITTLE specified in table 4.3.
__kernel_exec(X, typen) (and kernel_exec(X, typen)) is defined as
__kernel __attribute__((work_group_size_hint(X, 1, 1))) \
__attribute__((vec_type_hint(typen)))
__IMAGE_SUPPORT__ is used to determine if the OpenCL device supports images. This is an
integer constant of 1 if images are supported and is undefined otherwise. Also refer to
CL_DEVICE_IMAGE_SUPPORT specified in table 4.3.
__FAST_RELAXED_MATH__ is used to determine if the -cl-fast-relaxed-math optimization
option is specified in build options given to clBuildProgram. This is an integer constant of 1 if
the -cl-fast-relaxed-math build option is specified and is undefined otherwise.
The macro names defined by the C99 specification but not currently supported by OpenCL are
reserved for future use.
Just in case it's useful: what I did in DeepCL to stringify things was to use Python combined with cogapp (http://nedbatchelder.com/code/cog/), e.g.:
https://github.com/hughperkins/DeepCL/blob/master/src/conv/BackpropWeightsNaive.cpp#L52-L138
// [[[cog
// import stringify
// stringify.write_kernel2("kernel", "cl/backpropweights.cl", "backprop_floats", 'options')
// ]]]
// generated using cog, from cl/backpropweights.cl:
const char * kernelSource =
"kernel void doSomething(\n"
...
"\n"
"\n"
"";
kernel = cl->buildKernelFromString(kernelSource, "backprop_floats", options, "cl/backpropweights.cl");
// [[[end]]]
Edit: stringify.py is here: https://github.com/hughperkins/DeepCL/blob/master/cog-batteries/stringify.py
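To spell out why the generated form sidesteps the problem, here is a hand-written sketch of the same shape: the kernel as adjacent string literals with explicit "\n". The kernel `scale4` and the accessor `get_toy_kernel_source` are hypothetical, not taken from DeepCL or clBLAS. Because the `#pragma` line lives inside a string literal, the host preprocessor never sees it as a directive, and the OpenCL compiler receives it on its own line.

```c
/* Hypothetical toy kernel written as adjacent string literals.
 * Each source line carries its own "\n", so directives stay on their own line. */
static const char *toy_kernel_source =
    "__kernel void scale4(__global float *a)\n"
    "{\n"
    "    #pragma unroll\n"
    "    for (int i = 0; i < 4; ++i)\n"
    "        a[i] *= 2.0f;\n"
    "}\n";

/* Accessor so other translation units can fetch the source text. */
const char *get_toy_kernel_source(void)
{
    return toy_kernel_source;
}
```

Writing this by hand is tedious, which is the point of generating it from a separate .cl file.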
I would certainly prefer to work with Python than try to fight the C++ preprocessor like this. The spec is ambiguous, and Googling doesn't seem to find any example of _Pragma() in an OpenCL kernel, which suggests we can't count on its availability.
I've done some tinkering and on OSX, at least, _Pragma() seems to work in OpenCL kernels. Might it be reasonable to define some sort of platform-specific macro that uses _Pragma("unroll") on OSX and leaves it as #pragma unroll in other cases where that might be necessary?
I am not sure if you saw that pull request #189 is addressing this problem. I think it should work, but we need to at least test it on Linux and Windows before merging the PR.
One other possible variant, in case #189 is not portable (which it might be, I don't know), could be to replace #pragma with %PRAGMA, then to search and replace %PRAGMA with #pragma at runtime. There is already precedent for such search and replace in clBLAS, e.g. for %VTYPE and similar.
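A minimal sketch of that runtime substitution, in case it helps. The function name `expand_pragma_placeholders` is made up and this is not the existing clBLAS replacement code. It relies on %PRAGMA and #pragma having the same length, so each occurrence can be overwritten in place in a copy of the source. It also assumes the placeholder already sits at the start of its own line in the kernel text, since the substitution does nothing about line breaks.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of src in which every "%PRAGMA" is replaced
 * by "#pragma". Returns NULL if allocation fails. The caller frees the result. */
char *expand_pragma_placeholders(const char *src)
{
    static const char placeholder[] = "%PRAGMA";
    static const char directive[]   = "#pragma";
    /* Both tokens are 7 characters long, so an in-place overwrite is safe. */
    const size_t len = sizeof(placeholder) - 1;

    char *out = malloc(strlen(src) + 1);
    if (out == NULL)
        return NULL;
    strcpy(out, src);

    /* Overwrite each occurrence, then continue scanning just past it. */
    for (char *p = strstr(out, placeholder); p != NULL;
         p = strstr(p + len, placeholder))
        memcpy(p, directive, len);

    return out;
}
```

The returned string would then be passed to the OpenCL build call in place of the raw kernel text.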
PR #189 is merged. Can you verify if this fixed the compile time error for you?
I'm not in an easy position to check this myself--2.8 always compiled fine for me, but it turns out that was because I'm still using XCode 6.4. When I submitted a pull request to homebrew to bump the clBLAS formula from 2.2 to 2.8, their testing discovered the problem with XCode 7.1 that I reported.
Hi Edward, following your example (and basically hacking and pasting your formula :-) ), I created a travis/brew build for clblas at https://github.com/hughperkins/homebrew-test-travis-osx/blob/c63c4682eb062d0c8dc43cdec787b6270a7e22e9/Formula/clblas.rb and will see how it fares: https://travis-ci.org/hughperkins/homebrew-test-travis-osx/builds
Looks like it.
k. Is it an appropriate xcode version being used here?
It's definitely using a higher version than the one I have on my MacBook.
Can we close this issue? It seems Hugh was able to verify that it builds for xcode 7.0?
this is broken and not building, look at the latest travis
@gaming-hacker This is most likely fixed in develop branch from the recent PR. Can you test this out ?
@pavanky The issue is resolved in the develop branch of this project.
I think this can be closed now. It seems sufficient that it now works in the 'develop' branch?
I have no trouble building using XCode 6, but the latest version is more stringent and fails on a few files because "embedding a #pragma directive within macro arguments is not supported."
The macro in question is STRINGIFY(...).