intel / intel-graphics-compiler


OpenCL program that Builds fine but does not Compile+Link #185

Closed Kerilk closed 3 years ago

Kerilk commented 3 years ago

Hello,

Hopefully this is the correct channel for reporting this "bug". I am doing research on implementing other APIs and programming languages on top of OpenCL/Level Zero/SPIR-V and came across an odd behavior. The following program builds with clBuildProgram and works fine on the latest Intel Compute Runtime, but fails at link time when it is first compiled with clCompileProgram and then linked with clLinkProgram, emitting:

<origin>: error: Invalid cast (Producer: 'LLVM10.0.0' Reader: 'LLVM 10.0.0')

I am not sure I would qualify this as a bug, given how borderline what I am doing is, but it seemed worth reporting nonetheless. Here is the offending OpenCL program, which aims to bypass the restriction on storing image and sampler references in structures:

#define DIM 32

typedef struct {
  intptr_t  image;
  intptr_t  sampler;
} hipTextureObject_st, *hipTextureObject_t;

inline float tex2D_f(hipTextureObject_t textureObject, float x, float y) {
   return read_imagef(__builtin_astype(textureObject->image, read_only image2d_t),
                      __builtin_astype(textureObject->sampler, sampler_t),
                      (float2)(x, y)).x;
}

void reading_impl(hipTextureObject_t textureObject, global float *out) {
  unsigned int i = get_global_id(0);
  unsigned int j = get_global_id(1);
  out[j*DIM+i] = tex2D_f(textureObject, i, j);
}

kernel void readimg(read_only image2d_t image, sampler_t sampler, global float *out) {
  hipTextureObject_st textureObject =
    { __builtin_astype(image, intptr_t),
      __builtin_astype(sampler, intptr_t) };
  reading_impl(&textureObject, out);
}

The offending item appears to be the sampler, most probably because of its address space: the same example with only the image works fine.
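For readers unfamiliar with the pattern, the round trip through intptr_t can be sketched in plain host C. This is only an analogy under stated assumptions: fake_image, fake_sampler, texture_object, pack_texture and sample_texture are hypothetical names that do not appear in the kernel, ordinary pointer casts stand in for __builtin_astype, and host C has no address spaces, so the sketch cannot reproduce the link failure itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the opaque image2d_t and sampler_t handles. */
typedef struct { float texel; } fake_image;
typedef struct { int   mode;  } fake_sampler;

/* Same idea as hipTextureObject_st: the handles are stored as integers. */
typedef struct {
  intptr_t image;
  intptr_t sampler;
} texture_object;

/* Pack: reinterpret each handle as an integer so it can live in a struct. */
static texture_object pack_texture(const fake_image *img, const fake_sampler *smp) {
  texture_object t = { (intptr_t)(const void *)img, (intptr_t)(const void *)smp };
  return t;
}

/* Unpack: convert the integers back to handles before using them. */
static float sample_texture(const texture_object *t) {
  assert(t != NULL);
  const fake_image   *img = (const fake_image *)(const void *)t->image;
  const fake_sampler *smp = (const fake_sampler *)(const void *)t->sampler;
  return img->texel * (float)smp->mode;
}
```

In the kernel above, readimg plays the role of pack_texture and tex2D_f plays the role of sample_texture; the difference is that the OpenCL handles are opaque types whose integer representation is not specified by the language.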

paigeale commented 3 years ago

@Kerilk,

Please provide a more detailed log of the failure to assist in debugging. Please also include the exact version of the Intel Compute Runtime that you have downloaded.

It appears you are able to compile the program successfully. To be certain, you could use the offline compiler tool "ocloc" (which should be included in the download) to check whether there are any errors when compiling the above kernel.

If you are able to compile this kernel successfully with ocloc, I recommend moving this issue to https://github.com/intel/compute-runtime/issues, where the compute runtime team can assist you further.

Thanks, Alex

Kerilk commented 3 years ago

@paigeale thanks for your message,

The version of the Compute Runtime I am using is 21.26.20194, but I also tried an older version (21.22.19967). The OpenCL program does compile with ocloc, which reports "Build succeeded.".

I created a small reproducer that I am putting at the end of this post.

The output of the reproducer is:

clBuildProgram: Build Success. LOG:

clCompileProgram: Build Success. LOG:

<origin>: error: Invalid cast (Producer: 'LLVM10.0.0' Reader: 'LLVM 10.0.0')
clLinkProgram: Build Failure. LOG:
error: Parsing llvm module failed!

I built the reproducer using:

gcc texture.c -o texture -lOpenCL -Wall

texture.c:

#define CL_TARGET_OPENCL_VERSION 200

#include <CL/opencl.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

const char *kernel_source =
"#define DIM 32\n\
\n\
typedef struct {\n\
  intptr_t  image;\n\
  intptr_t  sampler;\n\
} hipTextureObject_st, *hipTextureObject_t;\n\
\n\
inline float tex2D_f(hipTextureObject_t textureObject, float x, float y) {\n\
   return read_imagef(__builtin_astype(textureObject->image, read_only image2d_t),\n\
                      __builtin_astype(textureObject->sampler, sampler_t),\n\
                      (float2)(x, y)).x;\n\
}\n\
\n\
void reading_impl(hipTextureObject_t textureObject, global float *out) {\n\
  unsigned int i = get_global_id(0);\n\
  unsigned int j = get_global_id(1);\n\
  out[j*DIM+i] = tex2D_f(textureObject, i, j);\n\
}\n\
\n\
kernel void readimg(read_only image2d_t image, sampler_t sampler, global float *out) {\n\
  hipTextureObject_st textureObject =\n\
    { __builtin_astype(image, intptr_t),\n\
      __builtin_astype(sampler, intptr_t) };\n\
  reading_impl(&textureObject, out);\n\
}";

void print_program_log(cl_program program, cl_device_id device, const char *type) {
  cl_int           result;
  size_t           sz;
  cl_build_status  status;
  char            *log;

  result = clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_STATUS, sizeof(status), &status, NULL);
  assert( CL_SUCCESS == result );
  assert( CL_BUILD_SUCCESS == status || CL_BUILD_ERROR == status );
  if (CL_BUILD_SUCCESS == status)
    printf("%s: Build Success. LOG:\n", type);
  else
    printf("%s: Build Failure. LOG:\n", type);

  /* Query the log size first, then fetch the log itself. */
  result = clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, 0, NULL, &sz);
  assert( CL_SUCCESS == result );
  log = (char *)malloc(sz + 1); /* +1 so that an empty log is still a valid string */
  assert( log );
  result = clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, sz, log, NULL);
  assert( CL_SUCCESS == result );
  log[sz] = '\0';

  printf("%s\n", log);
  free(log);
}

int main() {
  cl_platform_id platform;
  cl_device_id   device;
  cl_context     context;
  cl_program     program1, program2;
  cl_int         result;

  result = clGetPlatformIDs(1, &platform, NULL);
  assert( CL_SUCCESS == result );
  result = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device,  NULL);
  assert( CL_SUCCESS == result );
  context = clCreateContext(NULL, 1, &device, NULL, NULL, &result);
  assert( CL_SUCCESS == result );
  program1 = clCreateProgramWithSource(context, 1, &kernel_source, NULL, &result);
  assert( CL_SUCCESS == result );
  result = clBuildProgram(program1, 0, NULL, "-cl-std=CL2.0", NULL, NULL);
  assert( CL_SUCCESS == result || CL_BUILD_PROGRAM_FAILURE == result );
  print_program_log(program1, device, "clBuildProgram");
  clReleaseProgram(program1);

  program1 = clCreateProgramWithSource(context, 1, &kernel_source, NULL, &result);
  assert( CL_SUCCESS == result );
  result = clCompileProgram(program1, 0, NULL, "-cl-std=CL2.0", 0, NULL, NULL, NULL, NULL);
  assert( CL_SUCCESS == result || CL_COMPILE_PROGRAM_FAILURE == result );
  print_program_log(program1, device, "clCompileProgram");

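  /* In this reproducer a program object is still returned when the link step fails,
     so its log can be queried below. */
  program2 = clLinkProgram(context, 0, NULL, "-cl-std=CL2.0", 1, &program1, NULL, NULL, &result);
  assert( program2 );
  assert( CL_SUCCESS == result || CL_LINK_PROGRAM_FAILURE == result );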
  print_program_log(program2, device, "clLinkProgram");
  clReleaseProgram(program2);
  clReleaseProgram(program1);
  clReleaseContext(context);

  return 0;
}

Edit: please note that the line <origin>: error: Invalid cast (Producer: 'LLVM10.0.0' Reader: 'LLVM 10.0.0') is emitted directly by the compiler, not as part of the clCompileProgram log.

pszymich commented 3 years ago

I see that the linked issue in Compute Runtime has been resolved, so I'm closing this one.

In case this is a separate problem, please reopen, thanks.