Loading module containing printf seems to fail

pieterverstraete commented 10 years ago

I have extended the example that is described in the readme to print out the thread id. The code then looks like this:

#include <stdio.h>

extern "C"   // ensure function name to be exactly "vadd"
{
    __global__ void vadd(const float *a, const float *b, float *c)
    {
        int i = threadIdx.x + blockIdx.x * blockDim.x;
        printf("Thread id %d\n", i);
        c[i] = a[i] + b[i];
    }
}

I then compile this code to PTX using:

nvcc -ptx -arch=sm_20 vadd.cu

I added -arch=sm_20 because device-side printf requires compute capability 2.0 or higher. However, I can no longer load the module into Julia. It keeps complaining about an invalid kernel image:

ERROR: 200
 in error at error.jl:22
 in include_from_node1 at loading.jl:120
while loading /var/cache/workdir/pfverstr/test2.jl, in expression starting on line 16

However, the code does work when I execute the same calls to libcuda (cuInit, cuDeviceGet, cuCtxCreate and cuModuleLoad) from a program written in C++.

pieterverstraete commented 10 years ago

It seems to be related to dlopen: when I load libcuda with dlopen from C++ code, it fails as well.
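For anyone who wants to reproduce the dlopen variant outside Julia, here is a minimal sketch of what such a C++ test could look like. It is not the code from the report: `try_module_load` is a hypothetical helper, the typedefs are simplified stand-ins for the types in cuda.h, and the library name and PTX path are whatever you pass in. The context-creation symbol is a parameter so that different exported names can be compared.

```cpp
#include <dlfcn.h>  // dlopen, dlsym, dlclose (POSIX)

// Simplified stand-ins for driver API types, so the sketch builds without cuda.h.
typedef int   CUresult;   // 0 is CUDA_SUCCESS, 200 is CUDA_ERROR_INVALID_IMAGE
typedef int   CUdevice;
typedef void *CUcontext;
typedef void *CUmodule;

typedef CUresult (*cuInit_t)(unsigned int);
typedef CUresult (*cuDeviceGet_t)(CUdevice *, int);
typedef CUresult (*cuCtxCreate_t)(CUcontext *, unsigned int, CUdevice);
typedef CUresult (*cuModuleLoad_t)(CUmodule *, const char *);

// Hypothetical helper: resolves the driver entry points by name, creates a
// context on device 0 and loads a module. Returns -1 if the library or a
// symbol cannot be resolved, otherwise the first non-zero CUresult, or 0.
int try_module_load(const char *lib, const char *ctx_symbol, const char *ptx_path)
{
    void *handle = dlopen(lib, RTLD_NOW);
    if (!handle)
        return -1;

    cuInit_t       init   = reinterpret_cast<cuInit_t>(dlsym(handle, "cuInit"));
    cuDeviceGet_t  devget = reinterpret_cast<cuDeviceGet_t>(dlsym(handle, "cuDeviceGet"));
    cuCtxCreate_t  create = reinterpret_cast<cuCtxCreate_t>(dlsym(handle, ctx_symbol));
    cuModuleLoad_t load   = reinterpret_cast<cuModuleLoad_t>(dlsym(handle, "cuModuleLoad"));
    if (!init || !devget || !create || !load) {
        dlclose(handle);
        return -1;
    }

    // From here on the library stays loaded for the life of the process;
    // context teardown is omitted to keep the sketch short.
    CUdevice  dev = 0;
    CUcontext ctx = nullptr;
    CUmodule  mod = nullptr;
    CUresult  r;
    if ((r = init(0)) != 0)            return r;
    if ((r = devget(&dev, 0)) != 0)    return r;
    if ((r = create(&ctx, 0, dev)) != 0) return r;
    return load(&mod, ptx_path);
}
```

Calling `try_module_load("libcuda.so", "cuCtxCreate", "vadd.ptx")` and then, in a fresh process, the same call with `"cuCtxCreate_v2"` would show whether the choice of symbol changes the return code. On Linux the file may need `-ldl` when linking.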

lindahua commented 10 years ago

Would you please let me know the OS you are using?

pieterverstraete commented 10 years ago

The result of uname -a is: Linux phoenix 3.13-1-amd64 #1 SMP Debian 3.13.5-1 (2014-03-04) x86_64 GNU/Linux

maleadt commented 10 years ago

This happens because cuCtxCreate is called, whereas more recent API versions require cuCtxCreate_v2. See also my answer at http://stackoverflow.com/questions/22612879/cuda-debug-invalid-kernel-image-error/22634798

cuda.h contains API-conditional code managing this. I guess CUDA.jl should mirror (part of) this, selecting proper API calls based on the requested version?
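To make the idea concrete, here is a rough sketch of the kind of name selection cuda.h performs with its macros, written as a plain C++ lookup. Everything here is illustrative: `driver_symbol` is a hypothetical function, the table lists only a few of the affected entry points, and the 3020 threshold is my reading of the header, so please check it against the cuda.h you actually have installed.

```cpp
#include <set>
#include <string>

// Hypothetical helper mirroring (part of) the renaming done in cuda.h:
// for API versions >= 3020, some entry points are redirected to "_v2" symbols.
// The set below is deliberately incomplete and only meant as an illustration.
std::string driver_symbol(const std::string &name, int api_version)
{
    static const std::set<std::string> v2_since_3020 = {
        "cuDeviceTotalMem", "cuCtxCreate", "cuModuleGetGlobal",
        "cuMemAlloc", "cuMemFree", "cuMemcpyHtoD", "cuMemcpyDtoH",
    };

    // Assumed threshold: 3020 (CUDA 3.2), taken from the header's conditionals.
    if (api_version >= 3020 && v2_since_3020.count(name) > 0)
        return name + "_v2";

    // Functions without a versioned variant keep their plain name.
    return name;
}
```

A wrapper could then resolve `driver_symbol("cuCtxCreate", version)` instead of hard-coding the unversioned name, while calls such as cuInit and cuModuleLoad pass through unchanged.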

lindahua commented 10 years ago

I just updated the package with several bug fixes (e.g. corrected function names and pointer sizes). Please check out the latest version and try again.

JuliaAttic / CUDA.jl

Loading module containing printf seems to fail #2