Closed lxwithgod closed 4 years ago
Does it work for you if you apply the change from https://github.com/hfinkel/llvm-project-cxxjit/issues/13? What version of CUDA, etc. are you using?
I tried the change from #13. I am using CUDA 9.2 (as in your paper) and CUDA 10.0. It doesn't work. This is my code:
template<int size>
__global__ void simpleKernel(float* out) {
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < size) {
        out[idx] = 1.0f;
    }
}

template<int size>
[[clang::jit]] void jit_kernel(float* out) {
    simpleKernel<size><<<1, 1024>>>(out);
}

int main() {
    float* out;
    cudaMalloc((void**)&out, sizeof(float) * 1024);
    jit_kernel<10>(out);
    return 0;
}
I tried both with and without the change from #13, on master and on clangjit-9.0. It compiles fine, but my jit_kernel does not run successfully. Maybe I forgot some steps. Please help me check it.
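One way to narrow down a report like this is to make the reproducer say whether the kernel actually ran. The sketch below is only an illustration, not the fix that was applied to the project: it adds error checks after the launch, synchronizes, and copies the buffer back to count the written elements. `CHECK_CUDA` is a hypothetical helper macro introduced here; it is not part of ClangJIT or the CUDA runtime. The `[[clang::jit]]` attribute is used exactly as in the code above and assumes a ClangJIT build.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (an assumption for this sketch): abort with a
// readable message when a CUDA runtime call returns an error.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

template <int size>
__global__ void simpleKernel(float* out) {
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < size) {
        out[idx] = 1.0f;
    }
}

template <int size>
[[clang::jit]] void jit_kernel(float* out) {
    simpleKernel<size><<<1, 1024>>>(out);
    CHECK_CUDA(cudaGetLastError());       // reports launch failures
    CHECK_CUDA(cudaDeviceSynchronize());  // reports failures during execution
}

int main() {
    constexpr int n = 1024;
    float* out = nullptr;
    CHECK_CUDA(cudaMalloc(&out, sizeof(float) * n));
    CHECK_CUDA(cudaMemset(out, 0, sizeof(float) * n));  // all-zero bytes == 0.0f

    jit_kernel<10>(out);

    // Copy the buffer back and count how many elements the kernel wrote.
    static float host[n];
    CHECK_CUDA(cudaMemcpy(host, out, sizeof(float) * n, cudaMemcpyDeviceToHost));
    int written = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] == 1.0f) {
            ++written;
        }
    }
    std::printf("elements written: %d (10 if the kernel ran)\n", written);

    CHECK_CUDA(cudaFree(out));
    return written == 10 ? 0 : 1;
}
```

If the count stays at 0 and no error is reported, the launch was most likely a silent no-op, which matches the symptom of the kernel not showing up in the profiler.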
> I use it on cuda 9.2 from your paper
In the paper, I was testing on a POWER8+NVIDIA system. Are you using an x86_64 host? Maybe there's some difference.
Yes. I don't have a POWER8 system; I am using an x86_64 host. It fails on both Red Hat and Ubuntu. The C++ JIT works for me, but the CUDA JIT does not.
Hi @hfinkel, were you able to reproduce the problem? Thanks.
Yes. This should now be fixed. There was a bug where the PTX generation would not occur for the first device configuration for which you were compiling.
Please reopen if this still doesn't work for you.
Thanks, it works for me now.
Hi, I am trying this project, but I can't get the CUDA JIT to work. With nvprof I see cudaLaunch/cudaSetupArgument/..., but my kernel does not appear. @hfinkel