Closed bafonso closed 4 years ago
This PTX is invalid, but all the line endings have been re-wrapped, so I can't look up line 488... Could you please just run this in the REPL and copy the output here? Or run `@device code dir="/tmp/test"` and get the PTX file from that folder.
Yes, it was a pain to get that text given I'm running remotely in a tmux session in Juno. I'm sorry, it's not clear what command you want me to run: `@device` is not defined and `@device_code_dir` does not exist either. Is there a way to run it in the REPL and pipe the output to a file? That would be ideal.
`@device_code dir=...`
Here you go, I can copy and paste a specific file if you prefer. Thanks!
Thanks for the files!
Reduced to:

```
.version 6.0
.target sm_61

.func (.param .b32 func_retval0) gpu_report_oom(.param .b64 gpu_report_oom_param_0) {
    .param .b64 param0;
    call.uni gpu_report_oom, (param0);
}
```
It gets confused by the missing retval; this works:

```
.func (.param .b32 func_retval0) gpu_report_oom(.param .b64 gpu_report_oom_param_0) {
    .param .b64 param0;
    .param .b32 retval0;
    call.uni (retval0), gpu_report_oom, (param0);
}
```
This comes from the following invalid LLVM IR:

```llvm
target triple = "nvptx64-nvidia-cuda"

%a = type opaque
%b = type i64

@0 = constant [108 x i8] c"ERROR: a %s was thrown during kernel execution.\0A Run Julia on debug level 2 for device stack traces.\0A\00"
@1 = constant [64 x i8] c"ERROR: Out of dynamic GPU memory (trying to allocate %i bytes)\0A\00"

declare i32 @d(i8*, i8*)

define i32 @gpu_report_oom(i64) {
e:
  alloca %b
  bitcast %b* %1 to i8*
  call i32 @d(i8* getelementptr ([64 x i8], [64 x i8]* @1, i64 0, i64 0), i8* %2)
  ret i32 4
}

define %a addrspace(10)* @f(i64) {
  icmp eq i64 1, 2
  call void bitcast (i32 (i64)* @gpu_report_oom to void (i64)*)(i64 0)
  inttoptr i64 1 to %a*
  addrspacecast %a* %3 to %a addrspace(10)*
  ret %a addrspace(10)* %4
}
```
So bitcasting the `report_oom` function to return `void` is invalid, and this is itself caused by https://github.com/JuliaGPU/CUDAnative.jl/commit/3cbb021c4289054900742d671b06d748e0b20e4a.
This commit seems to be after 3.1.0 was released, yet this is failing with 3.1.0. If I downgrade to 3.0.4, I do not get the error in a remote Juno session. Also, the issue only occurs when using Juno, not when running Julia from a terminal. I noticed this because I uninstalled CUDAnative, installed CUDA, and noticed it was still using 3.0.4.
> This commit seems to be after 3.1.0 was released and this is failing with 3.1.0.
Where do you see this? The commit isn't part of any release.
Oh, I see, this is GPUCompiler.jl; silly me. I'm just confused that you closed the issue. Is this already fixed on master? I'd be happy to test.
It is, see the linked commit: https://github.com/JuliaGPU/CUDAnative.jl/commit/3cbb021c4289054900742d671b06d748e0b20e4a. Nowadays development happens in CUDA.jl, but it's fixed there as well.
I get `CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)` running simple code in Julia using Flux.
This is what Juno outputs:

Running `@device_code_ptx m_gpu(xpto_gpu)` outputs in Juno:

This is the output using the terminal: