Closed chengchingwen closed 6 months ago
I don't see a segfault here, just the expected exception reporting.
The segfault happens if you replace CUDA.randn(10, 10, 10) with a larger array, such as CUDA.randn(512, 128, 16).
In that case you just get a lot of output on your terminal. I assume you hit CTRL-C, which might have killed Julia or CUDA then.
In any case, there isn't much we can do about this, as I/O is currently handled by CUDA. Maybe we could limit output by keeping track of written bytes and capping it, but that doesn't sound very satisfying.
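The byte-capping idea mentioned above could look roughly like the following host-side sketch. It is only an illustration of the bookkeeping: `CappedWriter` and `write_capped!` are hypothetical names, not part of CUDA.jl, and the device-side output discussed in this thread does not pass through a Julia `IO` object like this.

```julia
# Hypothetical helper for illustration only; not part of CUDA.jl.
mutable struct CappedWriter{T<:IO}
    io::T            # underlying stream that receives the output
    limit::Int       # maximum number of payload bytes to forward
    written::Int     # payload bytes forwarded so far
    truncated::Bool  # whether the truncation notice was already emitted
end

CappedWriter(io::IO, limit::Integer) = CappedWriter(io, Int(limit), 0, false)

# Forward at most the remaining byte budget of `s` and return the number of
# payload bytes written. Cutting at a byte boundary can split a multi-byte
# UTF-8 character; a real implementation would need to handle that.
function write_capped!(w::CappedWriter, s::AbstractString)
    bytes = collect(codeunits(s))
    remaining = w.limit - w.written
    n = min(length(bytes), remaining)
    if n > 0
        w.written += write(w.io, bytes[1:n])
    end
    if n < length(bytes) && !w.truncated
        # Emit the notice once; it does not count against the budget.
        write(w.io, "\n[output truncated]\n")
        w.truncated = true
    end
    return n
end
```

For example, wrapping an `IOBuffer` with a limit of 5 and writing "hello world" forwards only "hello" followed by the truncation notice; later writes are dropped silently.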
I didn't hit CTRL-C; I waited for the output to stop, and it still resulted in a segfault. Another issue is that this error is not always captured. On the machine I reported above, the error is not shown unless I start julia with -g2.
> Another issue is that this error is not always captured. On the machine (I reported above) the error is not shown unless I start julia with -g2.
That's intentional. If you run without -g2 you get:
ERROR: a exception was thrown during kernel execution.
Run Julia on debug level 2 for device stack traces.
> That's intentional.
Oh, I thought it would give a compile error, but it seems to run successfully and generate the correct result on that machine.
> Oh, I thought it would give a compile error, but it seems to run successfully and generate the correct result on that machine.
We can't generate a compile error because of Julia's dynamic semantics. You should still see a run-time exception though, albeit without a stack trace (you need -g2 for that).
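As a CPU-only analogy for the point about dynamic semantics (the original MWE is not shown here, so this is not the reported kernel), the sketch below defines a function whose failing branch only raises an error when it actually runs. `risky` and `missing_helper` are made-up names for illustration.

```julia
# CPU-only analogy; `missing_helper` is intentionally never defined.
risky(x) = x > 0 ? x : missing_helper(x)

# Defining `risky` succeeds, and so does any call that avoids the bad branch.
ok = risky(1)

# The failure only appears once the bad branch actually executes.
caught = try
    risky(-1)
    nothing
catch err
    err
end

println(ok)
println(typeof(caught))
```

Nothing is rejected when `risky` is defined; the undefined name is only looked up when the second branch is taken, which is why the problem surfaces as a run-time exception rather than a compile error.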
> You should still see a run-time exception though
I didn't get any exception on that machine. And on another machine, it failed randomly.
Describe the bug
This crashes Julia if the array is large. It happened on both 1.8.5 and 1.9-beta4.
To reproduce
The Minimal Working Example (MWE) for this bug:
Manifest.toml
```
[052768ef] CUDA v4.0.1
[1af6417a] CUDA_Runtime_Discovery v0.1.1
[0c68f7d7] GPUArrays v8.6.3
[46192b85] GPUArraysCore v0.1.4
[61eb1bfa] GPUCompiler v0.17.2
[929cbde3] LLVM v4.16.0
⌅ [4ee394cb] CUDA_Driver_jll v0.2.0+0
⌅ [76a88914] CUDA_Runtime_jll v0.2.3+2
[62b44479] CUDNN_jll v8.6.0+3
```
Version info
Details on Julia:
Details on CUDA: