chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

@assertOnGpu failed during runtime #24619

Closed xianghao-wang closed 5 months ago

xianghao-wang commented 6 months ago

assertOnGpu failed during runtime

@assertOnGpu fails at runtime on a simple kernel. The problem may be related to a newer version of CUDA.

Environment

GPU: RTX 4060
Chapel: chpl 1.33.0
Driver version: 550.54.14
CUDA version: 12.4

Steps to Reproduce

Source Code:

// Test.chpl
proc main() {
    use GPU;
    writeln(here.gpus[0]);

    on here.gpus[0] {
        var arr: [0..#512] real;
        @assertOnGpu forall i in 0..#512 {
            arr[i] = i;
        }

        writeln(arr[12]);
    }
}

Compile, Run and Output

Running chpl Test.chpl && ./Test gives the following error message.

LOCALE0-GPU0
Test.chpl:7: error: assertOnGpu() failed

Execution command:

Configuration Information

e-kayrakli commented 6 months ago

Ugh.. we should have done a better job at this. https://github.com/chapel-lang/chapel/pull/24620 will make this setup a configuration error.

@xianghao-wang -- you can't use CHPL_TASKS: fifo with GPU support. See https://chapel-lang.org/docs/main/technotes/gpu.html#known-limitations. We did have an error message for that, but for some reason it was just covering the cpu-as-device mode. I am sorry.

bradcray commented 6 months ago

Tagging on to @e-kayrakli's response. Engin, I think you're saying that the fix that @xianghao-wang should apply here is to set CHPL_TASKS=qthreads, is that right?

@xianghao-wang: Was there a reason you preferred CHPL_TASKS=fifo? Checking the rest of your settings, I'm guessing that you may be using Chapel's "quickstart" configuration rather than the preferred configuration, is that the case? (If so, then Engin, we may want to put something into the GPU tech note suggesting users should be sure that they've enabled the preferred configuration before starting to use GPUs?)

e-kayrakli commented 6 months ago

> Tagging on to @e-kayrakli's response. Engin, I think you're saying that the fix that @xianghao-wang should apply here is to set CHPL_TASKS=qthreads, is that right?

Oh, yes, sorry that I left it out. Or, you can simply leave it unset. It'll default to qthreads.
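For readers who want to catch this before compiling, here is a minimal sketch of a pre-flight check. The function name check_chpl_tasks is hypothetical and not part of Chapel. It only inspects the CHPL_TASKS environment variable (or a value passed as an argument), so it will not notice a value set elsewhere, such as in a config file.

```shell
#!/bin/sh
# Hypothetical helper, not part of Chapel: rejects CHPL_TASKS=fifo,
# which this thread reports as unsupported together with GPU support.
# Takes the value to check as $1, falling back to $CHPL_TASKS.
check_chpl_tasks() {
    tasks="${1-${CHPL_TASKS-}}"
    case "$tasks" in
        fifo)
            echo "error: CHPL_TASKS=fifo cannot be used with GPU support;" \
                 "unset it or set CHPL_TASKS=qthreads" >&2
            return 1
            ;;
        ""|qthreads)
            # Unset defaults to qthreads, per the comment above.
            return 0
            ;;
        *)
            # Other tasking layers are outside what this thread discusses.
            echo "warning: CHPL_TASKS=$tasks is not covered by this check" >&2
            return 0
            ;;
    esac
}
```

One possible use is `check_chpl_tasks && chpl Test.chpl`, so that the compile step is skipped when the environment still carries the quickstart setting.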

> If so, then Engin, we may want to put something into the GPU tech note suggesting users should be sure that they've enabled the preferred configuration before starting to use GPUs?

There's a note about CHPL_TASKS=fifo already, but I think you are asking for something more prominent and specific about quickstart vs preferred. My PR https://github.com/chapel-lang/chapel/pull/24621 will update the technote. I just added a statement like this that comes up fairly early in the note:

image

Is that what you're thinking of?

e-kayrakli commented 6 months ago

@xianghao-wang -- also just so you know, we are midway through the process of preparing the 2.0 release. But today is a bit too late to get the code in. So, in all likelihood, 2.0 will behave the same way as 1.33 w.r.t. this (you have to make sure to set your CHPL_TASKS correctly as the user).

bradcray commented 6 months ago

> Is that what you're thinking of?

Yep, that looks great, thanks!

xianghao-wang commented 6 months ago

> Tagging on to @e-kayrakli's response. Engin, I think you're saying that the fix that @xianghao-wang should apply here is to set CHPL_TASKS=qthreads, is that right?
>
> @xianghao-wang: Was there a reason you preferred CHPL_TASKS=fifo? Checking the rest of your settings, I'm guessing that you may be using Chapel's "quickstart" configuration rather than the preferred configuration, is that the case? (If so, then Engin, we may want to put something into the GPU tech note suggesting users should be sure that they've enabled the preferred configuration before starting to use GPUs?)

Thanks. I found that I was using the quickstart configuration. There was no special reason for using fifo. I switched to util/setchplenv.sh and set CHPL_TASKS=qthreads, and it works now.
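The switch described above could be sketched roughly as follows. The function name use_preferred_config and its argument are hypothetical. The sketch assumes that the script should be sourced from the Chapel root directory, and that the runtime may need rebuilding afterwards; neither detail is stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of Chapel: source the preferred
# environment script and force the qthreads tasking layer.
# $1 is the Chapel root directory (an assumption of this sketch).
use_preferred_config() {
    chpl_home="$1"
    if [ ! -f "$chpl_home/util/setchplenv.sh" ]; then
        echo "error: $chpl_home/util/setchplenv.sh not found" >&2
        return 1
    fi
    # Assumption: the script is meant to be sourced from the root directory.
    cd "$chpl_home" || return 1
    . util/setchplenv.sh
    # Override any CHPL_TASKS=fifo left over from the quickstart setup.
    CHPL_TASKS=qthreads
    export CHPL_TASKS
}
# Assumption: after switching, Chapel itself likely needs to be rebuilt
# (for example with make in the root directory) before recompiling Test.chpl.
```

Because the function changes the directory and exports variables, it has to run in the current shell (not a subshell) for the settings to persist.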

e-kayrakli commented 6 months ago

Thanks, @xianghao-wang -- this issue will be closed when I merge the PR above.

lydia-duncan commented 5 months ago

@xianghao-wang - you've probably already seen it, but noting that @e-kayrakli resolved this issue with his PR merge. Thanks for reporting the problem, and thanks, Engin, for fixing it!