chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

[Bug]: internal error: RES-LOW-ORS-1637 chpl version 2.0.0, foreach and task-private variable in with-statement #24764

Closed etscheelk closed 3 months ago

etscheelk commented 3 months ago

Summary of Problem

Description:

Declaring a variable in a foreach loop's with-clause, as in foreach ... with (var y = 1), leads to an internal compiler error. The problem appears specific to variable creation inside the with task-intent clause; other intents, such as ref, const, and const ref, seem to compile correctly.

I first noticed it after rebuilding for GPU support and trying it out, but it still occurs when Chapel is rebuilt without GPU support. There is no error on forall. A coforall gives an intentional error saying that task-private variables are not supported for coforall, begin, or cobegin, which makes sense.

The following is the error message I received at compile-time:

internal error: RES-LOW-ORS-1637 chpl version 2.0.0

Internal errors indicate a bug in the Chapel compiler, and we're sorry for the hassle. We would appreciate your reporting this bug -- please see https://chapel-lang.org/bugs.html for instructions.

Is this a blocking issue with no known work-arounds?

yes, seems so

Steps to Reproduce

Use a foreach loop with a with-statement and a task-private variable.

Source Code:

The minimum required to cause the issue

foreach i in 0..#1 
with (
    var y = 1
)
{

}
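
For comparison, the forall form of the same loop compiles without error (a quick sketch, matching the "no error on forall" note above):

forall i in 0..#1 with (var y = 1)
{

}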

Additional more-complete little test

// This is a distillation of something else I was writing
config const len = 100_000;

var x : atomic int = 0;

foreach i in 0..#len 
with (
    ref x,
    var y = 1
)
{
    x.add(y);
}

writeln(x.read());

Compile command:

chpl test.chpl

Execution command:

Not applicable, compilation error.

Associated Future Test(s):

I don't know?

Configuration Information

Ubuntu clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

- (For Cray systems only) Output of `module list`: N/A
stonea commented 3 months ago

We recently added support for ref, const, and in intents on CPU-bound foreach loops; reduce and var intents remain as future work.
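
For example, a loop along these lines (a sketch with illustrative names, not taken from the issue) should compile and behave as expected with the current support:

var A: [0..#10] int;
const scale = 3;
var offset = 1;

foreach i in A.domain with (ref A, const scale, in offset) {
  // ref, const, and in are the foreach intents supported today
  A[i] = i*scale + offset;
}
writeln(A);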

Definitely, having it assert out rather than give a more friendly error message is an oversight and bug on our part, so apologies for that.

Since you're looking at it, I'll also mention that intent support for GPUs is a work-in-progress. Currently, on a GPU-bound loop, const intents should work as you expect, in intents will not, and ref intents will exhibit the "old behavior" we had before introducing foreach intents (basically, arrays and objects are passed by reference to the kernel and scalars are passed by value). This is something we're hoping to address soon.
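
Concretely, on a GPU-enabled build, a const intent should already behave as described (a small sketch):

const scale = 3;

on here.gpus[0] {
  var A: [0..#10] int;              // allocated in GPU memory
  foreach i in A.domain with (const scale) {
    A[i] = i * scale;               // scale is visible in the kernel as expected
  }
  writeln(A);
}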

e-kayrakli commented 3 months ago

@etscheelk -- I am curious about the "more-complete" reproducer and wanted to mention some other limitations, some due to inherent challenges with GPUs, some because we haven't prioritized them yet.

System-wide atomics are not supported yet. Supporting them is a bit tricky, but vendors have started to roll out some library support in recent years. We hope to be able to use them in the future. What this means is:

var x: atomic int; // this is on CPU

on here.gpus[0] {
  foreach i in 1..10 with (ref x) {
    x.add(i);  // this is executing on GPU
  }
}

is not supported.

Per-GPU atomics are a bit easier to achieve, but we haven't prioritized them. If your case needs them, let us know. What this means is:

on here.gpus[0] {
  var x: atomic int; // this is on GPU, now
  foreach i in 1..10 with (ref x) {
    x.add(i);  // this is executing on GPU
  }
}

is not supported either.

What we support, however, is more "conventional" means of doing atomics on the GPU:

use GPU;  // gpuAtomicAdd comes from the GPU module

on here.gpus[0] {
  var Arr: [1..n] int;  // regular, non-atomic ints, allocated in GPU memory
  foreach i in 1..10 {
    // n and foo() are placeholders for a real size and binning function
    gpuAtomicAdd(Arr[foo(i)], i);
  }
}

This could allow you to do things like histogramming and random-access atomics.
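
As a rough sketch of the histogramming idea (the sizes, names, and binning here are just for illustration):

use GPU;   // gpuAtomicAdd lives in the GPU module

config const nPoints = 1_000_000, nBins = 256;
var HostHist: [0..#nBins] int;

on here.gpus[0] {
  var Hist: [0..#nBins] int;      // allocated in GPU memory
  foreach i in 0..#nPoints {
    const bin = i % nBins;        // stand-in for a real binning function
    gpuAtomicAdd(Hist[bin], 1);   // random-access atomic update inside the kernel
  }
  HostHist = Hist;                // copy the result back to host memory
}

writeln(+ reduce HostHist);       // should print nPoints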


If you have a particular direction that may require any of those idioms let us know. Chapel has a lot of parallel programming features that we want to support on GPUs as well. Hearing from users is always helpful for prioritization.

stonea commented 3 months ago

Here's a PR to patch in a slightly better error: https://github.com/chapel-lang/chapel/pull/24769

Of course, longer term our goal is to add actual support for 'var' intents.

etscheelk commented 3 months ago

That's exactly what I figured was happening: that the var intent is currently unimplemented or hypothetical. Its meaning could also be a little murky on the GPU, depending on blocks, block sizes, and however Chapel abstracts those details.

Originally I was using a forall, and the with-clause was creating a few task-private variables: a random stream for each task, positions initialized locally to each task, and a reference to a global array that could be accessed randomly.
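
Roughly, the shape was something like this sketch (names and sizes are illustrative, not my actual code):

use Random;

config const nSteps = 1_000_000;
var Grid: [0..#8192, 0..#8192] int;   // density map

forall step in 0..#nSteps with (var rng = new randomStream(int),
                                ref Grid) {
  // each task constructs its own stream (its own seed), so no locking is needed
  const row = rng.next(0, 8191),
        col = rng.next(0, 8191);
  Grid[row, col] += 1;   // not atomic; occasional lost updates are tolerable here
}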

Regarding the atomics, my particular case would use the random-access atomics you pointed out, so thanks for the suggestion. Ultimately they're not strictly necessary, especially at larger grid sizes, since collisions will be infrequent and the occasional one isn't highly problematic.

The plan is roughly a billion points repeatedly transformed to create a fractal, with the grid acting as a density map, perhaps 8192x8192. A few things I still need to figure out:

- How to visualize this (I'm considering compilation to a library, visualized elsewhere)
- GPU install doesn't recognize a device in the here.gpus array
- Parallel random number generation (should I pre-create a stream with fill? I'm also used to the TRNG C++ library for parallel random number generation) (random numbers on gpu are also a little more complicated matter)

The project is inspired by this code: https://github.com/pcantrell/density-fractals/tree/main/Source (fractals created via transformations of rectangular and polar coordinates).

Sorry to turn this into stack overflow, I'll likely bring these questions and considerations there.

e-kayrakli commented 3 months ago

Sorry to turn this into stack overflow, I'll likely bring these questions and considerations there.

We're happy to help. Our gitter channel is also suitable for more interactive conversations: https://gitter.im/chapel-lang/chapel. But quick answers to your questions in case they can help:

How to visualize this (I'm considering compilation to a library, visualized elsewhere)

You could also look into using a C library and interoperating from Chapel. See C Interoperability
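
For example, wrapping an existing C routine can be as small as this sketch (the header, library, and function names are hypothetical, just to show the shape):

// hypothetical C library exposing: int write_image(unsigned char *data, int w, int h);
require "my_viz.h", "-lmy_viz";
use CTypes;

extern proc write_image(data: c_ptr(c_uchar), w: c_int, h: c_int): c_int;

var pixels: [0..#(64*64*3)] uint(8);   // RGB bytes for a 64x64 image
// ... fill pixels from the density grid ...
const rc = write_image(c_ptrTo(pixels), 64, 64);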

GPU install doesn't recognize a device in the here.gpus array

This one could be a separate issue or a gitter conversation we can help with. Chapel built with CHPL_LOCALE_MODEL=gpu should be able to handle that. Maybe the runtime was built with the default CHPL_LOCALE_MODEL=flat.
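
A quick sanity check (a sketch; assumes the runtime was built with CHPL_LOCALE_MODEL=gpu and at least one device is visible) is to print what the runtime sees:

writeln("GPU sublocales on this locale: ", here.gpus.size);
writeln(here.gpus);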

Parallel random number generation (should I pre-create a stream with fill? I'm also used to the TRNG C++ library for parallel random number generation) (random numbers on gpu are also a little more complicated matter)

In some of our test codes, we use fillRandom to prepopulate an array on the host and then copy it to the device. As you alluded to, random number generation on the GPU is a more complicated matter. An idiomatic way of doing that is something along the lines of:

import Random;
var CpuArr: [1..10] real;

Random.fillRandom(CpuArr);

writeln(CpuArr);

on here.gpus[0] {
  var GpuArr = CpuArr;
  GpuArr += 1; // this is a kernel launch using random data, for example

  writeln(GpuArr);
}
e-kayrakli commented 3 months ago

How to visualize this (I'm considering compilation to a library, visualized elsewhere)

As an addendum, @mppf pointed out that https://github.com/chapel-lang/chapel/blob/main/test/exercises/c-ray/Image.chpl has an implementation for PPM and BMP output from a Chapel array. You might want to check it out.
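
For a sense of scale, a bare-bones ASCII PPM writer in the same spirit (a sketch of my own, not the Image.chpl code itself) is only a few lines:

use IO;

// write a 2D array of (r, g, b) byte tuples as an ASCII PPM (P3) file
proc writePPM(path: string, Pixels: [?D] 3*uint(8)) throws where D.rank == 2 {
  var w = openWriter(path);
  const (rows, cols) = D.shape;
  w.writeln("P3");
  w.writeln(cols, " ", rows);
  w.writeln(255);                       // max channel value
  for (r, g, b) in Pixels do
    w.writeln(r, " ", g, " ", b);
  w.close();
}

Copy the grid back to the host, map densities to bytes, and call writePPM on the result.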

bradcray commented 3 months ago

@etscheelk: One other option for reading/writing images: quite a while ago, we had a user write an introduction to Chapel through image processing, which you should be able to access here: http://primachvis.com/html/imgproc_chapel.html

It mentions reading/writing PNG files via interoperability with C, so that may be something to leverage.

As that page notes, the language has evolved significantly since then, which broke the code used at the time, but @Guillaume-Helbecque has recently (and generously) undertaken an effort to modernize it in https://github.com/chapel-lang/chapel/pull/24245

etscheelk commented 3 months ago

This one could be a separate issue or a gitter conversation we can help with. Chapel built with CHPL_LOCALE_MODEL=gpu should be able to handle that. Maybe the runtime was built with the default CHPL_LOCALE_MODEL=flat.

I'll take another stab when I'm back home on my desktop, but I'll take a look at gitter if I continue to have similar problems.

image processing, which you should be able to access here: http://primachvis.com/html/imgproc_chapel.html

PPM and BMP output from a Chapel array

Thanks for the output suggestions!

etscheelk commented 3 months ago

foreach with (var) not implemented yet, but better message prints to user indicating it is not implemented yet #24769

e-kayrakli commented 3 months ago

I'll take another stab when I'm back home on my desktop, but I'll take a look at gitter if I continue to have similar problems.

Sounds good, thanks @etscheelk!

Suggestions for image IO keep coming from my team, so I'll drop another one here, from @mstrout. This one's with PNG: https://github.com/mstrout/ChapelForPythonProgrammersMay2023/tree/main/image_analysis_example


Also, ChapelCon could be a good opportunity to share your work (deadline is next week Friday) or learn more about Chapel and interact with the community. In case you missed it: https://chapel-lang.org/ChapelCon24.html