clang-omp / clang

clang with OpenMP 3.1 and some elements of OpenMP 4.0 support
clang-omp.github.com
Other
91 stars 15 forks source link

Crash without explicit private() clause #56

Closed tycho closed 9 years ago

tycho commented 9 years ago

Just built my n-body test (https://github.com/tycho/nbody) using the latest clang-omp git trees:

clang version 3.5.0 (https://github.com/clang-omp/clang 7e2e79bca762cd30fc454daa716483fdd7dac61a) (https://github.com/clang-omp/llvm e45b045553e027cbe400cbb8ac8c264abbbfaf83)

And I've discovered it has some bad code generation with two of the n-body implementations. I was able to avoid the segmentation faults by doing this change: https://github.com/tycho/nbody/commit/44f8e9fa0df795ffa4dd12445c8efc23b6bad89f

It appears to me that it's not automatically discovering that both variables should already be private without that clause (since iTile will be constant throughout the parallel body, and jTile is the per-thread variable).

alexey-bataev commented 9 years ago

Hi Steven, I don't think your original code is correct. iTile variable is shared inside 'omp parallel for' region by default, if you need to make it private you must mark it private explicitly. You don't need to mark jTile as private, it will be privatized automatically.

Best regards,

Alexey Bataev

Software Engineer Intel Compiler Team Intel Corp.

11.12.2014 7:28, Steven Noonan пишет:

Just built my n-body test (https://github.com/tycho/nbody) using the latest clang-omp git trees:

clang version 3.5.0 (https://github.com/clang-omp/clang 7e2e79b https://github.com/clang-omp/clang/commit/7e2e79bca762cd30fc454daa716483fdd7dac61a) (https://github.com/clang-omp/llvm e45b045553e027cbe400cbb8ac8c264abbbfaf83)

And I've discovered it has some bad code generation with two of the n-body implementations. I was able to avoid the segmentation faults by doing this change: tycho/nbody@44f8e9f https://github.com/tycho/nbody/commit/44f8e9fa0df795ffa4dd12445c8efc23b6bad89f

It appears to me that it's not automatically discovering that both variables should already be private without that clause (since iTile will be constant throughout the parallel body, and jTile is the per-thread variable).

— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56.

tycho commented 9 years ago

Are you certain? From my understanding there's an implicit barrier at the end of the inner for loop, so the iTile variable should remain constant throughout the parallel section. Am I wrong?

tycho commented 9 years ago

Ah, I think there's a misunderstanding -- the iTile should be OK to be shared, because it should be constant through the parallel section. But without marking it private I get a crash.

alexey-bataev commented 9 years ago

Hmm, does your code work without OpenMP at all? Because when you're marking some variable as private, there is a private copy with the default initialization is generated. So, when you marks iTile as private, you're creating private iTile with undefined value, which is used inside the OpenMP region. Try to compile and execute you're code without OpenMP.

Best regards,

Alexey Bataev

Software Engineer Intel Compiler Team Intel Corp.

11.12.2014 7:54, Steven Noonan пишет:

Ah, I think there's a misunderstanding -- the iTile should be OK to be shared, because it should be constant through the parallel section. But without marking it private I get a crash.

— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66569063.

tycho commented 9 years ago

Everything works when built without OpenMP or built with OpenMP and GCC. But under Clang it's breaking.

tycho commented 9 years ago

Also, based on the relative error calculations you're definitely right, the private clause was the wrong workaround, it's definitely breaking the calculations.

tycho commented 9 years ago

If you try the 'master' branch out you'll see the crash I'm talking about -- that code functions correctly on clang/gcc without OpenMP and with GCC+OpenMP, but not with Clang+OpenMP.

alexey-bataev commented 9 years ago

Ok, I'll try

Best regards,

Alexey Bataev

Software Engineer Intel Compiler Team Intel Corp.

11.12.2014 8:10, Steven Noonan пишет:

If you try the 'master' branch out you'll see the crash I'm talking about -- that code functions correctly on clang/gcc without OpenMP and with GCC+OpenMP, but not with Clang+OpenMP.

— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66569962.

tycho commented 9 years ago

I should have given some instructions on what to run with. This will cycle through the algorithms and you'll get crashes on the SOA_tiled and AOS_tiled ones:

env KMP_AFFINITY=scatter,verbose ./nbody --bodies 16 --cycle-after 5 --iterations 1

alexey-bataev commented 9 years ago

Hi, I fixed a bug, you can try once again.

Best regards,

Alexey Bataev

Software Engineer Intel Compiler Team Intel Corp.

11.12.2014 8:15, Steven Noonan пишет:

I should have given some instructions on what to run with. This will cycle through the algorithms and you'll get crashes on the SOA_tiled and AOS_tiled ones:

env KMP_AFFINITY=scatter,verbose ./nbody --bodies 16 --cycle-after 5 --iterations 1

— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66570328.

tycho commented 9 years ago

Tested it, works great.

I'm looking at the commit you added. The lack of comments make it a bit hard to interpret what's happening (yes, I understand it's prototype code and will be cleaner once it's all in mainline clang)... What was going wrong?

alexey-bataev commented 9 years ago

There was a problem with an unsigned loop control variables. In some rare cases we incorrectly calculated number of iterations for loops with such variables and it may cause an overflow in loop counter.

Best regards,

Alexey Bataev

Software Engineer Intel Compiler Team Intel Corp.

11.12.2014 23:49, Steven Noonan пишет:

Tested it, works great.

I'm looking at the commit you added. The lack of comments make it a bit hard to interpret what's happening (yes, I understand it's prototype code and will be cleaner once it's all in mainline clang)... What was going wrong?

— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66686167.

tycho commented 9 years ago

Awesome, thanks for the fix. Looking forward to seeing this OpenMP stuff hit mainline.