Closed tycho closed 9 years ago
Hi Steven, I don't think your original code is correct. iTile variable is shared inside 'omp parallel for' region by default, if you need to make it private you must mark it private explicitly. You don't need to mark jTile as private, it will be privatized automatically.
Best regards,
Software Engineer Intel Compiler Team Intel Corp.
11.12.2014 7:28, Steven Noonan пишет:
Just built my n-body test (https://github.com/tycho/nbody) using the latest clang-omp git trees:
clang version 3.5.0 (https://github.com/clang-omp/clang 7e2e79b https://github.com/clang-omp/clang/commit/7e2e79bca762cd30fc454daa716483fdd7dac61a) (https://github.com/clang-omp/llvm e45b045553e027cbe400cbb8ac8c264abbbfaf83)
And I've discovered it has some bad code generation with two of the n-body implementations. I was able to avoid the segmentation faults by doing this change: tycho/nbody@44f8e9f https://github.com/tycho/nbody/commit/44f8e9fa0df795ffa4dd12445c8efc23b6bad89f
It appears to me that it's not automatically discovering that both variables should already be private without that clause (since iTile will be constant throughout the parallel body, and jTile is the per-thread variable).
— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56.
Are you certain? From my understanding there's an implicit barrier at the end of the inner for loop, so the iTile variable should remain constant throughout the parallel section. Am I wrong?
Ah, I think there's a misunderstanding -- the iTile should be OK to be shared, because it should be constant through the parallel section. But without marking it private I get a crash.
Hmm, does your code work without OpenMP at all? Because when you're marking some variable as private, there is a private copy with the default initialization is generated. So, when you marks iTile as private, you're creating private iTile with undefined value, which is used inside the OpenMP region. Try to compile and execute you're code without OpenMP.
Best regards,
Software Engineer Intel Compiler Team Intel Corp.
11.12.2014 7:54, Steven Noonan пишет:
Ah, I think there's a misunderstanding -- the iTile should be OK to be shared, because it should be constant through the parallel section. But without marking it private I get a crash.
— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66569063.
Everything works when built without OpenMP or built with OpenMP and GCC. But under Clang it's breaking.
Also, based on the relative error calculations you're definitely right, the private clause was the wrong workaround, it's definitely breaking the calculations.
If you try the 'master' branch out you'll see the crash I'm talking about -- that code functions correctly on clang/gcc without OpenMP and with GCC+OpenMP, but not with Clang+OpenMP.
Ok, I'll try
Best regards,
Software Engineer Intel Compiler Team Intel Corp.
11.12.2014 8:10, Steven Noonan пишет:
If you try the 'master' branch out you'll see the crash I'm talking about -- that code functions correctly on clang/gcc without OpenMP and with GCC+OpenMP, but not with Clang+OpenMP.
— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66569962.
I should have given some instructions on what to run with. This will cycle through the algorithms and you'll get crashes on the SOA_tiled and AOS_tiled ones:
env KMP_AFFINITY=scatter,verbose ./nbody --bodies 16 --cycle-after 5 --iterations 1
Hi, I fixed a bug, you can try once again.
Best regards,
Software Engineer Intel Compiler Team Intel Corp.
11.12.2014 8:15, Steven Noonan пишет:
I should have given some instructions on what to run with. This will cycle through the algorithms and you'll get crashes on the SOA_tiled and AOS_tiled ones:
env KMP_AFFINITY=scatter,verbose ./nbody --bodies 16 --cycle-after 5 --iterations 1
— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66570328.
Tested it, works great.
I'm looking at the commit you added. The lack of comments make it a bit hard to interpret what's happening (yes, I understand it's prototype code and will be cleaner once it's all in mainline clang)... What was going wrong?
There was a problem with an unsigned loop control variables. In some rare cases we incorrectly calculated number of iterations for loops with such variables and it may cause an overflow in loop counter.
Best regards,
Software Engineer Intel Compiler Team Intel Corp.
11.12.2014 23:49, Steven Noonan пишет:
Tested it, works great.
I'm looking at the commit you added. The lack of comments make it a bit hard to interpret what's happening (yes, I understand it's prototype code and will be cleaner once it's all in mainline clang)... What was going wrong?
— Reply to this email directly or view it on GitHub https://github.com/clang-omp/clang/issues/56#issuecomment-66686167.
Awesome, thanks for the fix. Looking forward to seeing this OpenMP stuff hit mainline.
Just built my n-body test (https://github.com/tycho/nbody) using the latest clang-omp git trees:
clang version 3.5.0 (https://github.com/clang-omp/clang 7e2e79bca762cd30fc454daa716483fdd7dac61a) (https://github.com/clang-omp/llvm e45b045553e027cbe400cbb8ac8c264abbbfaf83)
And I've discovered it has some bad code generation with two of the n-body implementations. I was able to avoid the segmentation faults by doing this change: https://github.com/tycho/nbody/commit/44f8e9fa0df795ffa4dd12445c8efc23b6bad89f
It appears to me that it's not automatically discovering that both variables should already be private without that clause (since iTile will be constant throughout the parallel body, and jTile is the per-thread variable).