Closed by mattwala 7 years ago
I've reproduced the incorrect results, but when I run under Oclgrind with the --data-races
flag it reports several read-write race conditions, which could be the cause of the errors.
I'll try to investigate further soon.
If I disable compiler optimisations it seems to work though, so this probably is indeed a bug in either Oclgrind or LLVM.
Does the boxtree code also work if you disable optimisations (--build-options -O0)?
Thanks for taking a look.
The scan in boxtree works when I disable optimizations (I'm getting other apparently unrelated errors now, but I'm still sorting out what the cause is.)
OK, this is an LLVM bug. I've submitted a patch that fixes it, but it might take a little while to land. I've checked that your test case produces the correct results with Oclgrind built against LLVM trunk with that patch applied.
I'll keep this bug open until that LLVM patch lands. I've also added a test case that exposes this bug to Oclgrind's test suite, marked as XFAIL for now.
The LLVM fix has now landed, so if you build Oclgrind against LLVM trunk you should see the correct behaviour for your test case. If you have any further problems please open a new issue.
Thanks for your help on this!
See this gist for the problem. In essence, it's trying to do a scan over an array of structs, each of which has 3 elements. The output is different from what I would expect but Oclgrind doesn't report any error message.
This is a simplified example of what happens when you try to run Oclgrind on boxtree.
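For readers without the gist at hand, here is a minimal sequential sketch of what a correct inclusive scan over an array of 3-element structs should produce. It is not the gist's kernel: the `Triple` type, its field names, and the element-wise addition operator are all assumptions made for illustration, since the thread does not show the actual struct or scan operator.

```python
from dataclasses import dataclass
from itertools import accumulate


# Hypothetical 3-field struct; the real field names and types are not given in the thread.
@dataclass(frozen=True)
class Triple:
    a: int
    b: int
    c: int


def combine(x: Triple, y: Triple) -> Triple:
    # Assumed scan operator: element-wise addition of the three fields.
    return Triple(x.a + y.a, x.b + y.b, x.c + y.c)


def inclusive_scan(items):
    # Sequential reference: output[i] combines items[0..i] in order.
    return list(accumulate(items, combine))


data = [Triple(1, 2, 3), Triple(4, 5, 6), Triple(7, 8, 9)]
result = inclusive_scan(data)
```

A reference like this is useful for comparing against the output of the OpenCL kernel, both natively and under Oclgrind, to pin down which elements come out wrong.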
cc: @inducer