jrprice / Oclgrind

An OpenCL device simulator and debugger

PyOpenCL scan with struct values gives wrong results #134

Closed mattwala closed 7 years ago

mattwala commented 7 years ago

See this gist for the problem. In essence, it runs a scan over an array of structs, each of which has 3 elements. The output differs from what I would expect, but Oclgrind doesn't report any error.

This is a simplified example of what happens when you try to run Oclgrind on boxtree.
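For readers without access to the gist, the kind of check described above can be sketched on the host side. This is only an illustration under stated assumptions: the 3-field struct is modelled as a plain tuple, the scan operator is assumed to be component-wise addition, and `combine` and `reference_inclusive_scan` are hypothetical names. The actual struct layout and operator are defined in the gist and may differ.

```python
from itertools import accumulate

# Hypothetical 3-field struct, modelled as a tuple (a, b, c).
# The real struct and scan operator are in the gist and are not reproduced here.
def combine(x, y):
    """Component-wise addition, assumed here as the scan operator."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def reference_inclusive_scan(items):
    """Sequential inclusive scan, usable as a reference for device output."""
    return list(accumulate(items, combine))

# Small input: five structs with three fields each.
data = [(i, 2 * i, 1) for i in range(5)]
expected = reference_inclusive_scan(data)
print(expected)
```

Comparing the device result against a sequential reference like this is one way to notice wrong output even when the simulator itself reports no error.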

cc: @inducer

jrprice commented 7 years ago

I've reproduced the incorrect results. When I run Oclgrind with the --data-races flag, it reports several read-write race conditions, which could be the cause of the errors.

I'll try and investigate further soon.

jrprice commented 7 years ago

If I disable compiler optimisations it seems to work, though, so this is probably a bug in either Oclgrind or LLVM.

Does the boxtree code also work if you disable optimisations (--build-options -O0)?
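The two invocations mentioned so far in this thread (`--data-races` and `--build-options -O0`) could be assembled as in the sketch below. The helper `oclgrind_command` and the script name `scan_test.py` are hypothetical placeholders, and the sketch only builds and prints the command lines rather than running them.

```python
import shlex

def oclgrind_command(script, data_races=False, build_options=None):
    """Build an oclgrind command line for a Python test script.

    `script` is a placeholder for whatever test file is being run.
    """
    cmd = ["oclgrind"]
    if data_races:
        # Ask Oclgrind to report data races, as used earlier in the thread.
        cmd.append("--data-races")
    if build_options:
        # Extra options forwarded to the OpenCL compiler, e.g. "-O0".
        cmd += ["--build-options", build_options]
    cmd += ["python", script]
    return cmd

# Print the two variants discussed above.
print(shlex.join(oclgrind_command("scan_test.py", data_races=True)))
print(shlex.join(oclgrind_command("scan_test.py", build_options="-O0")))
```

Passing the resulting list to `subprocess.run` would execute it, assuming `oclgrind` is installed and on the `PATH`.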

mattwala commented 7 years ago

Thanks for taking a look.

The scan in boxtree works when I disable optimizations. (I'm now getting other, apparently unrelated errors, but I'm still sorting out the cause.)

jrprice commented 7 years ago

OK, this is an LLVM bug. I've submitted a patch that fixes it, but it might take a little while to land. I've checked that your test case produces the correct results with Oclgrind built against LLVM trunk with that patch applied.

I'll keep this bug open until that LLVM patch lands. I've also added a test case that exposes this bug to Oclgrind's test suite, marked as XFAIL for now.

jrprice commented 7 years ago

The LLVM fix has now landed, so if you build Oclgrind against LLVM trunk you should see the correct behaviour for your test case. If you have any further problems please open a new issue.

inducer commented 7 years ago

Thanks for your help on this!