oneapi-src / Velocity-Bench


cudaSift verification fails when not unrolling #28

Closed victor-eds closed 1 year ago

victor-eds commented 1 year ago

When the #pragma unroll directive in LowPassBlockOld is not applied (e.g., when it is commented out), verification for cudaSift fails. This should not happen: unrolling is an optimization and should not affect program semantics.

Using LowPassBlock with no unrolling works fine.
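To illustrate the expectation that unrolling must not change results, here is a minimal host-side C++ sketch. It is not the cudaSift kernel: the 5-tap weights, the border clamping, and the names low_pass_rolled, low_pass_unrolled, and unroll_invariant are all hypothetical, chosen only to show a rolled loop and a hand-unrolled copy of it being compared for exact equality. The root cause in LowPassBlockOld is not identified in this thread, so this shows only the kind of check involved, not a diagnosis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 5-tap low-pass weights (1, 4, 6, 4, 1) / 16; NOT cudaSift's kernel.
constexpr int RADIUS = 2;
constexpr float WEIGHTS[2 * RADIUS + 1] = {1.0f / 16, 4.0f / 16, 6.0f / 16,
                                           4.0f / 16, 1.0f / 16};

// Clamp a signed index into [0, n - 1] (border replication, an assumption).
inline std::size_t clamp_index(long i, std::size_t n) {
    assert(n > 0);
    if (i < 0) return 0;
    if (i >= static_cast<long>(n)) return n - 1;
    return static_cast<std::size_t>(i);
}

// Reference version: the tap loop is left rolled.
std::vector<float> low_pass_rolled(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t x = 0; x < in.size(); ++x) {
        float sum = 0.0f;
        for (int k = -RADIUS; k <= RADIUS; ++k) {
            sum += WEIGHTS[k + RADIUS] *
                   in[clamp_index(static_cast<long>(x) + k, in.size())];
        }
        out[x] = sum;
    }
    return out;
}

// Same computation with the tap loop unrolled by hand, in the same order.
std::vector<float> low_pass_unrolled(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    const std::size_t n = in.size();
    for (std::size_t x = 0; x < n; ++x) {
        const long c = static_cast<long>(x);
        float sum = 0.0f;
        sum += WEIGHTS[0] * in[clamp_index(c - 2, n)];
        sum += WEIGHTS[1] * in[clamp_index(c - 1, n)];
        sum += WEIGHTS[2] * in[clamp_index(c, n)];
        sum += WEIGHTS[3] * in[clamp_index(c + 1, n)];
        sum += WEIGHTS[4] * in[clamp_index(c + 2, n)];
        out[x] = sum;
    }
    return out;
}

// True when both variants produce identical output for the given input.
bool unroll_invariant(const std::vector<float>& in) {
    return low_pass_rolled(in) == low_pass_unrolled(in);
}
```

Calling unroll_invariant on small integer-valued inputs keeps every product and sum exactly representable in float, so the comparison can be exact. A correct kernel should pass the analogous check on the device; if it passes only when the unroll directive is present, that may point to a latent bug in the kernel or to a compiler issue.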

Possible solutions

Unless LowPassBlockOld yields considerable performance improvements, switch to LowPassBlock.

Environment

icpx: 2023.2.0.20230721
Device name: Intel(R) UHD Graphics 630
intel-level-zero-gpu: 1.3.26690.36-704
intel-igc: 1.0.14062.11
OS: Ubuntu 22.04.3 LTS
Linux Kernel: 6.2.0-32

skambapugithub commented 1 year ago

Currently reproducing the issue on our end. Will come back with any further updates. Thanks @victor-eds

skambapugithub commented 1 year ago

Switched from LowPassBlockOld to the LowPassBlock function to fix the issue. Closing this issue. Thanks.