When the pragma unroll directive in LowPassBlockOld is not applied (e.g., if it is commented out), verification for cudaSift fails. This should not happen: unrolling is an optimization and should not affect program semantics.
LowPassBlock works correctly without unrolling.
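To make the expectation concrete, here is a minimal host-side sketch, not the actual cudaSift kernel. All names (SKETCH_UNROLL, kTaps, filter_unrolled, filter_rolled, unroll_is_transparent) are hypothetical, and the macro assumes a clang-based compiler such as icpx accepts pragma unroll. It only illustrates that the same loop, with and without the unroll hint, should produce the same result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: clang-based compilers (icpx included) accept
// "#pragma unroll"; on other compilers the macro expands to nothing.
#if defined(__clang__)
#define SKETCH_UNROLL _Pragma("unroll")
#else
#define SKETCH_UNROLL
#endif

constexpr std::size_t kTaps = 5;  // hypothetical filter width
using Taps = std::array<float, kTaps>;

// Weighted sum over the taps, with an unroll hint on the loop.
inline float filter_unrolled(const Taps& x, const Taps& k) {
  float sum = 0.0f;
  SKETCH_UNROLL
  for (std::size_t i = 0; i < kTaps; ++i) sum += x[i] * k[i];
  return sum;
}

// The same loop without the hint.
inline float filter_rolled(const Taps& x, const Taps& k) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < kTaps; ++i) sum += x[i] * k[i];
  return sum;
}

// Returns true when both variants agree on a small, exactly representable input.
inline bool unroll_is_transparent() {
  const Taps x{1.0f, 2.0f, 3.0f, 4.0f, 5.0f};
  const Taps k{1.0f, 4.0f, 6.0f, 4.0f, 1.0f};
  return filter_unrolled(x, k) == filter_rolled(x, k);
}
```

Calling unroll_is_transparent() from a small test harness should return true. When a real kernel behaves differently with the directive removed, the cause usually lies outside the loop's intended semantics, which is why the failure reported here is unexpected.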
Possible solutions
Unless LowPassBlockOld yields considerable performance improvements, switch to LowPassBlock (here).
Environment
icpx: 2023.2.0.20230721
Device name: Intel(R) UHD Graphics 630
intel-level-zero-gpu: 1.3.26690.36-704
intel-igc: 1.0.14062.11
OS: Ubuntu 22.04.3 LTS
Linux Kernel: 6.2.0-32