code-iai / iai_kinect2

Tools for using the Kinect One (Kinect v2) in ROS
Apache License 2.0

Fix compile error in OpenCL shader #513

Open skalldri opened 6 years ago

skalldri commented 6 years ago

On a fresh install of Beignet on Ubuntu 18.04, I get an "ambiguous function" OpenCL compile error when running the kinect2 depth registration node.

This compile error happens because there are multiple sqrt() overloads, each taking a different argument type. Since the type of the input value is not explicit, the compiler cannot choose between them and compilation fails.

The fix is to explicitly mark the constant value as a float so the OpenCL compiler can pick the correct version of the sqrt() function.
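As a rough analogy in plain C11 (not OpenCL C, and not the code changed in this PR), the sketch below mimics type-based overload selection with `_Generic`. The names `sqrt_f`, `sqrt_d` and `PICK_SQRT` are hypothetical stand-ins for the overloaded sqrt(). Unlike an OpenCL compiler, `_Generic` reports "no matching branch" rather than "ambiguous" for an int argument, but the effect is the same: the constant needs an explicit type before one version can be picked.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for two overloads of the same function. */
static const char *sqrt_f(float x)  { (void)x; return "sqrt(float)"; }
static const char *sqrt_d(double x) { (void)x; return "sqrt(double)"; }

/* Select an overload from the static type of the argument (C11 _Generic). */
#define PICK_SQRT(x) _Generic((x), float: sqrt_f, double: sqrt_d)(x)

/* The 'f' suffix makes the constant a float, so the float version is chosen. */
const char *pick_for_float_literal(void)  { return PICK_SQRT(2.0f); }

/* An unsuffixed floating constant is a double, so the double version is chosen. */
const char *pick_for_double_literal(void) { return PICK_SQRT(2.0); }

/* PICK_SQRT(2) would not compile: an int constant matches neither branch. */

void check_overload_selection(void) {
    assert(strcmp(pick_for_float_literal(), "sqrt(float)") == 0);
    assert(strcmp(pick_for_double_literal(), "sqrt(double)") == 0);
}
```

Calling `check_overload_selection()` from a `main` function runs both checks.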

skalldri commented 5 years ago

@bbferka are you able to merge this PR?

klokik commented 5 years ago

@skalldri can you also update your PR with the following changes? With your fix the kernel compiles successfully (with Beignet OpenCL), but without these changes I then hit this assertion:

ASSERTION FAILED: Double precision not supported on this device (if this is a literal, use '1.0f' not '1.0') at file /build/beignet-Bevceu/beignet-1.3.2/backend/src/backend/gen_insn_selection.cpp, function void gbe::ConvertInstructionPattern::convertDoubleToSmallInts(gbe::Selection::Opaque&, const gbe::ir::ConvertInstruction&, bool&) const, line 6386

@@ -111,7 +111,7 @@ void kernel checkDepth(global const int4 *idx, global const ushort *zImg, global

   const int4 index = idx[i];
   const ushort zI = zImg[i];
-  const ushort thres = 0.01 * zI;
+  const ushort thres = 0.01f * zI;
   const ushort zIThres = zI + thres;
   const float4 dist2 = dists[i];

@@ -176,7 +176,7 @@ void kernel remapDepth(global const ushort *in, global ushort *out, global const
   }

   const float avg = (p.s0 + p.s1 + p.s2 + p.s3) / count;
-  const float thres = 0.01 * avg;
+  const float thres = 0.01f * avg;
   valid = isless(fabs(p - avg), (float4)(thres));
   count = abs(valid.s0 + valid.s1 + valid.s2 + valid.s3);

@@ -192,5 +192,5 @@ void kernel remapDepth(global const ushort *in, global ushort *out, global const
   const float4 dist = select((float4)(0), tmp - sqrt(dist2), valid);
   const float sum = dist.s0 + dist.s1 + dist.s2 + dist.s3;

-  out[i] = (dot(p, dist) / sum) + 0.5;
+  out[i] = (dot(p, dist) / sum) + 0.5f;
 }
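
The suffix matters in the lines above because, in C-family languages, an unsuffixed floating constant such as 0.01 has type double, and multiplying it by a narrower value promotes the whole expression to double. A minimal plain C11 sketch of that promotion rule follows; the `TYPE_NAME` macro and both function names are hypothetical, and this is host-side C, not the OpenCL kernel itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: name the static type of an expression (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", float: "float", double: "double", default: "other")

/* 0.01 is a double constant, so the product is computed as a double. */
const char *product_type_unsuffixed(unsigned short zI) {
    return TYPE_NAME(0.01 * zI);
}

/* 0.01f is a float constant, so the product stays in single precision. */
const char *product_type_suffixed(unsigned short zI) {
    return TYPE_NAME(0.01f * zI);
}

void check_product_types(void) {
    assert(strcmp(product_type_unsuffixed(1000), "double") == 0);
    assert(strcmp(product_type_suffixed(1000), "float") == 0);
}
```

On a device without double-precision support, only the suffixed form avoids introducing a double operation, which is consistent with the hint in the assertion message ("use '1.0f' not '1.0'").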
skalldri commented 5 years ago

@klokik absolutely. Doing that now.