chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Some complex math functions don't compile with GPU support enabled #25610

Open e-kayrakli opened 1 month ago

e-kayrakli commented 1 month ago

Reported in https://chapel.discourse.group/t/complex-sine-broken-with-gpu-enabled-chapel/35562

GPU kernels don't support complex numbers today, but this issue has nothing to do with GPU kernels. With GPU support enabled, the following program:

use Math only sin;        
writeln(sin(0.0+1.0i));

fails to compile for NVIDIA with the error: CHPL_HOME/modules/internal/ChapelStandard.chpl:24: error: Could not find C function for csin; perhaps it is missing or is a macro?

I can't tell exactly why csin is missing; CUDA has complex headers, which should provide csin. I can reproduce the issue as follows:

chpl foo.chpl --print-commands
# record the last `clang` call
clang <args> foo.c  # foo.c just calls `csin`

With the same arguments that chpl passes to clang, you get the same error. Drop -x cuda and it compiles just fine. I don't see any noticeable difference in clang's -v output between the two. My guess is that some #ifdefs get thrown off, either in the Chapel runtime or in the clang/CUDA headers. I suspect that clang invocation is missing a flag, but I don't know which one.
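For reference, a minimal foo.c along these lines could look like the sketch below. This is an assumption about the shape of the file, not the exact file used above; the function name `sin_of_i` is hypothetical and only there to give the `csin` call somewhere to live.

```c
/* foo.c: hypothetical minimal reproducer that only calls the C99 csin. */
#include <complex.h>

/* Computes sin(0.0 + 1.0i), the same value as the Chapel snippet above.
 * Mathematically, sin(i) = i * sinh(1). */
double complex sin_of_i(void) {
    return csin(0.0 + 1.0 * I);
}
```

The file has no `main`, so it is meant to be compiled to an object file only (for example with `-c`). Compiled as plain C, this should build cleanly; the interesting case is compiling it with the arguments recorded from chpl.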

A potential workaround is to call the builtin versions of the missing complex functions, but that doesn't feel as satisfying as fixing the compilation itself.

diff --git a/modules/standard/Math.chpl b/modules/standard/Math.chpl
index 920b56b14b..3a67618681 100644
--- a/modules/standard/Math.chpl
+++ b/modules/standard/Math.chpl
@@ -1215,8 +1215,8 @@ module Math {
   inline proc sin(x: complex(128)): complex(128) {
     pragma "fn synchronization free"
     pragma "codegen for CPU and GPU"
-    extern proc csin(z: complex(128)): complex(128);
-    return csin(x);
+    extern proc chpl_csin(z: complex(128)): complex(128);
+    return chpl_csin(x);
   }

   /*
diff --git a/runtime/include/chplmath.h b/runtime/include/chplmath.h
index 93be0bb0dd..8945638205 100644
--- a/runtime/include/chplmath.h
+++ b/runtime/include/chplmath.h
@@ -53,6 +53,8 @@ MAYBE_GPU static inline float  chpl_sqrt32(float x)  { return sqrtf(x); }
 MAYBE_GPU static inline double chpl_fabs64(double x) { return fabs(x);  }
 MAYBE_GPU static inline float  chpl_fabs32(float x)  { return fabsf(x); }

+MAYBE_GPU static inline _complex128 chpl_csin(_complex128 x) { return __builtin_csin(x); }
+
 // 32-bit Bessel functions aren't available on all platforms. For cases where
 // we know they're available use them since they should be faster, but in other
 // cases default to using the 64-bit versions and casting.

This patch makes the snippet above compile successfully.

jabraham17 commented 1 month ago

I believe csin is missing because we use C++14 for GPU codegen, but rely on C99 complex number support. In C++14, complex.h may not implement all of the complex math functions, resulting in the error about missing functions.

See https://github.com/Cray/chapel-private/issues/6277 for more details