NVIDIA / cuda-quantum

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
https://nvidia.github.io/cuda-quantum/

Compiler should produce an error if it is being asked to do invalid casts on constant values #1959

Open bmhowe23 opened 1 month ago

bmhowe23 commented 1 month ago

Required prerequisites

Describe the feature

Similar to how CUDA-Q performs range checks on indexing into vectors (originally requested in #9 and implemented in #688), it would be helpful for the compiler to produce an error when it is asked to do something known to be invalid, like this:

func.func @invalid_cast() -> i32 {
  %0 = arith.constant -7.11 : f64
  %1 = cc.cast unsigned %0 : (f64) -> i32
  return %1 : i32
}

This is undefined behavior. A compiler ordinarily cannot diagnose it because the values being cast are unknown until runtime, but existing C++ compilers do report errors under this condition when the values are constexpr, and that situation is similar to our synthesis environment, where we fold some kernel arguments into embedded constants. E.g., the following code:

#include <cstdio>

int main(int argc, char *argv[]) {
  constexpr double x = -7.11;
  constexpr unsigned y = static_cast<unsigned>(x);
  printf("x = %f   y = %u\n", x, y);
  return 0;
}

produces errors in both g++ and clang++, like this:

$ g++ test.cpp
test.cpp: In function ‘int main(int, char**)’:
test.cpp:13:49: error: overflow in constant expression [-fpermissive]
   13 |   constexpr unsigned y = static_cast<unsigned>(x);
      |                                                 ^
test.cpp:13:49: error: overflow in constant expression [-fpermissive]
$ clang++ test.cpp
test.cpp:13:22: error: constexpr variable 'y' must be initialized by a constant expression
  constexpr unsigned y = static_cast<unsigned>(x);
                     ^   ~~~~~~~~~~~~~~~~~~~~~~~~
test.cpp:13:48: note: value -7.11 is outside the range of representable values of type 'unsigned int'
  constexpr unsigned y = static_cast<unsigned>(x);
                                               ^
1 error generated.
schweitzpgi commented 1 month ago

Undefined behavior

I think the argument stated above is misleading. The quake code shown is not the same as the C++ code and did not come from the C++ example. It must have come from a general conversion of an arbitrary floating-point value, a conversion that is well-defined for some values but not for all of them.

  __qpu__ void f(...) {
     ....
     double d = ...;
     unsigned u = d; // this is standard conforming
     ...
  }

Importantly, the quake case as written above would arise in the middle of running a program, when the kernel is specialized with specific argument values ("quake synthesis"). The clang++ frontend handles this case the same way as for the C++ code, if it is given the C++ code.

% nvq++ -c tmp/h.cpp
tmp/h.cpp:5:22: error: constexpr variable 'y' must be initialized by a constant expression
  constexpr unsigned y = static_cast<unsigned>(x);
                     ^   ~~~~~~~~~~~~~~~~~~~~~~~~
tmp/h.cpp:5:48: note: value -7.11 is outside the range of representable values of type 'unsigned int'
  constexpr unsigned y = static_cast<unsigned>(x);
                                               ^
1 error generated.

Notice that the C++ code uses constexpr: the programmer is telling the compiler that the expression must be evaluated at compile time. So nvq++, being clang++ under the covers, generates the same error, and the quake code is never generated.

Adding an error message to catch this corner case in quake would mean that a program that runs in library mode without a peep, compiles with the AOT compiler, and executes without any issues would suddenly generate an error and crash whenever a data set happens to convert a negative floating-point value to an unsigned integer. And that check would cover only this one corner case.

Further, there are plenty of cases of undefined behavior in C++ to consider, and this lends itself to a slippery slope. Since this conversion will be happening on run-time values, as explained above, shouldn't we check all of the possible undefined-behavior cases for conversions of floating point to integral? What about the ~17 million NaN values? Or +/-Inf? What about -0? What about denormalized values? What about value truncation?

C++ has plenty of other conversions. What about their undefined behavior? What about loss of precision? What about 2's complement wraparound? What about floating-point overflow and underflow? What about checking alignments?

As long as we're checking undefined behavior at run time during a kernel call, should we also check that floating-point input values are all in the representable range for an IEEE value and not NaNs, etc.? What about floating-point logic? Should we enable floating-point exceptions? Should we have a floating-point simulation to verify that all computations stay within representable ranges?

What is the acceptable precision when folding floating-point values, and do we account for the target QPU's tolerances? What about differences between processors? Computing floating-point values on the host may yield different results than on a QPU or GPU coprocessor; is that ok?