Copilot-Language / copilot

A stream-based runtime-verification framework for generating hard real-time C code.
http://copilot-language.github.io
633 stars 50 forks source link

C translation doesn't correctly select operations based on types #263

Closed robdockins closed 2 years ago

robdockins commented 2 years ago

See:

https://github.com/Copilot-Language/copilot/blob/11ab89fbb839a99aa021f458567c7d4b7ad0c745/copilot-c99/src/Copilot/Compile/C99/Translate.hs#L56-L106

Code generation selects a function to call based only on the operation, without paying attention the type. This is almost certainly not what is desired. For example abs is a function on int, but is applied also to float and double values. This causes the C compiler to (silently (!)) perform numerical coercions to/from int. Instead the generator should select fabs or fabsf. Likewise choosing between copysign or copysignf, cos or cosf, etc.

The abs case for integer values is actually even tricker, as copilot uses the fixed-width intX_t types, but, as far as I know, the standard library does not provide operations for fixed-width types, but instead only provides abs, labs, and llabs for int, long and long long respectively. The Sign operation is also troublesome, as it applies also to integer operations on the copilot side, but has no corresponding C primitive that I know of. These may have to be provided as macros to get correct behavior.

robdockins commented 2 years ago

clang can be convinced to halt compilation on these kinds of problems by passing the -Wconversion -Werror flags.

robdockins commented 2 years ago

I've also just learned about tgmath.h, which provides type-generic macros for math.h functions. It seems it first appeared in C99. This might make it slightly easier, at least for the floating-point functions.

https://en.cppreference.com/w/c/numeric/tgmath

ivanperez-keera commented 2 years ago

Yes, we have recently talked about this (Alwyn, Frank and I).

This is what I meant today when I said that the meaning of some math operations will change.

ivanperez-keera commented 2 years ago

I've been looking into this issue. My solution spelled out the types a bit more in the case selections for OP{1,2}, in an attempt at preventing future developers from defaulting to a "f" suffix by mistake for functions that may not support it (i.e., it directed the developer to "spell it out").

I've tried to merge your solution and mine, into a simpler, shared version available here: https://github.com/Copilot-Language/copilot/tree/develop-specialize-math-types. I removed some checks on errors on input types, mainly because the GADT would disallow them by construction.

Still, it seems like there's "too much logic" and too much ad-hocness. (The whole C99-backend can be vastly simplified.)

I also went down the path of tgmath.h, and I agree with Rob: if it's well supported, it's likely the simplest and best solution. I know arduino is not our main target, but I'd definitely want to hear @joeyh's thoughts on this.

ivanperez-keera commented 2 years ago

I just tried compiling an arduino example with gcc-avr it does not find a <tgmath.h>. It does not seem to be part of the standard avr-libc anymore.

Note that the variants specialized for floats in math.h in avr are just aliases for the double-based versions of the same functions, so there's just no difference. In that platform, it makes no sense to specialize for different types (except for abs/fabs).

I've sent a message to the avr-libc users mailing list to see if there's a reason for that file not being included.

ivanperez-keera commented 2 years ago

@joeyh for arduino, I found that the following compiles:

Adaptation to Makefile, to include user library:

USER_LIB_PATH := $(realpath libraries)
BOARD_TAG = uno
include /usr/share/arduino/Arduino.mk

New libraries/tgmath/tgmath.h file, which simply imports math.h:

#include <math.h>

Would this be an acceptable adaptation to arduino-copilot?

The following question is for everyone: If the above works with arduino-copilot, are there any reasons not to use tgmath.h? This solution is much cleaner, I think it requires very few changes (see last 3 commits in https://github.com/Copilot-Language/copilot/tree/develop-specialize-math-types-tgmath, or compare branches).

ivanperez-keera commented 2 years ago

Btw, tgmath.h does not provide a macro for abs, so we'd still need to distinguish between (integer) abs and fabs.

ivanperez-keera commented 2 years ago

Ok, after some deliberation with the Copilot team, discussions on the avr mailing list, etc. I'm making an executive decision to merge the change as implemented in the branch above (that is, without tgmath.h). The reason is that tgmath.h relies on specific compiler extensions, its implementation is compiler- (and architecture-) dependent, all of which makes it harder for people to reason about specifically which version of the function will be executed (they know in principle, but tgmath.h hides details away to a point where V&V might get harder).

It's very borderline though, and I could be convinced of the opposite, especially if we can make a good argument in favor of verification. And, more generally, I would be interested in seeing if it's possible to optimize that boilerplate code away in some other clean way.

RyanGlScott commented 2 years ago

Just to clarify, when you say "the branch above", you're referring to https://github.com/Copilot-Language/copilot/tree/develop-specialize-math-types?

ivanperez-keera commented 2 years ago

I am.

Clarification: @robdockins is right that abs, labs, etc. may have to be used for larger integers, but they operate on int, not fixed-sized ints. I'm looking into that.

I think a reasonable implementation would be to use abs for 16-bit-sized ints/word and below, casting fro and back, and labs for 32-bit sized ints and above, but I'll check that the results are correct.

We could avoid casts by expanding Abs to something like:

(x + (x >> (8*sizeof(int64_t) - 1))) ^ (x >> (8*sizeof(int64_t) - 1));

replacing int64_t with whatever type Abs should be applied to.

joeyh commented 2 years ago

Probably too late to matter, but: I could make arduino-copilot deal with tgmath being used, I'm sure. Modifying the Makefile would not be my preferred option, for one it's intrusive and also arduino-copilot users may not only use the Makefile but may opt to load the generated C code into the Arduino IDE and flash it that way.

Also I checked Zephyr and it seems to include tgmath.h, although I have not verified it works there.

robdockins commented 2 years ago

It's very borderline though, and I could be convinced of the opposite, especially if we can make a good argument in favor of verification. And, more generally, I would be interested in seeing if it's possible to optimize that boilerplate code away in some other clean way.

Regarding the style of verification we are doing: I don't think it makes much difference wether copilot-c99 selects the correct operations, or the C compiler does via tgmath macros. We receive the code to verify as LLVM intermediate representation, so how that choice gets made is not very relevant.

Regarding abs, I think it might be nicer to do something a little less bit-fiddly. One fairly nice option, I think, would be to define a few static inline functions that do the necessary. E.g.:

static inline int32_t iabs32 ( int32_t x ) {
  return (x < 0 ? (-x) : x);
}

These should compile away nicely while still giving human reviewers something reasonable to look at.

ivanperez-keera commented 2 years ago

We receive the code to verify as LLVM intermediate representation, so how that choice gets made is not very relevant.

Right. I was also thinking about other possible verification efforts not just based on the bytecode, but the code generated at the moment is not for "human consumption" in general, so perhaps it's good to lean on the side of simplicity (that is, use tgmath.h) and change it in the future if needed.

Looking at the bytecode, it seems that what LLVM generates for tgmath.h is fairly benign. For example, for a simple function that uses fabs:

#include <stdint.h>
#include <tgmath.h>

float my_abs (float arg) {
  return fabs(arg);
}

LLVM generates:

define dso_local float @my_abs(float %0) #0 {
  %2 = alloca float, align 4
  %3 = alloca float, align 4
  store float %0, float* %3, align 4
  %4 = load float, float* %3, align 4
  store float %4, float* %2, align 4
  %5 = load float, float* %2, align 4
  %6 = call float @llvm.fabs.f32(float %5) #2
  ret float %6
}

; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.fabs.f32(float) #1

which is fairly close, and not much more complex, than what we get if we make the decision of which fabs variant from math.h to use:

define dso_local float @my_abs(float %0) #0 {
  %2 = alloca float, align 4
  store float %0, float* %2, align 4
  %3 = load float, float* %2, align 4
  %4 = call float @llvm.fabs.f32(float %3)
  ret float %4
}

It seems that the tgmath version passes data around more. I'm hopeful, in terms of verification, that it makes no difference.

Regarding abs, I think it might be nicer to do something a little less bit-fiddly. One fairly nice option, I think, would be to define a few static inline functions that do the necessary. E.g.:

static inline int32_t iabs32 ( int32_t x ) {
  return (x < 0 ? (-x) : x);
}

These should compile away nicely while still giving human reviewers something reasonable to look at.

The decision here is a bit related to what happened with sign. We decided to just generate the inlined expression ourselves, as opposed to creating a function. Whatever we chose, it would be good if possible to be reasonably consistent.

In terms of verification, I do have a question for you though. The version with shift is straightforward, and has no conditionals:

; Function Attrs: noinline nounwind optnone uwtable
define internal i32 @iabs32(i32 %0) #0 {
  %2 = alloca i32, align 4
  store i32 %0, i32* %2, align 4
  %3 = load i32, i32* %2, align 4
  %4 = load i32, i32* %2, align 4
  %5 = ashr i32 %4, 31
  %6 = add nsw i32 %3, %5
  %7 = load i32, i32* %2, align 4
  %8 = ashr i32 %7, 31
  %9 = xor i32 %6, %8
  ret i32 %9
}

However, a version like the one you (@robdockins) just wrote, with a conditional, generates bytecode with branching:

; Function Attrs: noinline nounwind optnone uwtable
define internal i32 @iabs32(i32 %0) #0 {
  %2 = alloca i32, align 4
  store i32 %0, i32* %2, align 4
  %3 = load i32, i32* %2, align 4
  %4 = icmp slt i32 %3, 0
  br i1 %4, label %5, label %8

5:                                                ; preds = %1
  %6 = load i32, i32* %2, align 4
  %7 = sub nsw i32 0, %6
  br label %10

8:                                                ; preds = %1
  %9 = load i32, i32* %2, align 4
  br label %10

10:                                               ; preds = %8, %5
  %11 = phi i32 [ %7, %5 ], [ %9, %8 ]
  ret i32 %11
}

Does it make a meaningful difference in terms of verification, especially if we have a very large spec to verify?

ivanperez-keera commented 2 years ago

Note that tgmath.h defines no abs, only fabs, so, contrary to what I originally thought, that one is still on us and the branch I created with a tgmath-based version is incomplete. We'd need to deal with abs for integers in Copilot.

robdockins commented 2 years ago

Regarding inline code generation versus auxiliary functions: agreed, consistency is good. Generating extra functions is a bit more involved, but probably worth the effort, in my opinion. Reasonable people could differ here.

Regarding tgmath.h vs math.h: looks like the macro in tgmath spills into an additional local variable. That should be no problem for the verifier. I expect this sort of thing would probably boil entirely away on -O1 and higher.

Regarding branching in abs: crucible is quite good a merging this sort of control-flow branching, so I don't expect that to be much of a problem for us. Other verification technologies that don't do path merging might have more trouble. FWIW, on -O1 and higher, I expect this would be turned into a select instead, and compiled into a conditional move instruction.

I would think it is more likely that we would have a potential problem with the SMT solver spending a fair amount of resources to notice that the bit twiddly version computes the same thing as the version based on conditionals (which I think is how the reference semantics are stated). I suspect it will have to bit-blast the expression to prove this equivalence, which can get pretty expensive for longer bit-widths. Of course, the proof is in the pudding. We could run some experiments.

ivanperez-keera commented 2 years ago

We could run some experiments.

If it takes more than 1/2h-1hr to check, it's probably a better use of time to default to the simplest expression possible and come back to it if it causes a problem.

fdedden commented 2 years ago

Just my quick view on the problem:

The general idea of Copilot is to be platform independent, but it is a known issue (since rewriting the C99 backend) that this is not really feasible with the current implementation. As pointed out in this issue, we will reach the point where we need to know a bit more about the platform.

One nice solution would be to have Copilot load a sort of 'platform configuration' that tells what the default sizes for numeric types are. While generating the C code we could then for example call the right function that belongs to i.e. an int16_t. This would require writing such a configuration for every target platform, but I don't that should be a big problem.

Myself I don't have experience using tgmath.h, but I am pretty sure there are some compiler and platform specifics that could cause trouble. It seems that these generic math functions aren't used that widely for that reason.

ivanperez-keera commented 2 years ago

We do need to close this issue. Any further comments, details, evidence?

robdockins commented 2 years ago

For my part, I think the solution in https://github.com/Copilot-Language/copilot/tree/develop-specialize-math-types is a fine way to go. I did some very quick tests in Cryptol, and solvers did not seem to have much trouble proving the shift/xor version of abs equal to one based on conditionals and negation, so that concern is probably not something we need to worry about.

ivanperez-keera commented 2 years ago

Great to know! Thanks for the quick reply.

ivanperez-keera commented 2 years ago

Description

The translation into C99 of specs with math operations, for most kinds of operations, does not generate code that uses type information to pick the function to be used. The use of functions that operate on types other than the ones the streams carry results in unintended, implicit type conversions (casts)

For example, to calculate the sine of a value in a stream we use the C function sin, even though a more specialized version for floats (sinf) exists. In Copilot, we could write:

import Language.Copilot
import Copilot.Language
import Copilot.Language.Prelude
import Prelude ()
import Copilot.Compile.C99

ext :: Stream Float
ext = extern "input_stream" Nothing

sined :: Stream Float
sined = sin ext

spec = do
  trigger "handler" true [ arg sined ]

main = do
  r <- reify spec
  compile "spec" r

The code generated by Copilot for such spec uses sin, which performs an unexpected conversion from float to double, which is expected not to have any loss (6.3.1.5 Real floating types, "When a float is promoted to double [...] its value is unchanged"), calculates the sine of the double number, and the casts it back to float:

static float input_stream_cpy;

float handler_arg0(void) {
  return (sin)((input_stream_cpy));
}

The result is not the same as if we use sinf on floats, as can be checked with the following test C program:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#include "spec.h"

float input_stream = 0;

int err_code = 0;

void handler (float arg) {
   float f = sinf(input_stream);
   if (f != arg) {
      float diff = f - arg;
      printf("Error: %.8f\t%.8f\t%.8f\t%.8f\n", input_stream, f, arg, diff);
      err_code = 1;
   }
}

int main()
{
    int i;
    for (i = 0; i < 1000; i++) {
        input_stream = (float)rand() / RAND_MAX;
        step();
    }
    return err_code;
}

When compiled and executed, that program presents a number of counterexamples:

$ gcc main.c spec.c -o demo -lm
$ ./demo 
Error: 0.49358299   0.47378424  0.47378427  -0.00000003
Error: 0.28429341   0.28047931  0.28047928  0.00000003
Error: 0.35036018   0.34323615  0.34323612  0.00000003
Error: 0.09348048   0.09334439  0.09334438  0.00000001
Error: 0.75729382   0.68695742  0.68695736  0.00000006
Error: 0.24804412   0.24550842  0.24550840  0.00000001
Error: 0.20221278   0.20083752  0.20083751  0.00000001
Error: 0.46366215   0.44722658  0.44722661  -0.00000003
Error: 0.44910511   0.43415955  0.43415958  -0.00000003
Error: 0.21088292   0.20932335  0.20932333  0.00000001
Error: 0.30540800   0.30068234  0.30068231  0.00000003
Error: 0.18761641   0.18651767  0.18651766  0.00000001
Error: 0.50346100   0.48245996  0.48245999  -0.00000003
Error: 0.51034904   0.48848182  0.48848185  -0.00000003
Error: 0.47755122   0.45960572  0.45960575  -0.00000003
Error: 0.12232582   0.12202097  0.12202097  0.00000001
Error: 0.46184778   0.44560304  0.44560307  -0.00000003
Error: 0.33289206   0.32677770  0.32677767  0.00000003
Error: 0.50240129   0.48153147  0.48153150  -0.00000003
Error: 0.59964442   0.56434894  0.56434900  -0.00000006

In general, it is best that we avoid implicit casts, especially when they lead to different results.

The same is also true for the functions recip, which assumes its value to be a double, and abs, which uses the C function abs regardless of whether the argument is an integer, a float, or a double, when different functions exist for each case. The case is more extreme with abs, which operates on integers and therefore discards the decimals when applied to float or double.

Due to the way that floats spread over the number range, the differences between the "cast" and "non-cast" results are expected to be much larger than the values presented in the last column of the output above.

Type

Additional context

None

Requester

Method to check presence of bug

The following spec performs a number of operations on floats:

import Copilot.Compile.C99
import Copilot.Language
import Copilot.Language.Prelude
import Language.Copilot
import Prelude ()

ext :: Stream Float
ext = extern "input_stream" Nothing

streamAbs :: Stream Float
streamAbs = abs ext

streamSin :: Stream Float
streamSin = sin ext

-- Ensure that argument to acosh is higher than 1 so that it is within
-- range.
streamAcosh :: Stream Float
streamAcosh = acosh (1 + ext)

streamExp :: Stream Float
streamExp = exp ext

streamLog :: Stream Float
streamLog = log ext

streamSqrt :: Stream Float
streamSqrt = sqrt ext

spec = do
  trigger "handlerAbs"   true [ arg streamAbs ]
  trigger "handlerSin"   true [ arg streamSin ]
  trigger "handlerAcosh" true [ arg streamAcosh ]
  trigger "handlerExp"   true [ arg streamExp ]
  trigger "handlerLog"   true [ arg streamLog ]
  trigger "handlerSqrt"  true [ arg streamSqrt ]

main = do
  r <- reify spec
  compile "spec" r

The following C program compares the results from Copilot what would have been obtained using tgmath.h, which specializes the functions based on the types:

#include <tgmath.h>
#include <stdio.h>
#include <stdlib.h>

#include "spec.h"

float input_stream = 0;

int err_code = 0;

void assert_eq (char* name, float v1, float v2) {
   if (v1 != v2) {
      float diff = v1 - v2;
      printf("Error (%s):\t%.8f\t%.8f\t%.8f\t%.8f\n",
             name,
             input_stream,
             v1,
             v2,
             diff);
      err_code = 1;
   }
}

void handlerSin (float arg) {
   float f = sin(input_stream);
   assert_eq ("sin", f, arg);
}

void handlerAbs (float arg) {
   float f = fabs(input_stream);
   assert_eq ("abs", f, arg);
}

void handlerAcosh (float arg) {
   float f = acosh(1 + input_stream);
   assert_eq ("acosh", f, arg);
}

void handlerSqrt (float arg) {
   float f = sqrt(input_stream);
   assert_eq ("sqrt", f, arg);
}

void handlerExp (float arg) {
   float f = exp(input_stream);
   assert_eq ("exp", f, arg);
}

void handlerLog (float arg) {
   float f = log(input_stream);
   assert_eq ("log", f, arg);
}

int main()
{
    int i;
    for (i = 0; i < 2000; i++) {
        input_stream = (float)rand() / RAND_MAX;
        step();
    }
    return err_code;
}

Expected result

The program above will report a list of values for which there is a difference between applying the function on floats, vs casting to double, applying the function on doubles, and then casting back to float.

The program will fail (return an non-zero exit code) if any differences are found. The following is a selection of some of the values produced:

Error (log):    0.98835063  -0.01171776 -0.01171776 -0.00000000
Error (exp):    0.09492297  1.09957421  1.09957409  0.00000012
Error (sin):    0.18875165  0.18763287  0.18763286  0.00000001
Error (acosh):  0.51963967  0.97978687  0.97978681  0.00000006
Error (abs):    0.19273356  0.19273356  0.00000000  0.19273356

Desired result

The program above should report no differences and return with an exit code of 0.

Proposed solution

Generate code that picks the most specialized version of a function available based on the type of the arguments.

Further notes

The reporter is Ivan Perez because the bug was originally identified internally, and later reported (independently) by Galois.

ivanperez-keera commented 2 years ago

Change Manager: Confirmed that the bug is present by using method described and examining the result.

ivanperez-keera commented 2 years ago

Technical Lead: Confirmed that the bug should be addressed.

ivanperez-keera commented 2 years ago

Technical Lead: Bug scheduled for fixing in Copilot 3.9.

Fix assigned to: @ivanperez-keera .

ivanperez-keera commented 2 years ago

I am about to send and process the PR of this issue. If there are no more changes, then this will go as described in https://github.com/Copilot-Language/copilot/tree/develop-specialize-math-types.

There have been some recent-ish changes, so please @RyanGlScott take a look. A recent comment with the Change Request description contains more concrete examples that fail if the "wrong" function is picked.

A strange argument in favor of specializing the function used by copilot based on the type of floating-point number, as opposed to using those from tgmath.h, is that the code is being generated with extra parenthesis; for example:

float handlerExp_arg0(void) {
  return (exp)((input_stream_cpy));
}

I did not know that, but that appears to confuse the heck out of the compiler. When using tgmath.h, if there are extra parenthesis, it seems to pick the function operating on double, and if the extra parenthesis are removed, then it picks the one operating on float.

I believe I have been spoiled by decades of Haskell's type inference mechanisms.

It is purely an engineering and economic decision (we might have to change the pretty printer and possible even language-c99-simple to fix that), but it helps decide between two options that are otherwise similarly good.

ivanperez-keera commented 2 years ago

I made a mistake when cleaning the commits and left a change behind, but the tests caught the problem. I've just pushed to the same branch.

ivanperez-keera commented 2 years ago

It's all passing the tests on my side. When you have a chance, please let me know if there's anything else that you think needs changing.

RyanGlScott commented 2 years ago

Thanks, @ivanperez-keera! I tried out your branch locally and it behaves like I would expect on all of the code I've been testing. I noticed one minor formatting issue (for which I've left a comment on the relevant commit), but nothing too serious.

A strange argument in favor of specializing the function used by copilot based on the type of floating-point number, as opposed to using those from tgmath.h, is that the code is being generated with extra parenthesis

Hah. Funny enough, we've ran into the same issue on our side. Usually, if you invoke the isnan(x) function, it refers to a macro, but if you surround it with parentheses (e.g., (isnan)(x)), then Clang resolves isnan to a specific function over double! LLVM can be strange like that.

In any case, everything looks good on my end.

ivanperez-keera commented 2 years ago

Great! Fixed, pushed and waiting for tests / CI.

ivanperez-keera commented 2 years ago

I mean, it makes sense when you look at the definition (which includes parentheses only around the argument):

#  define isnan(x) __builtin_isnan (x)

I don't think this is specific to llvm.

ivanperez-keera commented 2 years ago

Implementor: solution implemented, review requested.

ivanperez-keera commented 2 years ago

Change Manager: Verified that:

ivanperez-keera commented 2 years ago

Change Manager: Implementation ready to be merged.

robdockins commented 2 years ago

Hah. Funny enough, we've ran into the same issue on our side. Usually, if you invoke the isnan(x) function, it refers to a macro, but if you surround it with parentheses (e.g., (isnan)(x)), then Clang resolves isnan to a specific function over double! LLVM can be strange like that.

Just wanted to chime in here that this is a general C-ism, not just for LLVM; surrounding a function name in parens explicitly disables macro expansion, so having the C99 pretty-printer do that isn't totally benign. Some C standard library functions are intended to be used as macros, and disabling macro-expansion may not work as expected.

ivanperez-keera commented 2 years ago

Thanks, @robdockins . Good point!

That's an interesting challenge from the point of view of creating a C99 AST and pretty printer. Normal algorithms to minimize the number of parenthesis might not work well, and, potentially, the AST should provide a way to say "between parentheses" even though ASTs would normally not do that because that's more concrete syntax.

I'll discuss it with Frank, who maintains the c99 library we are using atm.

robdockins commented 2 years ago

If I were designing a solution to this issue, I'd probably have two function call forms in the AST: one for "named" function calls (which take an identifier and a list of arguments, and the identifier is never parenthised, and might be a macro) and one for "expression" function calls (which takes an expression and a list of arguments, and the expression is always parenthised, and is never a macro).

ivanperez-keera commented 2 years ago

It's probably best to discuss that in the https://github.com/fdedden/language-c99 package, since it's likely the one that will have to address it.