Interlisp / medley

The main repo for the Medley Interlisp project. Wiki, Issues are here. Other repositories include maiko (the VM implementation) and Interlisp.github.io (web site sources)
https://Interlisp.org
MIT License
376 stars 19 forks source link

BIGNUMs don't work in 64-bit builds #90

Closed masinter closed 3 years ago

masinter commented 3 years ago
/**********************************************************************/
int N_OP_times2(int tosm1, int tos) {
  register int arg1, arg2;
  register int result;

  N_GETNUMBER(tosm1, arg1, doufn);
  N_GETNUMBER(tos, arg2, doufn);

#ifdef SUN3_OS3_OR_OS4_IL

  result = mpy32(arg1, arg2);
  N_ARITH_SWITCH(result);
dummy:
  mpy_err_label();

#else

  result = arg1 * arg2;
  if ((arg2 != 0) && ((result / arg2) != arg1)) goto doufn2;
  N_ARITH_SWITCH(result);

#endif

result is a 64-bit number. So (CL:* 300000 300000) will succeed and N_ARITH_SWITCH will cons up a FIXNUM rather than calling the UFN or allocating a bignum. mppy32 is just inline SPARC assembly code.

waywardmonkeys commented 3 years ago

Does this work on SPARC? When I look at the SPARC code, it feels to me like like it is an error:

.inline _sub32,8
  subcc %o0,%o1,%o0 ! result = arg0 - arg1
  bvs diff_err
  nop
.end

.inline _mpy32,8
  ba  mpy_err
  nop
.end

The sub32 routine does the operation and then branches if the carry condition code is set. The mpy32 routine just does an unconditional branch though.

Which 64 bit platform were you on? macOS and Linux are LP64, so the int result should still have been a 32 bit value and resulted in going to doufn2 ... or what am I missing so that I can understand this better?

waywardmonkeys commented 3 years ago

Okay, so this is readily reproducible with a simple C example on macOS or Linux with "modern" clang or gcc. I am not at home, so I'm on macOS rather than Linux, so examples will be from macOS and clang.

Modern clang and gcc have optimizations that take advantage of overflow of signed integers being undefined behavior, so they allow themselves to eliminate code that assumes that undefined behavior happened. This has been pretty upsetting to many and surprising to even more over the last decade or more. This is a good paper from 2012 that covers this: https://www.cs.utah.edu/~regehr/papers/overflow12.pdf

If we take this bit of code and compile it with clang -O2 -m64 -std=gnu89 -funsigned-char -fno-strict-aliasing:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  register int arg1, arg2;
  register int result;

  arg1 = atoi(argv[1]);
  arg2 = atoi(argv[2]);

  result = arg1 * arg2;
  if ((arg2 != 0) && ((result / arg2) != arg1)) {
    printf("Overflowed.\n");
  } else {
    printf("Good: %d\n", result);
  }
  return 0;
}

It won't work as expected. If we look at the generated assembly, we'll see that the whole check and print of "Overflowed." has been removed by the compiler, because for it to have succeeded, it means that undefined behavior must have occurred:

  movq  %rsi, %rbx
  movq  8(%rsi), %rdi
  callq _atoi
  movl  %eax, %r14d
  movq  16(%rbx), %rdi
  callq _atoi
  imull %r14d, %eax
  leaq  L_.str.1(%rip), %rdi
  movl  %eax, %esi
  xorl  %eax, %eax
  callq _printf
  xorl  %eax, %eax
  popq  %rbx
  popq  %r14
  popq  %rbp
  retq

Now, we can add -fwrapv to the compiler command line, which instructs both gcc and clang to not perform that sort of optimization, and that fixes things:

  movq  %rsi, %rbx
  movq  8(%rsi), %rdi
  callq _atoi
  movl  %eax, %r14d
  movq  16(%rbx), %rdi
  callq _atoi
  movl  %eax, %esi
  imull %r14d, %esi
  testl %eax, %eax
  je  LBB0_3
## %bb.1:
  movl  %eax, %ecx
  movl  %esi, %eax
  cltd
  idivl %ecx
  cmpl  %r14d, %eax
  jne LBB0_2
LBB0_3:
  leaq  L_.str.1(%rip), %rdi
  xorl  %eax, %eax
  callq _printf
  jmp LBB0_4
LBB0_2:
  leaq  L_str(%rip), %rdi
  callq _puts
LBB0_4:
  xorl  %eax, %eax
  popq  %rbx
  popq  %r14
  popq  %rbp
  retq

That's a trivial fix and it has been experimented with before as evidenced by some commented out compiler command lines in the repository.

However, with clang and gcc, there's another alternative as well, which looks like this in C:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  register int arg1, arg2;
  int result;

  arg1 = atoi(argv[1]);
  arg2 = atoi(argv[2]);

  if (__builtin_smul_overflow(arg1, arg2, &result)) {
    printf("Overflowed.\n");
  } else {
    printf("Good: %d.\n", result);
  }
  return 0;
}

Here, we're using a __builtin_smul_overflow primitive, which generates this code, without the extra idivl. It is worth noting that we had to remove register from the result variable since you can't take the address of a register variable. This doesn't matter though and doesn't actually change the generated code at all. (I tested in the original code sample with and without the register modifier.)

  movq  %rsi, %rbx
  movq  8(%rsi), %rdi
  callq _atoi
  movl  %eax, %r14d
  movq  16(%rbx), %rdi
  callq _atoi
  imull %r14d, %eax
  jno LBB0_2
## %bb.1:
  leaq  L_str(%rip), %rdi
  callq _puts
  jmp LBB0_3
LBB0_2:
  leaq  L_.str.1(%rip), %rdi
  movl  %eax, %esi
  xorl  %eax, %eax
  callq _printf
LBB0_3:
  xorl  %eax, %eax
  popq  %rbx
  popq  %r14
  popq  %rbp
  retq

Just like before, it is performing an imull instruction, but now, instead of doing the extra math, we're just doing a jo or "jump when overflow flag is set".

This is the same thing that the SPARC assembly code was doing above (for addition), but here with gcc and clang, it is able to do this for each supported platform, without custom assembly code.

I think there's at least 4 things that we should do here:

  1. Revert the commit made in interlisp/maiko#100 as it isn't correct.
  2. Add -fwrapv to the clang and gcc compile flags for now.
  3. Add #ifdef for clang and gcc to use the overflow-checking builtins, and then remove -fwrapv once that is in place.
  4. Write a test that we can run as part of a C test suite (none of this exists yet) that verifies that things are correctly functioning and correctly failing, so that we can catch this during bringing up a new platform or compiler in the future.
nbriggs commented 3 years ago

neither the Oracle Developer Studio compiler for SPARC nor the corresponding one for x86/x86_64

$ cc -V
cc: Studio 12.6 Sun C 5.15 SunOS_sparc 2017/05/30

do that particular optimization.

nbriggs commented 3 years ago

@waywardmonkeys --

  1. I've reverted the commit
  2. I think we could standardize on -std=gnu89 -funsigned-char -fno-strict-aliasing -fwrapv for the gcc and clang compiles in the makefile fragments we distribute. I don't think the standard distribution needs to have compiler warnings turned way up, that's only of interest to people poking around more deeply and they can edit the options if they want.
  3. Should it be conditional on clang/gcc or __has_builtin(builtin_smul_overflow) [and friends, it's not just signed multiply that would benefit] with an appropriate definition of has_builtin for compilers that don't support it.
  4. Test cases would be useful. They can be standalone to test for features/behavior that we're depending on.
masinter commented 3 years ago

the overflow test doesn't have to be exact -- as long as it is conservative. I've been working on the lisp-based tests in Medley/internal test -- one of the files wouldn't compile because (constant (itimes 3.14159e10 1)) wouldn't fasl dump. d\

waywardmonkeys commented 3 years ago

@masinter If you try maiko from current master, this should be working now. I want to leave this open for now until we do at least step 3 above and get started towards step 4.

masinter commented 3 years ago

works for me now. I did get a message in the startup window "Failed to find UNIXCOMM file handles; no processes" and IL:ShellCommand("ls -l") now errors but that is the other update

waywardmonkeys commented 3 years ago

We've made the code support using the overflow builtins. I think we could remove -fwrapv from the build at this point ...