temperlang / temper

3 stars 0 forks source link

Bad code generation: if/return + bubble #150

Closed ShawSumma closed 2 months ago

ShawSumma commented 2 months ago

A Minimal Case

Lets say we wanted to write a mod function that checks for errors.

Working Case 1

This is the code that does that.

// WORKS (valid because it does what i want)
export let mod(n: Int, d: Int): Int | Bubble {
  if (d == 0) {
    bubble();
  } else {
    return n % d;
  }
}

Working Case 2

We can invert the if by inverting the condition and swapping the true/then case and false/else case.

// WORKS (still valid, because our transform is valid)
export let mod(n: Int, d: Int): Int | Bubble {
  if (!(d == 0)) {
    return n % d;
  } else {
    bubble();
  }
}

Broken Case

We use the fact that the true/then case always returns (or calls bubble() in theory) to make it a guard-like statment. Note that an actual guard statement would check for bad values and then fail; We not check for good values and then succeed.

Either way here's the new code.

// DOES NOT WORK (still valid, because our transform is also valid, but bugged)
export let mod(n: Int, d: Int): Int | Bubble {
  if (!(d == 0)) {
    return n % d;
  }
  bubble();
}

Generated Code

Here's the python output (all comments are added after compilation to show the issue).

from builtins import int as int19, RuntimeError as RuntimeError16
from temper_core import Label as Label22
def mod(n_1: 'int19', d_2: 'int19') -> 'int19':
  with Label22() as fn_3:
    if d_2 != 0:
      return_0 = n_1 % d_2
      fn_3.break_()
    raise RuntimeError16()

The Issue

This code now shows the bug on all backends; This has to be an issue with our Intermediate representation, TmpL.

My Solution

My proposal is to stop using a break after the return variable assignment and instead to use a return after declaring the return varaible. This helps in several ways.

Other Things

Fix the control flow around if. I bet this is still a good idea if my solution is adopted.

tjpalmer commented 2 months ago

This issue is breaking temper-regex-engine right now, and it used to work. I suspect it's a somewhat recently introduced bug. Maybe the new decompiler has a hole in it somewhere.

mikesamuel commented 2 months ago

Below is the IR code at the end of GenerateCode. It looks like any mistransformation is probably during TmpL translation.

        @reach(\none) let mod1__0, @reach(\none) mod2__0, @reach(\none) mod3__0;
        mod1__0 = (@stay fn mod1(n__0 /* aka n */: Int, d__0 /* aka d */: Int) /* return__0 */: (Int | Bubble) {
            var fail#0;
            if (d__0 == 0) {
              bubble()
            } else {
              return__0 = hs(fail#0, n__0 % d__0);
              if (fail#0) {
                bubble()
              }
            }
        });
        mod2__0 = (@stay fn mod2(n__1 /* aka n */: Int, d__1 /* aka d */: Int) /* return__1 */: (Int | Bubble) {
            var fail#1;
            if (!(d__1 == 0)) {
              return__1 = hs(fail#1, n__1 % d__1);
              if (fail#1) {
                bubble()
              }
            } else {
              bubble()
            }
        });
        mod3__0 = (@stay fn mod3(n__2 /* aka n */: Int, d__2 /* aka d */: Int) /* return__2 */: (Int | Bubble) {
            fn__0: do {
              var fail#2;
              if (!(d__2 == 0)) {
                return__2 = hs(fail#2, n__2 % d__2);
                if (fail#2) {
                  bubble()
                };
                break fn__0;
              };
              bubble()
            }
        })
mikesamuel commented 2 months ago

Here's the TmpL translation:

            //// foo.temper => foo.tmpl
            let nym`==#32` = builtins.nym`==` /* fn (lhs: Int | Null, rhs: Int | Null): Boolean */;
            let nym`%#33` = builtins.nym`%` /* fn (x: Int, y: Int): Int | Bubble */;
            @reach(\none) let mod1__0(n__0: Int, d__0: Int): Int | Bubble {
              let return__0: Int;
              @fail var fail#0: Boolean;
              if (nym`==#32`(d__0, 0)) {
                return failure;
              } else {
                return__0 = hs (fail#0, nym`%#33`(n__0, d__0));
                if (fail#0) {
                  return failure;
                }
              }
              return return__0;
            }
            @reach(\none) let mod2__0(n__1: Int, d__1: Int): Int | Bubble {
              let return__1: Int;
              @fail var fail#1: Boolean;
              if (! nym`==#32`(d__1, 0)) {
                return__1 = hs (fail#1, nym`%#33`(n__1, d__1));
                if (fail#1) {
                  return failure;
                }
              } else {
                return failure;
              }
              return return__1;
            }
            @reach(\none) let mod3__0(n__2: Int, d__2: Int): Int | Bubble {
              fn__0: {
                @fail var fail#2: Boolean;
                if (! nym`==#32`(d__2, 0)) {
                  return__2 = hs (fail#2, nym`%#33`(n__2, d__2));
                  if (fail#2) {
                    return failure;
                  }
                  break fn__0;
                }
                return failure;
              }
            }

So it looks like the problem is in TmpL, specifically, the return variable is assigned, but it's not declared and no return return__2 statement is appended.

mikesamuel commented 2 months ago

Ok, I've narrowed the problem down to CanControlFlowLeave