google / souper

A superoptimizer for LLVM IR
Apache License 2.0
2.13k stars 170 forks source link

sclang crash - inst operands are constants! #486

Open pranavk opened 5 years ago

pranavk commented 5 years ago

Here's a reduced file (reduced using creduce)

int a, b;
int c() {

  switch (b) {
  case 4:
    for (; a > 0; --a)
      ;
  default:
    for (; 0 < a;)
      ;
  }
  return 0;
}

Compile like this: ./sclang -c -O3 -mllvm -souper-synthesis-comps=add,sub -mllvm -souper-infer-inst /tmp/tmp.c

mateon1 commented 5 years ago

The issue is that "junk removal" simplifies a synthesized instruction down to a constant, which later hits the sanity check. Here's an example: (not at all minified)

%0 = block 2
%1:i64 = var
%2:i64 = var
%3:i64 = var
%4:i64 = xor %2, %3
%5:i64 = add %1, %4 (hasExternalUses)
%6:i64 = fshl %4, %4, 16:i64
%7:i64 = xor %5, %6
%8:i64 = var
%9:i64 = var
%10:i64 = add %8, %9 (hasExternalUses)
%11:i64 = fshl %10, %10, 32:i64
%12:i64 = add %7, %11 (hasExternalUses)
%13:i64 = fshl %7, %7, 21:i64
%14:i64 = xor %12, %13 (hasExternalUses)
%15:i64 = phi %0, 6409219367911750826:i64, %14
infer %15

Needs -souper-synthesis-comps=const,sub,xor, but this can be triggered without a const component! I just lost that case in the scrollback among thousands of lines of debug output, and it takes a good half hour to manifest from scratch anyway.

I can't get souper-check to output synthesis debug output, so here's the tail of the log from the actual program (souper called via llvm's opt) that included this snippet:

synthesizing using 2 component(s)
6 <= 3_0 (output,i64) < 8
setting input in_0_1 to 0
setting input in_0_4 to 0
setting input in_0_5 to 0
setting input in_0_2 to 0
setting input in_0_3 to 0
setting input in_0_1 to 1
setting input in_0_4 to 1
setting input in_0_5 to 1
setting input in_0_2 to 1
setting input in_0_3 to 1
setting input in_0_1 to 3
setting input in_0_4 to 3
setting input in_0_5 to 3
setting input in_0_2 to 3
setting input in_0_3 to 3
setting input in_0_1 to 2
setting input in_0_4 to 2
setting input in_0_5 to 2
setting input in_0_2 to 2
setting input in_0_3 to 2
setting input in_0_1 to 7
setting input in_0_4 to 6
setting input in_0_5 to 7
setting input in_0_2 to 7
setting input in_0_3 to 7
setting input in_0_1 to 15
setting input in_0_4 to 14
setting input in_0_5 to 15
setting input in_0_2 to 15
setting input in_0_3 to 15
solving synthesis constraint.. SAT
line    locations
0   0_1 (input,i64) 0_2 (input,i64) 0_3 (input,i64) 0_5 (input,i64) 
1   0_4 (input,i64) 1_1 (sub,i64) 1_2 (sub,i64) 
2   0_6 (const,i64) 2_2 (xor,i64) 
6   1_0 (sub,i64) 2_1 (xor,i64) 
7   2_0 (xor,i64) 3_0 (output,i64) 
found valid wiring, output 2_0 (xor,i64).
creating program from wiring
- starting with OutLoc 2_0 (xor,i64), OpLocs { 2_1 (xor,i64) 2_2 (xor,i64) }
- looking for OpLoc wiring 2_1 (xor,i64)
- found wiring input on line 6, taking 1_0 (sub,i64)
- continue with OutLoc 1_0 (sub,i64), OpLocs {1_1 (sub,i64) 1_2 (sub,i64) }
- looking for OpLoc wiring 1_1 (sub,i64)
- found wiring input on line 1, taking 0_4 (input,i64)
- continue with OutLoc 0_4 (input,i64), OpLocs {}
- creating input inst 0_4 (input,i64)
- looking for OpLoc wiring 1_2 (sub,i64)
- found wiring input on line 1, taking 0_4 (input,i64)
- continue with OutLoc 0_4 (input,i64), OpLocs {}
- creating input inst 0_4 (input,i64)
- creating inst sub, width 64
before junk removal:
%16:i64 = sub %8, %8
result %16
- looking for OpLoc wiring 2_2 (xor,i64)
- found wiring input on line 2, taking 0_6 (const,i64)
- continue with OutLoc 0_6 (const,i64), OpLocs {}
- creating constant inst 0_6 (const,i64) with value 6409219367911750826
- creating inst xor, width 64
before junk removal:
%17:i64 = xor 0:i64, 6409219367911750826:i64
result %17
LLVM ERROR: inst operands are constants!
regehr commented 5 years ago

this is a weird one since if the result is just a constant, we should have synthesized that first, before trying to synthesize any instructions. not sure why that didn't work here. @rsas do you know what's up here?

mateon1 commented 5 years ago

This seems to be some mismatch between the solver constraints at synthesis and at counterexample checking, the IR within this method hits some edge-case where the synthesizer wants to synthesize a constant, but then the counterexample query just returns zeros for all vars. After I disable the "inst operands are constants" check, I often see 50+ refinements before synthesis gives up for that iteration and continues with more components/other IR.

mateon1 commented 5 years ago

I think I see the issue (but not 100% certain since I don't fully understand the synthesis constraints). At synthesis, blockpred constants decide which path is taken at each phi node, BUT - the solver can choose ANY blockpreds at synthesis - they are not part of the inputs/counterexamples, so if the result is a phi node that chooses between a computation and a constant, the synthesizer will ALWAYS return that constant, since it can just choose the blockpreds that lead to that constant.

The solution would ideally be to encode a "forall" constraint over blockpreds (which is actually doable since blockpreds are one-bit bitvecs), but that's probably impractical (might blow up the constraint size exponentially). The other solution is attaching the chosen blockpreds to each counterexample, but I'm not sure how much code that would affect.