WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0

DAE Improvement idea #4941

Open MaxGraey opened 2 years ago

MaxGraey commented 2 years ago

Currently, the DAE pass creates only one truncated copy of a function, for the case where one of its arguments is always passed the same constant value. This works well for functions with default values, but there is also another class of functions where, for example, an input argument parametrizes the execution strategy and often has fast paths. For example (pseudocode):

function pow(x: f64, y: f64): f64 {
   if (y == 0.0) return 1.0;
   if (y == 1.0) return x;
   if (y == 2.0) return x * x;

   // use complex computations
}

pow(a, 0.0);
pow(a, 1.0);
pow(a, 2.0);
pow(a, 3.5);
pow(a, b);

For such cases DAE could also work very well if its heuristics were improved to make truncated copies for all constant values found in "if" and "switch"/"case" conditions. An improved DAE could then create four specialized, truncated copies:

function powY0(): f64 {
  return 1.0;
}

function powY1(x: f64): f64 {
  return x;
}

function powY2(x: f64): f64 {
  return x * x;
}

function powY3_5(x: f64): f64 {
  // without fast paths

  // use complex computations specialized for y = 3.5
}

function pow(x: f64, y: f64): f64 {
  // original
}

powY0(); // -> 1.0 after inlining
powY1(a); // -> a after inlining
powY2(a); // -> a * a after inlining
powY3_5(a);
pow(a, b);
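
A minimal sketch of the call-site rewriting this would imply, written as a toy model rather than as Binaryen code. Everything here is hypothetical and only for illustration: the `Arg` type, the `planSpecializations` helper, and the naming scheme are assumptions, not part of DAE. The idea is that a call whose argument is a known constant is redirected to a specialized copy, while a call with a runtime value keeps targeting the original function.

```typescript
// An argument at a call site is either a known constant or a runtime value.
type Arg = { kind: "const"; value: number } | { kind: "dynamic" };

// Hypothetical helper (not a Binaryen API): for the parameter at `paramIndex`,
// map every call site to the name of the function it would call afterwards.
function planSpecializations(
  name: string,
  paramName: string,
  paramIndex: number,
  callSites: Arg[][],
): string[] {
  return callSites.map((args) => {
    const arg = args[paramIndex];
    // Runtime values cannot be specialized: keep calling the original.
    if (arg.kind !== "const") return name;
    // Encode the constant in the copy's name, e.g. 3.5 becomes "3_5".
    const suffix = String(arg.value).replace(/[^0-9A-Za-z]/g, "_");
    return `${name}${paramName}${suffix}`;
  });
}

// The five calls from the example above: pow(a, 0.0) ... pow(a, b).
const dyn: Arg = { kind: "dynamic" };
const c = (value: number): Arg => ({ kind: "const", value });
const plan = planSpecializations("pow", "Y", 1, [
  [dyn, c(0.0)],
  [dyn, c(1.0)],
  [dyn, c(2.0)],
  [dyn, c(3.5)],
  [dyn, dyn],
]);
console.log(plan.join(", "));
```

A real pass would also need a cost heuristic for deciding when a copy is worth its code size, which this sketch leaves out.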

WDYT?

kripken commented 2 years ago

This might make sense, yes. It's like monomorphization, though that usually refers to specialization on types rather than on values as here.

However, inlining will achieve similar results if the called function is small enough, so I'm not sure how much that would help in practice on real-world code.

MaxGraey commented 2 years ago

> However, inlining will achieve similar results if the called function is small enough, so I'm not sure how much that would help in practice on real-world code.

Yes, if pow is relatively small, but most of the time that is not the case. That's, btw, why this pass would be better run after the inlining pass, but not as the final pass.