Open pfalcon opened 6 years ago
2.
A distinguishing feature of PseudoC is that it allows complex-ish expressions in assignments, not just 3-address expressions, to model CISC and other ad hoc features, e.g.:
$eax = *(u32*)($ebx + $ecx * 8 + 3)
But that also poses a problem, because SABl sees that expression as a whole and can't propagate subexpressions of it. Sometimes that leads to obvious problems. For example,
$a3 = UINT64($a7, $a6) >> $SAR
would be better written as:
$a7_a6 = UINT64($a7, $a6)
$a3 = (u64)$a7_a6 >> $SAR
That would make an implicit point where we get a 64-bit vreg, and that would allow expressions to be simplified much better.
One can argue that this is a problem with the input PseudoC, but again, it's a distinguishing feature that PseudoC allows mapping a single machine instruction to a single PseudoC statement for as wide a range of architectures as possible. So, instead, there should be a "deconstruction" pass in SABl itself.
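To make the idea concrete, here is a minimal sketch of what such a "deconstruction" pass could look like. It is not SABl's actual code: the tuple-based expression encoding, the `deconstruct` function, and the `hoist_ops` parameter are all assumptions made for illustration, and the `(u64)` cast from the example above is left out for brevity.

```python
import itertools

def deconstruct(stmts, hoist_ops=("UINT64",)):
    """Hoist selected subexpressions into their own assignments.

    Hypothetical encoding: a statement is (dest, expr); an expression is a
    register name (str), a constant (int), or a tuple (op, arg1, arg2, ...).
    """
    out = []
    tmp_ids = itertools.count()

    def walk(expr, top):
        if not isinstance(expr, tuple):
            return expr
        op, *args = expr
        # Handle inner subexpressions first, so nested candidates get hoisted too.
        new = (op, *(walk(a, False) for a in args))
        if op in hoist_ops and not top:
            if all(isinstance(a, str) for a in new[1:]):
                # UINT64($a7, $a6) -> $a7_a6
                name = "$" + "_".join(a.lstrip("$") for a in new[1:])
            else:
                name = "$tmp%d" % next(tmp_ids)
            out.append((name, new))
            return name
        return new

    for dest, expr in stmts:
        rewritten = walk(expr, True)
        out.append((dest, rewritten))
    return out

# $a3 = UINT64($a7, $a6) >> $SAR
stmts = [("$a3", (">>", ("UINT64", "$a7", "$a6"), "$SAR"))]
result = deconstruct(stmts)
```

With this input, `result` holds two statements: the assignment of UINT64($a7, $a6) to $a7_a6, followed by $a3 = $a7_a6 >> $SAR. An expression that is already just UINT64(...) at the top level is left alone, since there is nothing to split off.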
3.
Extending that "back and forth processing" idea further. 1. says "Complex expressions should not be propagated to multiple places, as that makes the code more complex." That's of course not true. For example, suppose we have $r1 = $r0 + 1, and that can be propagated into 2 places. Should that be done? The naive answer is "no". But the answer is "yes" if one of the places to propagate into is $r1 - 1, because ($r0 + 1) - 1 then simplifies to just $r0.
It's hard to tell whether a particular propagation is useful or not. So the only general approach is to propagate eagerly and widely, and then have a CSE pass to undo any "useless" propagations.
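The propagate-then-CSE approach can be sketched roughly as follows. This is only an illustration under strong assumptions, not SABl's implementation: straight-line code where every register is assigned once, a hypothetical tuple encoding of expressions, and a `simplify` that knows a single rewrite rule, (x + c) - c -> x.

```python
def subst(expr, env):
    """Replace registers by the expressions they are known to hold."""
    if isinstance(expr, str):
        return env.get(expr, expr)
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *(subst(a, env) for a in args))
    return expr  # integer constant

def simplify(expr):
    """Toy simplifier with one rule: (x + c) - c -> x."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(a) for a in args]
    if (op == "-" and len(args) == 2
            and isinstance(args[0], tuple) and len(args[0]) == 3
            and args[0][0] == "+" and args[0][2] == args[1]):
        return args[0][1]
    return (op, *args)

def propagate(stmts):
    """Eagerly propagate every definition into all later uses."""
    env, out = {}, []
    for dest, expr in stmts:
        expr = simplify(subst(expr, env))
        env[dest] = expr
        out.append((dest, expr))
    return out

def cse(stmts):
    """Undo propagations that did not pay off: reuse a register that
    already holds an identical subexpression."""
    avail, out = {}, []

    def fold(expr, top):
        if not isinstance(expr, tuple):
            return expr
        if not top and expr in avail:
            return avail[expr]
        op, *args = expr
        return (op, *(fold(a, False) for a in args))

    for dest, expr in stmts:
        expr = fold(expr, True)
        out.append((dest, expr))
        if isinstance(expr, tuple):
            avail.setdefault(expr, dest)
    return out

# $r1 = $r0 + 1;  $r2 = $r1 - 1;  $r3 = $r1 * 2
stmts = [("$r1", ("+", "$r0", 1)),
         ("$r2", ("-", "$r1", 1)),
         ("$r3", ("*", "$r1", 2))]
propagated = propagate(stmts)
final = cse(propagated)
```

Here propagation turns $r2 into plain $r0 (the useful case), while $r3 temporarily becomes ($r0 + 1) * 2; the CSE step then folds that back to $r1 * 2, undoing the propagation that brought no benefit.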
The most obvious one: