Open gbody opened 5 years ago
Just for clarity: is this a regression or new behavior? The root cause is this:
move.w d3,-(a7) ; push only the low word of d3
...
move.w 8(a7),d3 ; trash the low word of d3
move.w 10(a7),d3 ; trash it again
...
move.w (a7)+,d3 ; restore
which Reko translates to (ignoring all the PSW flags):
sp_1 = fp - 2
wLoc02_2 = (word16) d3
d3_3 = DPB(d3, wArg0A, 0)
d3_4 = DPB(d3_3, wArg0C, 0)
d3_5 = DPB(d3_4, wLoc02_2, 0)
sp_6 = fp
As you can see, there is a chain of DPB expressions that are preserving the high 16-bits of d3. Reko's value propagation algorithm is failing to discover that those high bits could be propagated:
sp_1 = fp - 2
wLoc02_2 = (word16) d3
d3_3 = DPB(d3, wArg0A, 0)
d3_4 = DPB(d3, wArg0C, 0)
d3_5 = DPB(d3, (word16) d3, 0)
sp_6 = fp
That final d3_5 = DPB(d3, (word16) d3, 0)
is a no-op and can be replaced with d3_5 = d3
. Once that is done, later stages would eliminate wLoc02_2 = (word16) d3
and you would no longer see d3
live-in to your procedure.
This should be a simple fix.
I spoke too soon: the "real" reason why d3
is kept alive is because of the following phi
nodes:
wLoc02_2 = (word16) d3
if (something)
d3_3 = DPB(d3, wArg0A, 0)
else
d3_4 = DPB(d3, wArg0C, 0)
d3_5 = PHI(d3_3, d3_4)
d3_6 = DPB(d3_5, (word16) d3, 0)
The PHI
statement makes it much harder to prove that the high words of the DPB
s are not modified as it hides the 'dpb-ness' of d3_3
and d3_4
.
Perhaps the following approach would work. Consider d3_6
, which is defined by a DPB(d3_5, ...)
. We want to propagate the knowledge that the high word of d3_5
is the same as the high word of d3
. Since d3_5
is defined as the PHI
of d3_3
and d3_4
, we can perform a transitive traversal of d3_3
and d3_4
. We would discover all the reaching definitions DPB(d3, wArg0A, 0)
and DPB(d3, wArg0C, 0)
have the same high word d3
. That would allow Reko to substitute the d3_5
with d3
, yielding:
wLoc02_2 = (word16) d3
if (something)
d3_3 = DPB(d3, wArg0A, 0)
else
d3_4 = DPB(d3, wArg0C, 0)
d3_5 = PHI(d3_3, d3_4)
d3_6 = DPB(d3, (word16) d3, 0)
which simplifies to:
wLoc02_2 = (word16) d3
if (something)
d3_3 = DPB(d3, wArg0A, 0)
else
d3_4 = DPB(d3, wArg0C, 0)
d3_5 = PHI(d3_3, d3_4)
d3_6 = d3
The notion of "transitive closure of all reaching PHI nodes" occurs elsewhere in the decompiler. An example is in the ConditionCodeEliminator
, which needs to perform a transitive closure of all PHI nodes and assignments to track which statements are setting the condition codes we're trying to eliminiate.
A special DPB
specific analysis could make use of the "transitive closures" to discover this. I'd still have to do some more research to determine whether this could be generalized to also cover ConditionCodeEliminator
, as I'd rather have general algorithms than special purpose ones wherever possible. Ideally, this transformation would be part of the ValuePropagator
transformation.
After look at this some more, it's beginning to dawn upon me that DPB
is not very suitable for program analysis. It's making the analyses very awkward because it's not as composable as SEQ
. That is, analyzing:
move.w 42(a3),d3
is easier to deal with if the resulting IR is:
tmpLo = Mem0[a5 + 42:word16]
tmpHi = SLICE(d3, word16, 16)
d3 = SEQ(tmpHi, tmpLo)
rather than today's
tmpLo = Mem0[a5 + 42:word16]
d3 = DPB(d3, tmpLo, 0)
The proposed change actually mirrors what is happening with stack-based variables, where no DPB
s are generated. Making the change would require changes to the SSA transformer for register variables. I will have to continue my experiments to see if I can resolve the issue.
Using the the analysis-development branch I am seeing the issue where D3 is being included in the procedure declaration in linkiigs at 001056f8 INITCURSORCTL. It appears that D3 is pushed onto the stack at the start of the function and restored from the stack at the end of the function. It appears that D3 is only being used after having a value assigned.
I have included a project file for linkiigs, that has all but four of the procedures named and matched against library procedures/functions.
LinkIIGS.Hqx-2.zip