To improve the performance/scalability of the comparative analysis, we want to break pairs of functions into pairs of sub-graphs of the CFGs which should have equivalent behaviors.
Following some insightful suggestions by Tristan Ravitch, we would like to proceed as follows:
Recursively descend into both CFGs from the starting point, and at each branch, create a new subroutine up to that point, and check equivalence of those subroutines. Then build a correspondence between branch conditions, so that we know how to match up the subsequent pairs of blocks.
Then the algorithm might end up doing the following:
Compare do_something1 and do_something2 for full program equivalence (registers and memory)
Check that (after executing do_something1) cond1 => cond2 or alternately cond1 => !cond2
In the first case, assume equal inputs, and check that branch_1_1; finish1 and branch_2_1;finish2 are fully equivalent; in the second case branch_1_1; finish1 and branch_2_2; finish2 etc.
Conclude that the two programs are fully equivalent.
Issues:
This requires full memory equivalence, which we have noted can be very costly. Again, the suggestion here was to heuristically gather symbolic or concrete locations we believe are the writes, and check equivalence for only these locations, along with checking that writes are only indeed made at these locations.
This does not easily allow for properties other than full equivalence. More thought is needed here, but perhaps we can handle equivalence of some initial segment, and simplify the comparative analysis by restricting to the non-equivalent sub-sections of the CFG.
To improve the performance/scalability of the comparative analysis, we want to break pairs of functions into pairs of sub-graphs of the CFGs which should have equivalent behaviors.
Following some insightful suggestions by Tristan Ravitch, we would like to proceed as follows:
Recursively descend into both CFGs from the starting point, and at each branch, create a new subroutine up to that point, and check equivalence of those subroutines. Then build a correspondence between branch conditions, so that we know how to match up the subsequent pairs of blocks.
For example, looking at a pair of functions
Then the algorithm might end up doing the following:
do_something1
anddo_something2
for full program equivalence (registers and memory)do_something1
)cond1 => cond2
or alternatelycond1 => !cond2
branch_1_1; finish1
andbranch_2_1;finish2
are fully equivalent; in the second casebranch_1_1; finish1
andbranch_2_2; finish2
etc.Conclude that the two programs are fully equivalent.
Issues:
This requires full memory equivalence, which we have noted can be very costly. Again, the suggestion here was to heuristically gather symbolic or concrete locations we believe are the writes, and check equivalence for only these locations, along with checking that writes are only indeed made at these locations.
This does not easily allow for properties other than full equivalence. More thought is needed here, but perhaps we can handle equivalence of some initial segment, and simplify the comparative analysis by restricting to the non-equivalent sub-sections of the CFG.