Open pgoodman opened 3 years ago
`post_inst_reg_state` is basically reaching definitions :-)
I think it'd also be interesting to do a form of equality saturation over the IRs. This would be beneficial for converting loads/stores into things like typed field accesses.
This issue serves to track ideas and goals related to integrating instruction semantics.
Basic example:
From this, we'd like to create the following approximate relations:
In the above, things like `REG_EDI` would be the unique IDs of values for the registers. So `raw_operation` probably wouldn't start at `0`.

From here, we'd want rules that do some basic things like copy propagation. This means defining the post-instruction register state. These rules would exist in a separate Datalog db, with one instance per function. The idea would be to run these, have them publish messages that we'd store back into the main db, then destroy these instances until they're needed again. Key idea: throwaway databases.
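To make the throwaway-database lifecycle concrete, here is a rough Python sketch, not the actual Datalog rules. Everything in it is an assumption for illustration: the `(ea, dst, src)` move facts, the `REG@entry` value IDs, the `throwaway_db` helper, and the plain dict standing in for the main db. It only handles straight-line register-to-register copies.

```python
from contextlib import contextmanager

# Hypothetical fact shape: (ea, dst_reg, src_reg) means "dst_reg := src_reg" at ea.
MOVES = [
    (0x1000, "ESI", "EDI"),
    (0x1002, "EAX", "ESI"),
    (0x1004, "EDI", "EAX"),
]

@contextmanager
def throwaway_db(main_db, func_ea):
    """Per-function scratch store: publish derived facts, then drop everything."""
    scratch = []
    yield scratch
    # Only reached if the analysis finished without raising.
    main_db.setdefault(func_ea, []).extend(scratch)
    scratch.clear()

def post_inst_reg_state(moves):
    """Yield (ea, reg, value_id) for each register known after each instruction."""
    state = {}
    for ea, dst, src in moves:
        # Copy propagation: dst takes whatever value src currently holds.
        state[dst] = state.get(src, f"{src}@entry")
        for reg, val in sorted(state.items()):
            yield (ea, reg, val)

main_db = {}
with throwaway_db(main_db, 0x1000) as scratch:
    scratch.extend(post_inst_reg_state(MOVES))
```

After the `with` block, `main_db[0x1000]` holds the published facts and the scratch store is empty, which is the "run, publish, destroy" cycle described above.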
I think with the simple rules above, execution would converge toward the following:
For every point in a function, we would be able to express register values in terms of register values from another place in the function. This is something that the GrammaTech people mentioned in their ddisasm paper. Basically, we could say: there exists a path such that the register written at instruction EA1 is read by the instruction at EA2.
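The "there exists a path" relation is essentially a reaching-definitions analysis. A minimal worklist sketch in Python follows; the `CFG` layout (each EA mapped to registers written, registers read, and successor EAs) is a made-up encoding for illustration, not the schema proposed here.

```python
from collections import defaultdict

# Hypothetical CFG: ea -> (regs written, regs read, successor eas).
CFG = {
    0x10: ({"EAX"}, set(), [0x12]),
    0x12: (set(), {"EAX"}, [0x14, 0x16]),
    0x14: ({"EAX"}, set(), [0x16]),
    0x16: (set(), {"EAX"}, []),
}

def reaching_defs(cfg):
    """Return {(def_ea, reg, use_ea)}: some path carries the write to the read."""
    in_defs = defaultdict(set)  # ea -> {(reg, def_ea)} reaching the entry of ea
    work = list(cfg)
    while work:
        ea = work.pop()
        writes, _, succs = cfg[ea]
        # Kill older definitions of registers written here, then add our own.
        out = {(r, d) for (r, d) in in_defs[ea] if r not in writes}
        out |= {(r, ea) for r in writes}
        for s in succs:
            if not out <= in_defs[s]:
                in_defs[s] |= out
                work.append(s)  # successor saw new facts; revisit it
    return {
        (d, r, ea)
        for ea, (_, reads, _) in cfg.items()
        for (r, d) in in_defs[ea]
        if r in reads
    }
```

In this toy CFG the read at `0x16` is reached by both the write at `0x10` (via the edge that skips `0x14`) and the write at `0x14`.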
I think one thing that becomes apparent from this type of "raw operation" representation is that we'd want to represent conditional register writes that either preserve the register contents or alter them. That way, we can model those data-centric flows.
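One possible way to model this, sketched in Python under assumed names (`Write`, `step`, and the `REG@entry` value IDs are all hypothetical): track a set of possible value IDs per register, and let a conditional write (for example an x86 `CMOVcc`) add its value without discarding the old one.

```python
from typing import NamedTuple

class Write(NamedTuple):
    ea: int
    reg: str
    value: str                 # hypothetical value ID
    conditional: bool = False  # True for CMOVcc-style writes

def step(state, w):
    """Return the post-instruction state; state maps reg -> set of value IDs."""
    new = dict(state)
    old = state.get(w.reg, {f"{w.reg}@entry"})
    # A conditional write may preserve the old contents, so keep both;
    # an unconditional write replaces them.
    new[w.reg] = (old | {w.value}) if w.conditional else {w.value}
    return new
```

For example, an unconditional write of `v1` to `EAX` followed by a conditional write of `v2` leaves `EAX` with the possible values `{"v1", "v2"}`.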