Closed Escapingbug closed 3 years ago
Here's what I got.
We already have some of the implementation for:
- `getPcodePacked()`

Pcode injections could happen in two cases (that we should care about):
- `callfixup` injection
- `callotherfixup` injection

By inspecting the implementation of these two injections, we might get an idea of how the pcode patching should be implemented.
Call fixup injection is for "modeling" the called function with pcodes. So, callfixup injection can only happen at a "call" or "callind" instruction.
Callotherfixup is for implementing custom `userop`s, and it can be distinguished by the pcode op `CPUI_CALLOTHER`.
CPP part of the implementation is:
- `FlowInfo` is the class that translates instructions (and pcodes). Within `FlowInfo`, the `injectlist` contains all the pcode ops that need injection (check `flow.cc/hh`).
- In `FlowInfo::checkForFlowModification`, `isInline()` is checked to see if any callfixup should be done (to inline the subfunction). If so, that is recorded in the `injectlist`.
- For `CallOtherFixup`, `FlowInfo::xrefControlFlow` analyzes the control flow by inspecting the opcode, and when `CPUI_CALLOTHER` is encountered (flagging a callother fixup situation), the op is recorded in the `injectlist`.
- `FlowInfo::injectPcode` goes through all `injectlist` items and does the real injection.
  - `FlowInfo::doInjection()` when the payload is found.
  - `glb->pcodeinjectlib`, which is in the `PcodeInjectLibrary`.

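To make the record-then-inject flow above easier to follow, here is a toy model of it. It is only a sketch in Python, not the decompiler code: the opcode names follow the ones mentioned above, while `Op`, `Flow`, `collect`, `inject` and the `payloads` dictionary are made-up names for illustration. It also simplifies things (for example, every `CALLOTHER` is recorded here).

```python
from dataclasses import dataclass, field

# Opcode names mirror the ones discussed above; everything else is a stand-in.
CPUI_CALL, CPUI_CALLIND, CPUI_CALLOTHER, CPUI_COPY = "CALL", "CALLIND", "CALLOTHER", "COPY"

@dataclass
class Op:
    opcode: str
    addr: int
    inline: bool = False  # stands in for the isInline() check on a call


@dataclass
class Flow:
    ops: list
    payloads: dict  # hypothetical: addr -> list of replacement opcode names
    injectlist: list = field(default_factory=list)

    def collect(self):
        """Phase 1: remember which ops need injection, without changing them."""
        for op in self.ops:
            if op.opcode == CPUI_CALLOTHER:
                self.injectlist.append(op)
            elif op.opcode in (CPUI_CALL, CPUI_CALLIND) and op.inline:
                self.injectlist.append(op)

    def inject(self):
        """Phase 2: replace every recorded op for which a payload is found."""
        out = []
        for op in self.ops:
            payload = self.payloads.get(op.addr) if op in self.injectlist else None
            if payload is None:
                out.append(op)
            else:
                out.extend(Op(name, op.addr) for name in payload)
        return out


flow = Flow(
    ops=[
        Op(CPUI_COPY, 0x1000),
        Op(CPUI_CALL, 0x1004, inline=True),
        Op(CPUI_CALLOTHER, 0x1008),
        Op(CPUI_CALL, 0x100C),
    ],
    payloads={0x1004: ["INT_ADD", "COPY"], 0x1008: ["INT_XOR"]},
)
flow.collect()
result = [op.opcode for op in flow.inject()]
```

The point of the two phases is that nothing is rewritten while the control flow is still being analyzed. The ops are only marked first, and replaced later in one pass.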
So, if we are going to implement pcode patching, what we should do on the cpp side is:
- Add a new injection type (besides `callotherfixup` or `callfixup`).
- Support it in `PcodeInjectLibrary`.
- Record the patched ops in the `injectlist`. Those ops should be found using an address given by the Java side, or something similar.
- Add a case to `injectPcode`, i.e., the else case other than `CPUI_CALLOTHER`, `CPUI_CALL` and `CPUI_CALLIND`. Find the payload and inject. A proper imitation of `FlowInfo::injectSubFunction` is desired.

To get the injections from the Java side, some modifications to `PcodeInjectLibraryGhidra` are required (in `inject_ghidra.hh/cc`).
Here we get the inject library from the Java side.
Previously we only had callfixups and callotherfixups.
Now we need one more.
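A rough sketch of what "one more" could mean for the dispatch: a third kind of injection next to the two existing ones. This is Python used as pseudocode, not the cpp implementation. `should_record`, `injection_kind`, the `"pcodepatch"` label and the `patched` address set are hypothetical names, and I assume the Java side hands over the patched addresses.

```python
CPUI_CALL, CPUI_CALLIND, CPUI_CALLOTHER = "CALL", "CALLIND", "CALLOTHER"


def should_record(opcode, addr, is_inline, patched_addrs):
    """Decide whether an op goes into the inject list."""
    if opcode == CPUI_CALLOTHER:
        return True
    if opcode in (CPUI_CALL, CPUI_CALLIND):
        return is_inline
    # New: any other op at an address the Java side asked to patch.
    return addr in patched_addrs


def injection_kind(opcode):
    """Dispatch used at injection time for an op taken from the inject list."""
    if opcode == CPUI_CALLOTHER:
        return "callotherfixup"
    if opcode in (CPUI_CALL, CPUI_CALLIND):
        return "callfixup"
    return "pcodepatch"  # the new else case


patched = {0x2000}  # assumed to come from the Java side
ops = [
    ("COPY", 0x1FFC, False),
    ("INT_ADD", 0x2000, False),
    ("CALL", 0x2004, True),
    ("CALLOTHER", 0x2008, False),
]
injectlist = [(opc, addr) for opc, addr, inl in ops if should_record(opc, addr, inl, patched)]
kinds = [injection_kind(opc) for opc, _ in injectlist]
```

The existing two branches stay as they are. The patch case only catches ops that neither of them handles.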
The Java side interacts with the cpp side through `DecompileProcess`. In `readResponse`, we can see the back-calls (cpp-to-Java calls), and clearly see the `getCallFixup`, `getCallotherFixup` and `getCallMech` responses.
This means we also need a new protocol semantic for implementing our patching: something like `getPcodePatchFixup`, which both cpp and Java must understand.
Both the Java part (`DecompileProcess`) and the cpp part need to be modified for this.
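The back-call idea, reduced to its core, is a table from query names to handlers, with one new entry. The sketch below is Python and only illustrates the dispatch. The real protocol is a stream between two processes and is not modeled here. `QueryDispatcher`, `register`, the handler bodies and the payload strings are all invented for illustration; only the query names come from the discussion above.

```python
class QueryDispatcher:
    """Toy stand-in for the side that answers back-calls."""

    def __init__(self):
        self.handlers = {}

    def register(self, name, handler):
        self.handlers[name] = handler

    def read_response(self, name, *args):
        # Look up the query name sent by the other side and answer it.
        handler = self.handlers.get(name)
        if handler is None:
            raise ValueError(f"unknown query: {name}")
        return handler(*args)


# Hypothetical patch storage: address -> payload text.
patches = {0x2000: "<pcode patch payload for 0x2000>"}

dispatcher = QueryDispatcher()
dispatcher.register("getCallFixup", lambda name: f"<callfixup {name}>")
# The new query both sides would have to agree on.
dispatcher.register("getPcodePatchFixup", lambda addr: patches.get(addr))

reply = dispatcher.read_response("getPcodePatchFixup", 0x2000)
missing = dispatcher.read_response("getPcodePatchFixup", 0x3000)
```

An address without a patch returns nothing here, which the asking side would have to treat as "no injection at this op".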
The possible implementation of the negotiation procedure could be:
- `ArchitectureGhidra::getPcodeInject` (cpp): add one more inject payload type.
- `DecompileProcess` (java): add one more protocol parsing.

Now we should deal with how `getPcodeInject(PCODEPATCHFIXUP)` should be implemented (in Java).
Summing up a little, what we need:
- On `getPcodeInject`, we should be able to get the payload out.

In the newest version of the database (24, updated in Mar. 2021), it turns out the payload can already reside in the database (`ProgramDB.java`).
And the compiler spec has the `PcodeInjectLibrary` in it, which we could take advantage of.
So we just need to modify `PcodeInjectLibrary` to contain our type of fixup.
And we should allow dynamically adding injection payloads to our type of fixups recorded in `PcodeInjectLibrary` (I mean the Java one, same as above).
Till now, most of it should work. The rest of the job is to:
- `InjectPayload`

They should be simpler and should not cause much of a problem.
Closed due to project restructure. This should reside in bincraft_ghidra now.
This is required for more flexible IR arrangements.
The background is that, currently, the only way to modify the semantics of the program is through instruction patching. However, instruction patching has some drawbacks.
And, to be honest, those drawbacks are preventing strong analysis such as deobfuscating control flow flattening.
Obfuscations like control flow flattening rearrange the basic blocks. But because of the drawbacks mentioned, no rearrangement can be done in Ghidra (or IDA), at least not easily.
The solution to this problem is to allow pcode patching. That is, we allow the user to display the raw pcode and patch it.
What we need:
- `PcodeFormatter`.

The reason for the last two is that the pcode is not stored in the database and is lifted each time by the sleigh engine, as mentioned in this issue.
So maybe we could find some way to bypass the translation, remember the last lifted pcode, and use it for the pcode patching feature. Note that not all functions need their pcode stored, only the patched ones. Otherwise the database might explode in disk space.
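A minimal sketch of the "store only what was patched" idea, in Python. It assumes a lifter that can be called per function and returns a list of pcode lines. `PcodeStore`, `fake_lift` and the pcode strings are made up for the example and say nothing about how the database would actually hold the data.

```python
class PcodeStore:
    """Keeps pcode only for patched functions; everything else is lifted on demand."""

    def __init__(self, lifter):
        self.lifter = lifter  # callable: function address -> list of pcode lines
        self.patched = {}     # function address -> stored (patched) pcode

    def get(self, func_addr):
        # Patched functions bypass the lifter and use the stored copy.
        if func_addr in self.patched:
            return list(self.patched[func_addr])
        return self.lifter(func_addr)

    def patch(self, func_addr, index, new_op):
        # Start from the stored copy if present, else from a fresh lift.
        ops = self.get(func_addr)
        ops[index] = new_op
        self.patched[func_addr] = ops


calls = []  # records every time the (fake) lifter runs


def fake_lift(func_addr):
    calls.append(func_addr)
    return ["COPY r0, 1", "INT_ADD r0, r0, 2", "RETURN"]


store = PcodeStore(fake_lift)
store.patch(0x401000, 1, "INT_SUB r0, r0, 2")
patched_ops = store.get(0x401000)    # served from the store, no lift
untouched_ops = store.get(0x402000)  # lifted as usual, nothing stored
```

Only the one patched function ends up in `store.patched`, so the stored data grows with the number of patches and not with the size of the program.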