lifting-bits / remill

Library for lifting machine code to LLVM bitcode
Apache License 2.0

Potential Solution for CALL_RETs from delay slot restore in Sleigh SPARC #680

Closed 2over12 closed 11 months ago

2over12 commented 11 months ago

Opening this to start a conversation.

Alright, so here is a solution. The goal is to track a constant didrestore. If the condition of the CBRANCH is constant, the CBRANCH gets flipped to a BRANCH or ignored, depending on the value set by the delay slot instruction. This could fail if the delay-slotted instruction has intra-instruction control flow.
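To make the folding step concrete, here is a minimal sketch of turning a constant-condition CBRANCH into a BRANCH or dropping it. The Opcode, Op, and FoldConstantCBranches names are hypothetical and are not remill's or Ghidra's real types; the sketch also assumes the value of didrestore has already been resolved to a known boolean per CBRANCH.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified p-code model; not remill's or Ghidra's real types.
enum class Opcode { kCall, kIntEqual, kCBranch, kBranch, kReturn };

struct Op {
  Opcode opcode;
  // Only meaningful for kCBranch: the condition, if known at lift time.
  std::optional<bool> known_condition;
};

// Rewrites every CBRANCH whose condition is a known constant:
//   true  -> replaced by an unconditional BRANCH
//   false -> dropped, so execution falls through
// A CBRANCH with an unknown condition is kept unchanged.
std::vector<Op> FoldConstantCBranches(const std::vector<Op> &ops) {
  std::vector<Op> out;
  for (const Op &op : ops) {
    // Precondition of this sketch: only CBRANCH carries a known condition.
    assert(op.opcode == Opcode::kCBranch || !op.known_condition.has_value());
    if (op.opcode == Opcode::kCBranch && op.known_condition.has_value()) {
      if (*op.known_condition) {
        out.push_back({Opcode::kBranch, std::nullopt});
      }
      continue;
    }
    out.push_back(op);
  }
  return out;
}
```

This sketch treats the op list as straight-line code, so it does not model the failure case mentioned above, where the delay-slotted instruction has its own internal control flow.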

What I really don't like about this instruction is this:

        return this->RedirectControlFlow(bldr, *copy_inval,
                                         opc != CPUI_CALLIND);

This is forced because we can't terminate the block if we have pcode like:

CALL
INT_EQUAL
CBRANCH
RET

We need to allow the call effect to occur and return. If we start lifting INT_EQUAL after the terminator, we end up with a branch/exit followed by INT_EQUAL, i.e. an instruction after the block terminator.

But then I realized that these pcode semantics are "broken". When the callee returns, the pcode spec doesn't imply that execution resumes in the middle of the pcode. The semantics of CALL is just a branch to the address. This whole artifact only really works in decompilation, since the RET gets inlined nicely and optimized away as needed.

I think the appropriate solution here is to change the semantics. The semantics for call should just be:

delay slot stuff
CALL

Then we use the control flow override CALL_RET in anvill to perform the call and then the ret if needed. As it stands, this PR specializes remill towards a decompiler with specific behavior, which is not ideal.
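As a rough illustration of that split, the sketch below emits only the delay slot ops followed by CALL, and appends a return only when the consumer supplies an override. FlowOverride, kCallReturn, and LiftDelaySlottedCall are hypothetical names chosen for this example; they are not anvill's or Ghidra's actual API, and ops are modeled as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names for illustration; not anvill's or Ghidra's actual API.
enum class FlowOverride { kNone, kCallReturn };

// Builds the op sequence for a delay-slotted call under the proposed
// semantics: delay slot ops first, then a plain CALL. A RETURN is appended
// only when the consumer-supplied override asks for call-then-return.
std::vector<std::string> LiftDelaySlottedCall(
    const std::vector<std::string> &delay_slot_ops, FlowOverride flow) {
  std::vector<std::string> ops(delay_slot_ops);
  ops.push_back("CALL");
  // Invariant of this sketch: CALL sits directly after the delay slot ops.
  const std::size_t call_index = delay_slot_ops.size();
  assert(ops[call_index] == "CALL");
  if (flow == FlowOverride::kCallReturn) {
    ops.push_back("RETURN");
  }
  return ops;
}
```

The point of the design is that the instruction semantics carry no CBRANCH or RET; whether a return follows the call is a decision made by whoever supplies the override.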

Additionally, stuff like "the call returns past the delay slot" shouldn't really be a low-level semantics thing imo. It exists at the level of flows/disassembly etc., but if we are emulating the CPU, these semantics don't make sense.

2over12 commented 11 months ago

It's weird because I'd expect flow overrides to be sufficient here in Ghidra, but the decompiler seems not to pay attention to them.

i.e. see this instruction: [image]

I modified the specs accordingly to get rid of the cbranch+ret and the fallthrough is overridden to the appropriate address:

[Screenshot 2023-07-23 at 11:52:22 AM]

I would therefore expect the decompiler to stitch the call to the add instruction.

That doesn't happen though: [Screenshot 2023-07-23 at 11:53:12 AM]

I'd like to make Ghidra work this way though to make instruction semantics more in line with the actual spec rather than what happens to work with the decompiler...

Ninja3047 commented 11 months ago

closing since we aren't planning on merging this