Closed 2over12 closed 5 months ago
Regardless, since we compute these flow hints, we should keep them consistent in pcode. We should test this against conditional control flow in anvill. Closes https://github.com/lifting-bits/remill/issues/694
I won't merge this yet since it sounds like it still needs testing. Feel free to merge when you're ready.
Hi, just saw this PR and gave it a quick spin:
Looks good so far; the double conditions I had are gone now too. Codegen is surprisingly quite a bit different in how it merges the tails now. It's a bit weird, but it still looks correct on a quick look.
Also, I have been meaning to take a look at anvill. Quick question here: does it work with the Sleigh-based backends in remill? What are the benefits of using anvill?
My use case will be emulator AOT recompilation for games, so I am not sure how much I will benefit from it.
A conditional call in libc seems to work on the anvill side:
define i8 @func949680basic_block949684_22(ptr %stack, i32 %program_counter, ptr noalias nocapture %memory, ptr noalias nocapture %D12, ptr noalias nocapture %D8, ptr noalias nocapture %R1, ptr noalias nocapture %D13, ptr noalias nocapture %R9, ptr noalias nocapture %D14, ptr noalias nocapture %D15, ptr noalias nocapture %D11, ptr noalias nocapture %LR, ptr noalias nocapture %D9, ptr noalias nocapture %R10, ptr noalias nocapture %R0, ptr noalias nocapture %R8, ptr noalias nocapture %R6, ptr noalias nocapture %R5, ptr noalias nocapture %R3, ptr noalias nocapture %R4, ptr noalias nocapture %R2, ptr noalias nocapture %D10, ptr noalias nocapture %R11, ptr noalias nocapture %R7) local_unnamed_addr #7 !__anvill_basic_block_uid_md !2 {
sleigh_remill_instruction_function_e7db8.exit:
  %0 = load i32, ptr %LR, align 4
  %1 = load i32, ptr %R11, align 4
  %2 = icmp ne i32 %1, 0
  %3 = icmp ne i32 %0, 0
  %narrow.not = select i1 %2, i1 %3, i1 false
  br i1 %narrow.not, label %func949680basic_block949684_22lowlift.exit, label %.critedge

.critedge:                                        ; preds = %sleigh_remill_instruction_function_e7db8.exit
  %4 = call i8 @sub_e732c__AvB_B_0()
  br label %func949680basic_block949684_22lowlift.exit

func949680basic_block949684_22lowlift.exit:       ; preds = %sleigh_remill_instruction_function_e7db8.exit, %.critedge
  %5 = tail call i8 @func949680basic_block949700_23(ptr %stack, i32 %program_counter, ptr nonnull %memory, ptr nonnull %D12, ptr nonnull %D8, ptr %R1, ptr nonnull %D13, ptr nonnull %R9, ptr nonnull %D14, ptr nonnull %D15, ptr nonnull %D11, ptr nonnull %LR, ptr nonnull %D9, ptr nonnull %R10, ptr %R0, ptr nonnull %R8, ptr nonnull %R6, ptr nonnull %R5, ptr %R3, ptr nonnull %R4, ptr %R2, ptr nonnull %D10, ptr nonnull %R11, ptr nonnull %R7)
  ret i8 %5
}
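For anyone skimming the IR, here is a rough C++ paraphrase of that block's control flow. It is only a sketch, not anvill output: `Regs`, `sub_e732c_stub`, and `lifted_block` are made-up names, and the stub just counts calls because the real callee's behavior isn't shown in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two registers the block reads.
struct Regs {
  std::uint32_t LR;
  std::uint32_t R11;
};

// Counts how often the stub callee runs, so the behavior can be observed.
static int g_calls = 0;

// Stand-in for @sub_e732c__AvB_B_0 (assumption: its real behavior is unknown here).
static std::uint8_t sub_e732c_stub() {
  ++g_calls;
  return 0;
}

// Mirrors the branch structure of the lifted block: the call is skipped only
// when both R11 and LR are non-zero (%narrow.not); otherwise it is made.
// Returns true if the call was made.
bool lifted_block(const Regs &regs) {
  const bool skip_call = (regs.R11 != 0) && (regs.LR != 0);
  if (!skip_call) {
    (void)sub_e732c_stub();  // corresponds to the .critedge path
  }
  // In the IR, control then tail-calls the next basic block function.
  return !skip_call;
}
```

So the call is made whenever either register is zero, and both paths then fall into the same tail call to the next basic block.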
Anvill does use the Sleigh remill backends. I don't think anvill is likely to be a good fit for your use case, because we cannot guarantee consistently correct recompilation inside of anvill. Anvill is trying to use brightening to produce simplified bitcode, similar to what you would see coming out of clang or from C written by a human. Since we are doing decompilation in anvill, there are fundamental limitations that mean it cannot always succeed or recompile.
I see, thanks. Further down the line I plan to use this in some decompilation projects too, but so far it's on the back burner. Might be interesting if it manages to yeet the state structs.
Btw, minor thing: for PPC I use custom Sleigh definitions, and the current ghidra-fork and src handling (with the generated patches) chokes if new definitions (files) are introduced that don't exist in the original Ghidra source tree.
Do you have any recommendation for working around this, or would you like me to create a separate issue?
I'd like to avoid having to keep syncing changes across two Ghidra codebases.
Anvill does not use branch taken for intraprocedural flows and instead just switches on PC. Unfortunately, we do rely on the btaken variable for a flow being lifted in the case of a conditional call, i.e. https://github.com/lifting-bits/anvill/blob/70209a8c3311cc97875605a137da210233fe9cd6/lib/Lifters/BasicBlockLifter.cpp#L232
Regardless, since we compute these flow hints, we should keep them consistent in pcode. We should test this against conditional control flow in anvill. Closes #694
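To make the distinction concrete, here is a toy C++ model of the two ways a consumer can pick a flow. It is an illustration under assumptions, not remill or anvill code: `StepResult`, `successor_by_pc`, and `should_call` are hypothetical names, and the real logic lives in the lifter linked above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result of executing one lifted instruction: the next PC plus
// a "branch taken" hint. Neither name comes from remill or anvill.
struct StepResult {
  std::uint64_t next_pc;
  bool branch_taken;
};

// Intraprocedural flow: the successor is chosen purely from the next PC,
// which plays the role of the switch on PC. The hint is never consulted.
std::string successor_by_pc(const StepResult &r, std::uint64_t target_pc,
                            std::uint64_t fallthrough_pc) {
  if (r.next_pc == target_pc) return "target";
  if (r.next_pc == fallthrough_pc) return "fallthrough";
  return "unknown";
}

// Conditional call: the hint alone decides whether the call happens, so a
// hint that disagrees with the computed PC would silently change behavior.
bool should_call(const StepResult &r) {
  return r.branch_taken;
}
```

With a consistent result such as `StepResult{0x2000, true}` and a call target of `0x2000`, both functions agree. If the hint were stale, `successor_by_pc` would still be right while `should_call` would not, which is why keeping the hint consistent matters even though most flows ignore it.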