Open maskelihileci opened 1 year ago
You'll need to manually disassemble those bytes by pressing d
on what you deem to be the next valid instruction start address because the static analysis cannot recognise those addresses as being used.
I'm aware of manual analysis, but I'm designing an extension, and for this extension, proper analysis of such functions should be done without the need for manual analysis.
I also don't understand why analyzing these bytes is difficult when we know the start and end addresses. If it is possible for Ghidra to fix this, why does it impose this static-analysis limitation on its users?
Similar situations exist in many places
It appears to be a computed switch statement that is not recovering correctly, which Ghidra does try to do if all the pieces are in place for the analysis. Just disassembling the code, either by hand or automatically, is not enough: the code needs to flow correctly to those locations. Ghidra is conservative in its normal disassembly and tries to disassemble only things that are reachable in the code. The switch recovery also ties the switch value to the destination case code.
There is most likely something wrong with the data flow that the decompiler, which is used for switch table recovery, cannot follow. There can be many reasons for this. Not knowing anything about the binary you have, how it was loaded, etc., I'll suggest a few things:
Can you post the decompilation for one of the functions with the unrecovered jump table?
The switch looks like it has an initial lookup value table, which then indexes into another table of offsets from a base pointer. This type of doubly indirect switch table calculation can cause trouble in data flow analysis. You can try making the memory locations of the table read-only; however, the lookup could be more complex than the decompiler will allow.
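The doubly indirect lookup described above can be sketched in plain Python. This is a toy model with made-up tables and addresses, not Ghidra's actual recovery logic: the switch value first indexes a small lookup table, whose result then indexes a table of signed 32-bit offsets applied to a base address.

```python
import struct

def resolve_case(switch_val, lookup_table, offset_table_bytes, base_addr):
    """Toy model of a doubly indirect switch: value -> case index -> offset."""
    case_index = lookup_table[switch_val]              # first indirection
    # second indirection: signed dword offset from the base
    off = struct.unpack_from("<i", offset_table_bytes, case_index * 4)[0]
    return base_addr + off                             # final jump target

# Made-up data: four switch values collapse onto two case bodies.
lookup = bytes([0, 1, 1, 0])
offsets = struct.pack("<2i", 0x40, -0x20)
print(hex(resolve_case(2, lookup, offsets, 0x140171000)))  # 0x140170fe0
```

It is exactly this chain of two dependent table reads that the decompiler's data flow has to follow end to end before it will mark the jump as a recovered switch.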
To answer your question about why this is difficult when you know the start and end addresses. In general code can have embedded data, so you can't assume everything in the code section is actually code.
What about using Auto Analysis -> Aggressive Instruction Finding? Would that help?
The Aggressive Instruction Finding Analyzer could help automate the disassembly, depending on how clean the binary's code section is, what @maskelihileci is attempting to do, and how clean they need the resulting code flow to be. It can be slow and needs a rewrite, which is in progress for the ARM Aggressive Instruction Finder.
If the entire .text section is code with fillers, even just selecting the section and hitting 'D' can be useful. It all depends on the binary.
If decompilation of the functions is needed, then this type of non-recovering data flow is still an issue, and getting to the root cause and trying to fix it would be best.
@maskelihileci if the decompiled output for the function is not recovering the flow, you can submit the .xml output from the debug decompiler menu action.
It's best if I provide an example to help you understand me better. I will also include the addresses of the error locations.
0x14017110c = JMP RCX
0x140188b56 = JMP LAB
This is probably not an unusual case; pick five random applications and you'll likely see the same result.
I'm running into a similar issue. I'm looking at an x86-64 binary, and this looks like a compiler optimization that I've never run into before:
```asm
# RDX is a simple index value
# JUMPTABLE_LOCATION is a jump table of dword-sized offsets relative to
# JUMPTABLE_LOCATION, which land at un-disassembled offsets in the function
# FUN_jmp_RDX is just that - a single JMP RDX instruction, shared by a
# large number of other functions
LEA    RSI,[JUMPTABLE_LOCATION]
MOVSXD RDX,dword ptr [RSI + RDX*0x4]
ADD    RDX,RSI
JMP    FUN_jmp_RDX
```
There exists a `FUN_jmp_*` for every general purpose register, so there is a lot of code in this particular binary that doesn't get properly disassembled into a switch statement. Some of these jump thunks get entered via a `CALL` as well, and the target function just adjusts things accordingly. I can't come up with a great way to fix this in the general case. I can certainly patch the `JMP` sites with their appropriate register-based instruction, but the `CALL` sites will need some stack adjustment to decompile properly, I think. These thunks all initially disassemble as functions, so I guess something is prohibiting the decompiler from treating the thunk as part of the caller. I've yet to be able to trigger this optimization in a test binary, but if I can, I'll attach it.
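The address arithmetic the analyzer has to recover from this pattern can be sketched in Python (the table contents and addresses below are made up for illustration): each dword entry is sign-extended, as `MOVSXD` does, and added to the table's own address, as `ADD RDX,RSI` does, to produce the case target.

```python
import struct

def switch_targets(table_addr, table_bytes):
    """Compute case targets from a table of dwords relative to the table itself."""
    n = len(table_bytes) // 4
    entries = struct.unpack("<%di" % n, table_bytes)  # MOVSXD: signed dwords
    return [table_addr + e for e in entries]          # ADD RDX,RSI

# Hypothetical table at 0x140188b00 with three cases.
raw = struct.pack("<3i", 0x56, 0x80, -0x100)
for t in switch_targets(0x140188b00, raw):
    print(hex(t))  # 0x140188b56, 0x140188b80, 0x140188a00
```

The extra hop through `FUN_jmp_RDX` is what breaks the recovery: the computed value flows out of the function before the indirect jump happens, so the decompiler no longer sees the table read and the jump in one body.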
I'm looking forward to the new Ghidra release; hopefully, we'll see this issue fixed.
I tried Ghidra 11.0; the problem is still not solved.
I'm seeing a similar (related?) issue now with the kernel from Debian 12.5's release. I extracted the `vmlinuz` compressed kernel from the .iso, ran it through `vmlinux-to-elf`, and dropped it into Ghidra. Default analysis shows a ton of functions (almost half?) calling `__fentry__`, which is marked as no-return; it just jumps to `__x86_return_thunk`. I've tried changing `__fentry__` to remove the no-return attribute, setting it as inline, and re-analyzing, but this doesn't seem to have an effect.
Any suggestions for how to fix this? Do I need to make this change prior to any analysis?
Update: Re-importing the binary, then navigating to `__fentry__`, disassembling, and marking it as inline (as well as `__x86_return_thunk`), and then running auto-analysis seems to correct the issue. So it seems like there's something in the default analysis that hits the `__fentry__` call early on and marks it as a no-return thunk, which then propagates that to every function that calls it (40,000+).
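The manual steps above could presumably also be scripted before re-running auto-analysis. Here is a rough, untested Ghidra (Jython) sketch using the flat API; the function names come from this binary, and I'm assuming both symbols already exist as functions that `getGlobalFunctions` can find by name:

```python
# Rough Ghidra (Jython) sketch: before auto-analysis, clear the bad
# no-return flag and mark the two thunks inline so call flow through
# __fentry__ is not cut off. Run from the Script Manager.
for name in ["__fentry__", "__x86_return_thunk"]:
    for func in getGlobalFunctions(name):  # flat API: look up functions by name
        func.setNoReturn(False)            # undo the no-return marking
        func.setInline(True)               # fold the thunk into its callers
```

After running it, re-running auto-analysis should behave like the manual fix described above.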
I recommend you create a separate topic for this.
Describe the bug
As you can see in the image I provided, it's very strange that the opcodes below it are not being analyzed.
Expected behavior
The JMP RCX opcode is an indirect jump whose destination is decided at run time. Not analyzing the bytes below it, as Ghidra currently does, is therefore a mistake, since the jump lands on one of the opcode lines underneath that section.
Environment (please complete the following information):
Screenshots