NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
52.06k stars · 5.9k forks

Not analyzing the bytes below JMP RCX is a problem #5843

Open maskelihileci opened 1 year ago

maskelihileci commented 1 year ago

Describe the bug

As you can see in the image I provided, it's very strange that the opcodes below the jump are not being analyzed.

Expected behavior

The JMP RCX opcode is an indirect jump whose destination is computed at runtime. Naturally, not analyzing the bytes below it, as Ghidra does, is a mistake, because the jump lands on one of the instruction lines in that very section.

Environment (please complete the following information):

Screenshots

image

Wall-AF commented 1 year ago

You'll need to manually disassemble those bytes by pressing D at what you deem to be the next valid instruction start address, because static analysis cannot recognise those addresses as being reachable.

maskelihileci commented 1 year ago

I'm aware of manual analysis, but I'm designing an extension, and for that extension such functions need to be analyzed properly without manual intervention.

I also don't understand why analyzing these bytes is difficult when we know the start and end addresses. If there is a way for Ghidra to handle this, why impose a manual step on us users?

image

Similar situations exist in many places

emteere commented 1 year ago

It appears to be a computed switch statement that is not recovering correctly, which Ghidra does try to recover when all the pieces are in place for the analysis. Just disassembling the code, either by hand or automatically, is not enough: the code needs to flow correctly to those locations. Ghidra is conservative in its normal disassembly and tries to disassemble only things that are reachable in the code. The switch recovery also ties the switch value to the destination case code.

There is most likely something wrong with the data flow that the decompiler, which is used for switch table recovery, cannot follow. There can be many reasons for this. Not knowing anything about the binary you have, how it was loaded, etc., I'll suggest a few things:

Can you post the decompilation for one of the functions with the unrecovered jump table?

The switch looks like it has an initial lookup value table, which then indexes into another table of offsets from a base pointer. This type of doubly indirect switch table calculation can cause trouble in data flow analysis. You can try making the memory locations of the tables read-only; however, the lookup could be more complex than the decompiler will allow.
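To make the doubly indirect pattern concrete, here is a minimal sketch of the target computation being described: a first table maps the raw switch value to a case index, and a second table holds signed dword offsets from a base address. All addresses and table contents below are invented for illustration; they are not taken from the attached binary.

```python
import struct

BASE = 0x140171000  # hypothetical base address of the offset table

lookup = bytes([0, 0, 1, 1, 2])                     # raw switch value -> case index
offsets = struct.pack("<3i", 0x10c, 0x2f0, -0x40)   # signed dword offsets from BASE

def resolve(value: int) -> int:
    """Compute the jump target the way the compiled code would."""
    case = lookup[value]                                  # first indirection
    (off,) = struct.unpack_from("<i", offsets, case * 4)  # second indirection (sign-extended dword)
    return BASE + off                                     # final target = base + signed offset
```

The decompiler has to prove both table reads are constant-bounded loads from read-only memory before it can enumerate the targets; if either table sits in writable memory, the data flow analysis has to assume the values could change, which is one reason marking the tables read-only can help.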

To answer your question about why this is difficult when you know the start and end addresses: in general, code can have embedded data, so you can't assume everything in the code section is actually code.

pjsoberoi commented 1 year ago

What about using Auto Analysis -> Aggressive Instruction Finding? Would that help?

emteere commented 1 year ago

The Aggressive Instruction Finding analyzer could help automate the disassembly, depending on how clean the binary's code section is, what @maskelihileci is attempting to do, and how clean they need the resulting code flow to be. It can be slow and needs a rewrite, which is in progress for the ARM Aggressive Instruction Finder.

If the entire .text section is code with fillers, even just selecting the section and hitting 'D' can be useful. It all depends on the binary.

If decompilation of the functions is needed, then this type of unrecovered data flow is still an issue, and finding the root cause and fixing it would be best.

@maskelihileci if the decompiled output for the function is not recovering the flow, you can submit the .xml output from the debug decompiler menu action.

maskelihileci commented 1 year ago

It's best if I provide an example to help you understand me better. I'll also include the addresses of the error locations.

0x14017110c = JMP RCX
0x140188b56 = JMP LAB

This probably isn't hard to reproduce; any five random applications will likely show the same result.

x64.zip

bdemick commented 1 year ago

I'm running into a similar issue. I'm looking at an x86-64 binary, and this looks like a compiler optimization that I've never run into before:

# RDX is a simple index value
# JUMPTABLE_LOCATION is a jump table of dword-sized offsets relative to JUMPTABLE_LOCATION, which land at un-disassembled offsets in the function
# FUN_jmp_RDX is just that: a single JMP RDX instruction, shared by a large number of other functions
LEA        RSI,[JUMPTABLE_LOCATION]
MOVSXD     RDX,dword ptr [RSI + RDX*0x4]
ADD        RDX,RSI
JMP        FUN_jmp_RDX

There exists a FUN_jmp_* for every general-purpose register, so there is a lot of code in this particular binary that doesn't get properly disassembled into a switch statement. Some of these jump thunks get entered via a CALL as well, and the target function just adjusts things accordingly.

I can't come up with a great way to fix this in the general case. I can certainly patch the JMP sites with their appropriate register-based instruction, but the CALL sites will need some stack adjustment to decompile properly, I think. These thunks all initially disassemble as functions, so I guess something is prohibiting the decompiler from treating the thunk as part of the caller. I've yet to be able to trigger this optimization in a test binary, but if I can, I'll attach it.
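The target computation in the snippet above can be modeled directly: MOVSXD sign-extends a dword offset, and ADD rebases it on the table address, so negative offsets land before the table. A small sketch, with an invented table address and invented offsets (not from the actual binary):

```python
import struct

JUMPTABLE_LOCATION = 0x401000  # hypothetical address loaded by LEA RSI,[JUMPTABLE_LOCATION]

# Signed dword offsets, relative to JUMPTABLE_LOCATION itself
table = struct.pack("<4i", 0x40, 0x80, -0x20, 0x100)

def jump_target(index: int) -> int:
    # MOVSXD RDX, dword ptr [RSI + RDX*4]  -> sign-extended load
    (off,) = struct.unpack_from("<i", table, index * 4)
    # ADD RDX, RSI                         -> rebase on the table address
    return JUMPTABLE_LOCATION + off
```

Because the final `JMP RDX` lives in a shared FUN_jmp_* thunk rather than in the function that computed RDX, the decompiler sees a call to an opaque function instead of an indirect branch, which is plausibly what blocks the switch recovery here.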

maskelihileci commented 11 months ago

I'm looking forward to the new Ghidra release; hopefully, we'll see this issue fixed.

maskelihileci commented 11 months ago

I tried Ghidra 11.0; the problem is still not solved.

bdemick commented 8 months ago

I'm seeing a similar (related?) issue now with the kernel from Debian 12.5's release. I extracted the vmlinuz compressed kernel from the .iso, ran it through vmlinux-to-elf, and dropped it into Ghidra. Default analysis shows that a ton of functions (almost half?) call __fentry__, which is marked as no-return; it just jumps to __x86_return_thunk. I've tried removing the no-return attribute from __fentry__, setting it as inline, and re-analyzing, but this doesn't seem to have an effect.

Any suggestions for how to fix this? Do I need to make this change prior to any analysis?

Update: Re-importing the binary, then navigating to __fentry__, disassembling it, and marking it as inline (along with __x86_return_thunk), and then running auto-analysis seems to correct the issue. So it seems like something in the default analysis hits the __fentry__ call early on and marks it as a no-return thunk, which then propagates that to every function that calls it (40,000+).
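The propagation effect described above can be sketched as a toy model: during analysis, flow past a call to a no-return function is not followed, so every direct caller loses the code after the call site. The call graph and function names below are invented for illustration, not taken from the kernel image.

```python
# Toy model: one function wrongly flagged no-return truncates every caller.
call_graph = {
    "func_a": ["__fentry__", "helper"],
    "helper": ["__fentry__"],
    "__fentry__": ["__x86_return_thunk"],
    "__x86_return_thunk": [],
}

no_return = {"__fentry__"}  # the wrong assumption made by default analysis

def truncated_functions(graph, no_return):
    """Functions whose disassembly stops early: any function that calls a no-return function."""
    return {f for f, callees in graph.items()
            if any(callee in no_return for callee in callees)}
```

Since __fentry__ is called at the entry of nearly every instrumented kernel function, a single wrong no-return flag plausibly truncates tens of thousands of functions, which matches the 40,000+ figure above; that is also why clearing the flag only helps if it happens before (or is followed by) re-analysis of the callers.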

maskelihileci commented 8 months ago

> I'm seeing a similar (related?) issue now with the kernel from Debian 12.5's release. [...]

I recommend you create a separate topic for this.