NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

arm64: cannot parse arm64 windows ntoskrnl.exe function names properly #4901

Closed limulidae closed 1 year ago

limulidae commented 1 year ago

Describe the bug After loading an arm64 ntoskrnl (for example, 22621.1105), many function names are not parsed properly. They are all named in the form FUN_xxxxxxxx.

To Reproduce Steps to reproduce the behavior:

  1. Download the Windows 11 22621.1105 arm64 ISO from the link
  2. Extract install.wim from the ISO
  3. Extract ntoskrnl.exe from install.wim
  4. Load ntoskrnl.exe in Ghidra

Expected behavior I should see the correct function names instead of FUN_xxxxxx.

dev747368 commented 1 year ago

Could you run GetMSDownloadLinkScript (a ghidra script) on your ntoskrnl.exe and post the link here instead?

limulidae commented 1 year ago

ntoskrnl_22621.1105.arm64.zip

limulidae commented 1 year ago

@dev747368, screenshots attached.

Function list, with entries named FUN_xxxxxx:

image

Also, the decompiler cannot decode the parameter types properly.

image

limulidae commented 1 year ago

Hi @dev747368 ,

I have attached the exe file and screenshots. Are you able to reproduce this on your side?

dev747368 commented 1 year ago

Since you are reporting an issue with a MS Windows binary, I would prefer getting the binary directly from MSFT's symbol server, which is what GetMSDownloadLinkScript would give you. You haven't yet shared the download link that it produces when you run the script and select the binary.

limulidae commented 1 year ago

Hi @dev747368

Got your point. Here is the link: https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/64C47E61103d000/ntoskrnl.exe
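
For readers following along: the link above has the shape of the usual symbol-server lookup key for PE files, which is the COFF TimeDateStamp (8 uppercase hex digits) followed by the optional header's SizeOfImage (lowercase hex, unpadded). Below is a minimal Python sketch of how such a link can be derived from the file's headers. It is not the source of GetMSDownloadLinkScript, whose internals are not shown in this thread; the function name ms_symbol_server_url is made up for illustration.

```python
import struct


def ms_symbol_server_url(pe_bytes: bytes, file_name: str) -> str:
    """Build a symbol-server download URL from raw PE header fields.

    Hypothetical helper for illustration only; assumes a well-formed PE file.
    """
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing DOS header signature")
    # e_lfanew (offset of the PE signature) lives at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header follows the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", pe_bytes, e_lfanew + 4 + 4)
    # The optional header starts 24 bytes after e_lfanew; SizeOfImage sits at
    # offset 56 inside it for both PE32 and PE32+.
    (size_of_image,) = struct.unpack_from("<I", pe_bytes, e_lfanew + 24 + 56)
    key = f"{timestamp:08X}{size_of_image:x}"
    return (
        "https://msdl.microsoft.com/download/symbols/"
        f"{file_name}/{key}/{file_name}"
    )
```

Read this way, the key 64C47E61103d000 in the link splits into a TimeDateStamp of 0x64C47E61 and a SizeOfImage of 0x103d000.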

ghost commented 1 year ago

Hi @dev747368, any news on this issue?

ghost commented 1 year ago

This is still happening in the latest 10.2.3 build.

dev747368 commented 1 year ago

@santino-wang-mtk, I've looked at the binary. I'm a little unsure as to what problem you are having.

When opening the binary, I'm seeing about 7500 functions, and about 3000 are named & exported functions. The unnamed functions are discovered via IMAGE_RUNTIME_FUNCTION_ENTRY structs in the .pdata. If I analyze the binary, the number of (mostly unnamed) functions grows quite a bit. If I apply a pdb to the binary, the number of named functions is quite large (20k'ish).
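
As a rough illustration of the .pdata discovery mentioned above: on ARM64 each IMAGE_RUNTIME_FUNCTION_ENTRY is 8 bytes, a 32-bit function start RVA followed by a 32-bit word whose low two bits select between an .xdata RVA (flag 0) and packed unwind data (flag non-zero, carrying an 11-bit function length in 4-byte units). A minimal Python sketch, assuming the caller has already extracted the raw exception-directory bytes; the names are illustrative and this is not Ghidra's implementation.

```python
import struct
from typing import List, NamedTuple, Optional


class RuntimeFunction(NamedTuple):
    begin_rva: int             # RVA of the function start
    flag: int                  # low 2 bits of the second word
    length: Optional[int]      # function length in bytes (packed form only)
    xdata_rva: Optional[int]   # RVA of the .xdata record (flag == 0 only)


def parse_arm64_pdata(pdata: bytes) -> List[RuntimeFunction]:
    """Decode ARM64 .pdata records into function start addresses."""
    entries = []
    usable = len(pdata) - len(pdata) % 8   # ignore a trailing partial record
    for offset in range(0, usable, 8):
        begin_rva, unwind = struct.unpack_from("<II", pdata, offset)
        flag = unwind & 0x3
        if flag == 0:
            # Second word is the RVA of a full .xdata unwind record.
            entries.append(RuntimeFunction(begin_rva, flag, None, unwind))
        else:
            # Packed unwind data: bits 2..12 hold the function length / 4.
            length = ((unwind >> 2) & 0x7FF) * 4
            entries.append(RuntimeFunction(begin_rva, flag, length, None))
    return entries
```

Each begin_rva found this way is a function start without a name attached, which is why functions discovered only through .pdata show up as FUN_xxxxxxxx until symbols are applied.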

What about this situation are you saying is a problem?

limulidae commented 1 year ago

@dev747368

Problem#1: function names are not parsed.

I can find the function in Windbg. I can find the function in IDA. But I cannot find the function in Ghidra.

for example, ntoskrnl!PpmIdleSelectStates

=== Windbg ===
0:000> x ntoskrnl!ppmidleselectstates
00000001`404d0540 ntoskrnl!PpmIdleSelectStates (PpmIdleSelectStates)

=== Ghidra ===
test1

limulidae commented 1 year ago

@dev747368

Problem#2: the decompiler doesn't recognize param_1, and param_1 is not used in the decompiled source.

FUN_1404d0540.txt

The decompiled C source is attached.

Its prototype is:

void FUN_1404d0540(undefined8 param_1,ulonglong param_2,undefined8 param_3,undefined8 param_4, char param_5,uint param_6,uint *param_7,uint **param_8)

param_1 does not appear in the decompiled source. It only exists in the function declaration. Actually, param_1 is the most important parameter and is widely used across this function. That means it is a bug in the decompiler.

dev747368 commented 1 year ago

First things first, can you verify that you are applying a pdb to your binary when doing analysis?

limulidae commented 1 year ago

@dev747368

Sorry, I made a stupid mistake. My bad. Problem#1 was resolved after I manually loaded a pdb. I have hidden the comment for Problem#1.

But Problem#2 is still not resolved. The 1st parameter is not used in the decompiled output: ghidra10_2_3__nt_22621_1105_PpmIdleSelectStates.txt

From the decompiled C, it looks like lVar8 is acting as Prcb (aka param_1), but the decompiler doesn't assign param_1 to lVar8.

param1

__security_push_cookie() doesn't change x0. Maybe this is the root cause?
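
One way to picture this hypothesis is with a toy Python model (not Ghidra code). Under the standard AArch64 procedure call standard, x0 through x17 are not preserved across a call, so an analysis that applies the default convention to __security_push_cookie has to assume x0 no longer holds param_1 afterwards. The two-instruction prologue, the register x19, and the per-callee override below are hypothetical, chosen only to show the effect; the only claim taken from this thread is that the helper leaves x0 unchanged.

```python
from typing import Dict, FrozenSet, List, Tuple

# AAPCS64: x0-x17 are not preserved across a call (arguments, result, temporaries).
AAPCS64_CALL_CLOBBERED: FrozenSet[str] = frozenset(f"x{i}" for i in range(18))


def trace_values(
    instrs: List[Tuple[str, ...]],
    clobbers_by_callee: Dict[str, FrozenSet[str]],
) -> Dict[str, str]:
    """Track which symbolic value each register holds through a prologue."""
    regs = {"x0": "param_1"}   # first integer argument arrives in x0
    for ins in instrs:
        if ins[0] == "mov":
            _, dst, src = ins
            regs[dst] = regs.get(src, f"unknown_{src}")
        elif ins[0] == "call":
            callee = ins[1]
            # Without an override, assume the default convention's clobber set.
            clobbered = clobbers_by_callee.get(callee, AAPCS64_CALL_CLOBBERED)
            for reg in clobbered:
                regs[reg] = f"unknown_after_{callee}"
    return regs


# Hypothetical prologue: call the cookie helper, then save x0 in x19.
PROLOGUE = [("call", "__security_push_cookie"), ("mov", "x19", "x0")]

default_view = trace_values(PROLOGUE, {})
override_view = trace_values(
    PROLOGUE,
    # Assumption from this thread: the helper does not change x0.
    {"__security_push_cookie": AAPCS64_CALL_CLOBBERED - {"x0"}},
)
```

In the default view x19 ends up holding an unknown value, which would match a decompilation where lVar8 is never assigned from param_1. In the override view x19 holds param_1. Whether this is what actually happens in the decompiler is not confirmed anywhere in this thread.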

dev747368 commented 1 year ago

I'm glad you got the pdb issue figured out. I'm going to close this ticket just so things don't get confusing.

I'm not sure if this second issue you have is a bug or just a normal RE issue that you will have to work through. If you can narrow the problem down to something specific and as small an example as possible, it would probably be best to start a new ticket with just that one issue.