kotcrab / ghidra-allegrex

Ghidra processor module adding support for the Allegrex CPU (PSP)
Apache License 2.0

Binary with debug symbols is oddly imported #36

Closed Nemoumbra closed 1 month ago

Nemoumbra commented 10 months ago

The game "Yu-Gi-Oh! Duel Monsters GX: Tag Force" (goes by ULJM-05151) was NOT stripped of the debug info before release (quite unusual, if you ask me).

I tried loading this in Ghidra, but it just didn't go well. The symbols were parsed correctly and I can see them in the functions view, but the code is just.... gone. It's the .text section, why is there data defined? Something's extremely odd with that.

image

The section that contains NIDs is totally busted... image

And one more, for good measure... image

Question... Who is here to blame, the Ghidra ELF loader or the plugin? Maybe it's worth reporting to the main app devs?

Nemoumbra commented 10 months ago

I've found another game with debug symbols for testing: Puzzle Bobble Pocket JP. Ghidra's misbehaving there too. And I've also noticed there are these error bookmarks:

image

kotcrab commented 10 months ago

I don't think this is the plugin's fault; the same happens when importing as normal MIPS. This can be fixed pretty easily by clearing the auto-defined arrays before doing the analysis, the biggest one being at 0883a898. See #9.

Nemoumbra commented 10 months ago

As you can see in the third screenshot, there are multiple purple labels at the address 08804000. The fact that there are mentions of .lib.stub and .lib.ent makes me think something's wrong with the way Ghidra applied the debug info. This can't be fixed by a simple "Ctrl+A, C (clear)" in the .text, can it? As a result, Ghidra decided to name the function there sceKernelRegisterSubIntrHandler. That's complete nonsense, and the only way to fix that would be to rely on PPSSPP's sym files. I don't trust them too much though, because they tend to compute the function bodies incorrectly.

So the question is... what should be fixed for the labels to be created where they're needed? I'm ready to report that to the Ghidra devs if we can come up with an explanation of what's going on.

kotcrab commented 10 months ago

It might also just be that those debug symbols are weird, given that the same happens when this file is imported as MIPS. It's either that or some Ghidra issue. What exactly is broken needs deeper investigation; I won't be looking into that, as nothing suggests this is a plugin issue.

Nemoumbra commented 10 months ago

The debug symbols being wrong is unlikely: IDA managed to load the files and apply the function names and labels correctly. This is likely a Ghidra issue, so I'm reporting this to the main repo.

Let's keep the issue open here for now.

Nemoumbra commented 6 months ago

I've got an update on the situation... Small recap: there were 2 issues: the data in the code sections and the labels. The related commits by ghidra1 have not been released to the public yet (milestone == 11.1).

I pulled the latest upstream for Ghidra (which was this at the time) to test if it works and I finally managed to build your plugin! This is what I uncovered:

  1. Ghidra stopped covering code with undefined arrays. Not just that, it's got an option now that can fully disable this feature.
  2. The Allegrex plugin still suffers from misplacing the labels.
  3. The MIPS loader properly places the labels for the debug info. Too bad we can't use it due to mismatching instruction sets + it only works if we don't rebase the image.

I think now it's no longer the base Ghidra's problem.

Nemoumbra commented 4 months ago

New info: Ghidra finally updated to 11.1. The relevant commits are included there so we can revisit this problem. Just a small note: ghidra1 asked you

Can you inspect an etype==0xffa0 binary which you believe is not relocatable and check the ElfSymbol values if they are relative or absolute.

and I didn't see you answer. I think they hardcoded the 0xffa0 value in the code (see this). If there are non-relocated binaries with 0xffa0, this may end up being slightly troublesome.
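To make the relative-versus-absolute question concrete, here is a minimal sketch of how a loader might turn an ELF symbol value into a label address. It assumes that 0xffa0 binaries store symbol values relative to the load address and that e_type == 0x2 binaries store absolute addresses; that is the hypothesis under discussion, not confirmed loader behaviour. The names `ET_PSP_PRX` and `label_address` are made up for illustration and are not taken from Ghidra or the plugin.

```python
ET_EXEC = 0x0002      # standard ELF executable type
ET_PSP_PRX = 0xFFA0   # hypothetical name for the e_type value discussed here


def label_address(e_type, st_value, image_base):
    """Return the address where a symbol's label should be placed.

    Assumption (not verified against Ghidra): symbol values in 0xffa0
    binaries are relative to the image base, everything else is absolute.
    """
    if e_type == ET_PSP_PRX:
        # Relocatable module: rebase the symbol, keeping a 32-bit address.
        return (image_base + st_value) & 0xFFFFFFFF
    # Non-relocated binary: the symbol value is already the final address.
    return st_value


# A relative symbol at 0x910c rebased to the usual PSP user base.
print(hex(label_address(ET_PSP_PRX, 0x910C, 0x08804000)))
# An absolute symbol is returned unchanged.
print(hex(label_address(ET_EXEC, 0x0880D10C, 0x08804000)))
```

If that assumption holds, labels that pile up at the image base would point to the base or the symbol value being dropped somewhere along the way, but that is only a guess.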

kotcrab commented 4 months ago

Do you know of any game which is not relocatable and has debug symbols? I'm not sure but I guess their question is only relevant in that context so I don't know the answer to it.

Nemoumbra commented 4 months ago

Yes, in fact, I do! But you know what's the trick? These games don't have the 0xffa0 for e_type. To be honest, all games built with the Metrowerks Codewarrior (MW MIPS C Compiler) seemingly are not relocated in this manner and so Ghidra always identifies the base address as 0x08804000 without me having to enter it manually (which I'm forced to do for 0xffa0 games).

Right, the debug symbols AND not relocated... Aces of War (EU). But I think they asked for a game with 0xffa0 that is not relocated.

kotcrab commented 4 months ago

I checked my games; the non-relocatable ones have e_type=0x2.

But I think they asked for a game with 0xffa0 that is not relocated.

Yep. Seems like their assumption is okay.

Ghidra always identifies the base address as 0x08804000

Yes, this is taken from the ELF's program header.
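For anyone who wants to check their own dumps, the two values discussed here (e_type and the load address from the program header) can be read straight from the file. This is a rough sketch that assumes a decrypted 32-bit little-endian ELF and only uses the standard ELF32 header layout; the function name `read_elf_info` is made up for illustration and is unrelated to how Ghidra or the plugin actually parse the file.

```python
import struct

PT_LOAD = 1  # program header type for a loadable segment


def read_elf_info(data):
    """Return (e_type, p_vaddr of the first PT_LOAD segment or None)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF")
    # ELF32 header: e_type at offset 16, e_phoff at 28,
    # e_phentsize at 42 and e_phnum at 44.
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<I", data, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 42)
    for i in range(e_phnum):
        # Each ELF32 program header starts with p_type, p_offset, p_vaddr.
        p_type, _p_offset, p_vaddr = struct.unpack_from(
            "<III", data, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD:
            return e_type, p_vaddr
    return e_type, None
```

Calling `read_elf_info(open(path, "rb").read())` on a non-relocated game should, if the observations in this thread hold, give an e_type of 0x2 and a load address of 0x08804000.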

Nemoumbra commented 2 months ago

I've found some free time to return to our problem. Our current upstream version for Ghidra is 11.1.2, but they have not added anything particularly relevant to the case since my last comment. I've conducted a few tests with my Ghidra 11.1.1 & Allegrex 11.1 and I'm ready to reveal the results. I used two games with debug symbols: Yu-Gi-Oh! Duel Monsters GX: Tag Force represented the relocated games (e_type == 0xffa0) and Aces of War represented the non-relocated games (e_type == 0x2).


Yu-Gi-Oh! Duel Monsters GX: Tag Force:

  1. Allegrex, apply undefined symbol data: bad labels, undefined data over code. ❌
  2. Allegrex, don't apply undefined symbol data: bad labels, no data over code, but also no data over the data symbols. ❌

The image base doesn't matter here.

Conclusion

The Allegrex plugin handles the labels incorrectly, which undermines the whole idea of using the debug symbols: further analysis is impossible unless we clear all the labels.

  3. MIPS, base = 0x08804000: the labels are correct, no undefined data over code, but Ghidra is unable to process the relocations => it's all SUB_0000910c and func_0x0000910c in the code. ❌
  4. MIPS, base = 0x0: the labels are correct, no undefined data over code, the calls work fine, but Ghidra doesn't understand the Allegrex instruction set + Ghidra mixes up the normal integer values and the pointers. ❌

Switching apply undefined symbol data on/off seemingly didn't affect the results so I'm starting to think it's broken for MIPS here.

Conclusion

The labels are placed like they should be, but there's a reason why we need a separate plugin for the Allegrex architecture. We can't use the default MIPS.


Aces of War (EU):

  1. Allegrex, base = 0x08804000, apply undefined symbol data: good labels, the data is not placed over the code, but it's placed over the actual data. ✔️
  2. Allegrex, base = 0x08804000, don't apply undefined symbol data: same as before, but the data is not placed over the data (just as we asked!). ✔️

Trying the base 0x0 or the MIPS loader is not required here.

Conclusion

The binaries that are not relocated are fully supported by the Allegrex plugin.

kotcrab commented 1 month ago

@Nemoumbra I synced Ghidra changes, can you retest with build from https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644246173?

kotcrab commented 1 month ago

Actually better if you can test with https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644441954 as I cleaned some old hacky stuff

Nemoumbra commented 1 month ago

Actually better if you can test with https://github.com/kotcrab/ghidra-allegrex/actions/runs/10644441954 as I cleaned some old hacky stuff

Yu-Gi-Oh! Duel Monsters GX: Tag Force:

  1. Allegrex, apply undefined symbol data: good labels, the data is not placed over the code, but it's placed over the actual data. ✔️
  2. Allegrex, don't apply undefined symbol data: same as before, but the data is not placed over the data (just as we asked!) ✔️

Trying the base 0x0 or the MIPS loader is not required here.


Aces of War (EU):

No regression detected

Conclusion

The latest commit fully fixes the issue.

kotcrab commented 1 month ago

Great, thanks for testing.