Open Mc-muffin opened 1 year ago
I have just finished manually parsing the binary for Lego Harry Potter (Years 5-7). I managed to recover the following sections:
- `.lib.ent.top`, `.lib.ent` and `.lib.ent.btm`
- `.lib.stub.top`, `.lib.stub` and `.lib.stub.btm`
- `.rodata.sceModuleInfo`
- `.rodata.sceResident`
- `.rodata.sceNid`
Then I guessed the sections `.shstrtab` (the `e_shstrndx` value and the `sh_type` both matched) and `.symtab` (it's the only section with type SYMTAB, and the Allegrex plugin agrees with me here), but Ghidra is kind of choking on them. I guess the tool that was used to strip the section names actually ZeroMemory-ed the section and placed a zero at `sh_entsize`. That doesn't explain, though, why the address of `.symtab` is also red.
Here's a screenshot: the entsize there is fine.
Ghidra 10.3.1 thinks that the address is not in the program memory. It even threw an uncaught exception at me 3 times.
(SwingExceptionHandler) Error: Uncaught Exception!
NullPointerException - Cannot invoke "ghidra.program.model.address.AddressRange.getMinAddress()" because "rangeContaining" is null
java.lang.NullPointerException: Cannot invoke "ghidra.program.model.address.AddressRange.getMinAddress()" because "rangeContaining" is null
I don't know what caused this, but I'm considering reporting the uncaught exception if it happens on the latest Ghidra too.
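For reference, the guess about `.shstrtab` and `.symtab` can be sketched as a small check over the section header table. This is only a sketch under assumptions: `Shdr32` mirrors the standard `Elf32_Shdr` layout, the helper names are hypothetical, and the "suspicious" test simply compares `sh_entsize` against 16, the size of a standard `Elf32_Sym`.

```c
#include <stddef.h>
#include <stdint.h>

/* Section type values from the ELF specification. */
enum { SEC_SYMTAB = 2, SEC_STRTAB = 3 };

/* Same field order as Elf32_Shdr, using fixed-width integers. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Shdr32;

/* Returns 1 if the section selected by e_shstrndx is a string table. */
int looks_like_shstrtab(const Shdr32 *sh, size_t count, uint16_t e_shstrndx) {
    return e_shstrndx < count && sh[e_shstrndx].sh_type == SEC_STRTAB;
}

/* Returns the index of the only SYMTAB section, or -1 if there is
 * none or more than one (in which case the guess is not safe). */
int find_unique_symtab(const Shdr32 *sh, size_t count) {
    int found = -1;
    for (size_t i = 0; i < count; i++) {
        if (sh[i].sh_type != SEC_SYMTAB) continue;
        if (found >= 0) return -1;
        found = (int)i;
    }
    return found;
}

/* A symbol table whose sh_entsize is not sizeof(Elf32_Sym) == 16
 * (for example 0) may have been touched by a stripping tool. */
int symtab_entsize_suspicious(const Shdr32 *s) {
    return s->sh_type == SEC_SYMTAB && s->sh_entsize != 16;
}
```

A loader could use the last check to decide whether to trust `sh_entsize` or fall back to 16 before parsing symbols.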
The section `.sceStub.text` kind of doesn't exist, as the syscall stubs are present in sections [SECTION7; SECTION26], 20 sections in total. I suspect that's one section for each module that the game imports (there are 20 entries in `.lib.stub`). I've verified my hypothesis on a few modules and it seems to be the case here. I personally don't know how to call these ELF sections.
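The "one section per imported module" hypothesis can be checked mechanically. The sketch below assumes each import stub occupies 8 bytes (two MIPS instructions, which is how PPSSPP appears to step through stubs), so a `.lib.stub` entry with `numFuncs` functions would cover `numFuncs * 8` bytes starting at `firstSymAddr`. `Range`, `stub_range` and `section_for_stubs` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t addr; uint32_t size; } Range;

/* Assumption: one stub is two 4-byte MIPS instructions. */
enum { PSP_STUB_SIZE = 8 };

/* Address range covered by the stubs of one .lib.stub entry. */
Range stub_range(uint32_t first_sym_addr, uint16_t num_funcs) {
    Range r = { first_sym_addr, (uint32_t)num_funcs * PSP_STUB_SIZE };
    return r;
}

/* Returns 1 if inner lies completely inside outer. */
int range_contains(Range outer, Range inner) {
    if (inner.addr < outer.addr) return 0;
    uint32_t off = inner.addr - outer.addr;
    return off <= outer.size && inner.size <= outer.size - off;
}

/* Index of the section that fully contains the stubs, or -1. */
int section_for_stubs(const Range *sections, size_t n, Range stubs) {
    for (size_t i = 0; i < n; i++)
        if (range_contains(sections[i], stubs)) return (int)i;
    return -1;
}
```

If every `.lib.stub` entry maps to a different section index, the unnamed sections are very likely per-module stub sections.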
There's also a funny side-effect of the name stripping... Ghidra shows... (!!) 1491 (!!) strings in the Strings View and all of them are empty! They are also useless as I can't jump to any of them (nOt In PrOgRaM mEmOrY).
Now then, regarding the actual issue we're discussing...
1) The Allegrex plugin should probably bundle the required structs together with `Elf32_Ehdr`, `Elf_ProgramHeaderType_Allegrex` and the others. I propose the following definitions (these are based on PPSSPP's source code):
struct PspModuleInfo {
    ushort moduleAttrs;   /* 0x0000 User Mode, 0x1000 Kernel Mode */
    ushort moduleVersion;
    char name[28];        /* 28 bytes of module name, padded with zeros */
    void * gp;            /* ptr to MIPS GOT data (global offset table) */
    void * libent;        /* ptr to .lib.ent section */
    void * libentend;     /* ptr to end of .lib.ent section */
    void * libstub;       /* ptr to .lib.stub section */
    void * libstubend;    /* ptr to end of .lib.stub section */
};

struct PspLibStubEntry {
    char * name;
    ushort version;
    ushort flags;
    byte size;            /* The size of this struct in ints */
    byte numVars;         /* The number of imported variables */
    ushort numFuncs;      /* The number of imported functions */
    void * nidData;       /* The pointer to the nids in .rodata.sceNid */
    void * firstSymAddr;  /* The pointer to the first stub function in .sceStub.text */
};

struct PspLibEntEntry {
    char * name;          /* May be NULL */
    ushort version;
    ushort flags;
    byte size;            /* The size of this struct in ints */
    byte numVars;         /* The number of exported variables */
    ushort numFuncs;      /* The number of exported functions */
    void * sceResidentPtr; /* The pointer to the nids in .rodata.sceResident */
};
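When these layouts are parsed outside of Ghidra (where pointers for this target are 4 bytes), a host-side mirror needs fixed-width fields, because `void *` is usually 8 bytes on a 64-bit machine. The following is a minimal sketch with hypothetical `...32` type names; it replaces every pointer with `uint32_t` and lets the compiler confirm the sizes implied by the field lists above (52, 20 and 16 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side mirrors: pointer fields become uint32_t so the
 * layout does not depend on the machine that runs the parser. */
typedef struct {
    uint16_t moduleAttrs;
    uint16_t moduleVersion;
    char     name[28];
    uint32_t gp;
    uint32_t libent;
    uint32_t libentend;
    uint32_t libstub;
    uint32_t libstubend;
} PspModuleInfo32;

typedef struct {
    uint32_t name;
    uint16_t version;
    uint16_t flags;
    uint8_t  size;
    uint8_t  numVars;
    uint16_t numFuncs;
    uint32_t nidData;
    uint32_t firstSymAddr;
} PspLibStubEntry32;

typedef struct {
    uint32_t name;
    uint16_t version;
    uint16_t flags;
    uint8_t  size;
    uint8_t  numVars;
    uint16_t numFuncs;
    uint32_t sceResidentPtr;
} PspLibEntEntry32;

/* Compile-time checks: the build fails if any layout drifts. */
_Static_assert(sizeof(PspModuleInfo32) == 52, "module info should be 52 bytes");
_Static_assert(sizeof(PspLibStubEntry32) == 20, "stub entry should be 5 ints");
_Static_assert(sizeof(PspLibEntEntry32) == 16, "ent entry should be 4 ints");
```

The 20-byte and 16-byte sizes match a `size` field of 5 and 4 ints respectively, assuming the field lists above are complete.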
2) The script from https://github.com/pspdev/psp-ghidra-scripts doesn't work right now. I've examined it and I think I can rewrite it (in short, the author calls a method `subtract` on a scalar thinking it's an address, but, to be honest, the script certainly requires a small refactoring no matter what). I also don't know if the NID database is accurate: does it know all the funcs PPSSPP recognizes and vice versa?
3) In my opinion, the Allegrex plugin should eventually learn to resolve NIDs itself. I would love to see it even recognizing the exported functions and variables from `.lib.ent`. If you're interested, have a look at this (navigate to "For PSP") and the source code for PPSSPP. Here's my take on this:
Of course, I had to define:
struct SceModuleThreadParameter {
    SceUInt32 numParams;    /* The number of thread parameters */
    SceUInt32 initPriority; /* The initial priority of the entry thread */
    SceSize stackSize;      /* The stack size of the entry thread */
    SceUInt32 attr;         /* The attributes of the entry thread */
};
Technically, we can include the typedefs for these types too, but I wouldn't mind seeing just uints there.
Ok, I've just checked Loco Roco 2. This is what the sections are called there:
I guess that solves the problem from before:
I personally don't know how to call these ELF sections.
Lego Harry Potter is a different case. I think it's good to keep it here, but I'm just pointing that out. Lego Harry Potter has sections, but they are nameless; this issue was created for games where only segments are present (so you'd have 3 or 4 segments total instead of a bunch of unnamed sections).
I also like the idea of having this plugin resolve NIDs, but one thing at a time :P The resolve NIDs script is currently broken, but there's a PR that fixes it for the time being: pspdev/psp-ghidra-scripts#15
but one thing at a time
I completely agree with you! This is why I believe we should probably start with the games where the sections were not merged, then move to the harder cases like Danganronpa. Thank you for mentioning the PR that fixes the NID resolver script. I've tried it and everything works fine (other than the struct definitions, which I don't like). I guess I'll make a PR with updates once the current one is accepted.
A proper issue for the suggestion I filed in issue #28; I'll copy-paste it for convenience:

"Would also be cool if we could recover the NID related sections (`.lib.ent`, `.lib.stub`, `.rodata.sceModuleInfo`, `.rodata.sceResident` and `.rodata.sceNid`) so we can use the NIDresolver script on these programs too. Recovering any other section (say `.rodata`, `.ctors`, `.dtors`, `.eh_frame`, etc.) would be entirely optional."

Now, I gathered some info that can be useful for this task, so bear with me. The easiest section to recover would be `.rodata.sceModuleInfo`, because its address is set in segment 0's `p_paddr` field and it always has this structure:

With this info we can figure out where `.lib.ent` and `.lib.stub` are too: the field `exports` points to the start of `.lib.ent` and the field `exp_end` points to the end of it; similarly, the field `imports` points to the start of `.lib.stub` and `imp_end` points to the end of it.

Strictly speaking, both `.lib.ent` and `.lib.stub` are surrounded by small 4-byte sections that delimit the top and bottom. These marker sections append a `.top` or `.btm` to the name of the respective parent section, like so:

but I'm not sure the top and btm section recovery is really worth it.
Anyway, `.lib.ent` has the exports info and `.lib.stub` has the imports info; they are basically arrays of the following structs:

After those we are only missing `.rodata.sceResident` and `.rodata.sceNid`, which would need some parsing to get.

For `.rodata.sceNid` we need to parse `.lib.stub`: if we sum `var_count` and `func_count` for each element of `.lib.stub`, we get the size (in `int`s) of `.rodata.sceNid`, and its start is the lowest value of the field `nids`.
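The `.rodata.sceNid` rule above amounts to one pass over the `.lib.stub` entries. A minimal sketch, assuming the entries have already been decoded into a hypothetical `StubInfo` struct that keeps only the fields named in the text (`var_count`, `func_count`, `nids`):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-decoded view of one .lib.stub entry. */
typedef struct {
    uint8_t  var_count;
    uint16_t func_count;
    uint32_t nids;       /* address of this entry's first NID */
} StubInfo;

typedef struct { uint32_t addr; uint32_t size_bytes; } Range;

/* start = lowest nids value; size = sum(var_count + func_count) ints.
 * Returns 0 if there are no entries to derive the section from. */
int recover_sce_nid(const StubInfo *stubs, size_t n, Range *out) {
    if (n == 0) return 0;
    uint32_t lowest = stubs[0].nids;
    uint32_t total_ints = 0;
    for (size_t i = 0; i < n; i++) {
        if (stubs[i].nids < lowest) lowest = stubs[i].nids;
        total_ints += (uint32_t)stubs[i].var_count + stubs[i].func_count;
    }
    out->addr = lowest;
    out->size_bytes = total_ints * 4; /* one NID is one 32-bit int */
    return 1;
}
```

Note that the entries do not have to be sorted by address for this to work, which matters if segment order cannot be trusted.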
Lastly, `.rodata.sceResident`; this one has 2 parts:
- a `uint` followed by a null-terminated string/`char` array (padded with zeros so it is 4-byte aligned); the array item count is the same as the item count of the array for `.rodata.sceNid`
- the part described by `.lib.ent`: summing `var_count` and `func_count` and multiplying by 2, we get the size (in `int`s) of this part

The start of `.rodata.sceResident` would be the lowest address in the `name` field of `.lib.stub` minus 4.

Anyway, some other sections can probably be figured out too (like `.sceStub.text`), but I guess these are the more useful ones. Sorry for the long issue text :P I must also say that in the case of Danganronpa 2 the segments were kind of in order, but I guess we can't trust that to be the case for every game.
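The two `.rodata.sceResident` rules described above (start address from the `.lib.stub` names, size of the export part from `.lib.ent`) can be sketched the same way. The helper names are hypothetical, and the inputs are assumed to be already decoded from the two tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Start of .rodata.sceResident: lowest `name` address found in
 * .lib.stub, minus 4 (the uint that precedes the first string).
 * Returns 0 if no name is available. */
int resident_start(const uint32_t *stub_names, size_t n, uint32_t *out) {
    if (n == 0) return 0;
    uint32_t lowest = stub_names[0];
    for (size_t i = 1; i < n; i++)
        if (stub_names[i] < lowest) lowest = stub_names[i];
    if (lowest < 4) return 0;
    *out = lowest - 4;
    return 1;
}

/* Hypothetical decoded counts of one .lib.ent entry. */
typedef struct { uint8_t var_count; uint16_t func_count; } EntCounts;

/* Size in ints of the export part: 2 * sum(var_count + func_count),
 * i.e. one NID plus one address per exported symbol. */
uint32_t resident_export_ints(const EntCounts *ents, size_t n) {
    uint32_t symbols = 0;
    for (size_t i = 0; i < n; i++)
        symbols += (uint32_t)ents[i].var_count + ents[i].func_count;
    return symbols * 2;
}
```

The size of the string part is not covered here, since it depends on the length of each padded name.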