ethteck / splat

A binary splitting tool to assist with decompilation and modding projects
MIT License

Disassemble TU splits into a single file rather than a file for each section #414

Open 1superchip opened 2 weeks ago

1superchip commented 2 weeks ago

When splat disassembles splits, it writes each section of an object to a separate file rather than putting the whole object in a single file. This is a problem when the linkage of objects is known and a local symbol in one section of a translation unit (TU) is referenced from another section of the same TU.

Example:

# tu0.text.s
.section .text

# Local function in tu0
fn myLocal, local
...
endfn myLocal

# tu0.bss.s
.section .bss

# Reference to the local function in tu0.
# This causes an undefined reference to symbol myLocal because it isn't defined in the same file
obj funcPointer, global
.4byte myLocal
endobj funcPointer

When each section of a TU gets its own file, symbol visibility errors like this occur because local symbols are not visible outside the file that defines them.
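The visibility problem can be modelled in a few lines of Python. This is only an illustrative sketch: `AsmFile` and `unresolved_local_refs` are hypothetical names, not part of splat, and each file is reduced to the sets of symbols it defines and references.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AsmFile:
    """Hypothetical model of one disassembled file (not a splat type)."""

    name: str
    local_defs: set[str] = field(default_factory=set)
    global_defs: set[str] = field(default_factory=set)
    refs: set[str] = field(default_factory=set)


def unresolved_local_refs(files: list[AsmFile]) -> list[tuple[str, str, str]]:
    """Return (referencing file, symbol, defining file) for every reference
    to a local symbol that is defined in a different file."""
    problems = []
    for user in files:
        for sym in sorted(user.refs):
            if sym in user.local_defs or sym in user.global_defs:
                continue  # resolved inside the same file
            for owner in files:
                if owner is not user and sym in owner.local_defs:
                    problems.append((user.name, sym, owner.name))
    return problems


# One file per section: the reference in .bss cannot see the local in .text.
split = [
    AsmFile("tu0.text.s", local_defs={"myLocal"}),
    AsmFile("tu0.bss.s", global_defs={"funcPointer"}, refs={"myLocal"}),
]

# One file per TU: definition and reference live in the same file.
merged = [
    AsmFile(
        "tu0.s",
        local_defs={"myLocal"},
        global_defs={"funcPointer"},
        refs={"myLocal"},
    ),
]

print(unresolved_local_refs(split))   # [('tu0.bss.s', 'myLocal', 'tu0.text.s')]
print(unresolved_local_refs(merged))  # []
```

With per-section files the reference from `tu0.bss.s` is reported. Once both sections share a file the list is empty.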

Disassembling a TU into a single file would look like this:

# tu0.s
.section .text

# Local function in tu0
fn myLocal, local
...
endfn myLocal

.section .bss

# Reference to local function in tu0
obj funcPointer, global
.4byte myLocal
endobj funcPointer
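As a rough sketch of the requested behaviour (not how splat is implemented), the per-section texts of a TU could be grouped and concatenated as below. `merge_tu_sections`, `SECTION_ORDER`, and the section ordering itself are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical emission order for sections inside a merged TU file.
SECTION_ORDER = [".text", ".data", ".rodata", ".bss"]


def _section_rank(section_name):
    """Sort known sections by SECTION_ORDER; unknown sections go last."""
    if section_name in SECTION_ORDER:
        return SECTION_ORDER.index(section_name)
    return len(SECTION_ORDER)


def merge_tu_sections(sections):
    """Group (tu_name, section_name, asm_text) triples into one text per TU.

    Returns a dict mapping the output file name to its full contents.
    """
    by_tu = defaultdict(dict)
    for tu_name, section_name, asm_text in sections:
        by_tu[tu_name][section_name] = asm_text.strip("\n")

    merged = {}
    for tu_name, parts in by_tu.items():
        ordered = sorted(parts, key=_section_rank)
        blocks = [f".section {name}\n\n{parts[name]}" for name in ordered]
        merged[f"{tu_name}.s"] = "\n\n".join(blocks) + "\n"
    return merged


sections = [
    ("tu0", ".bss", "obj funcPointer, global\n.4byte myLocal\nendobj funcPointer"),
    ("tu0", ".text", "fn myLocal, local\n...\nendfn myLocal"),
]
out = merge_tu_sections(sections)
print(out["tu0.s"])
```

Both sections end up in `tu0.s`, with `.text` first, so `myLocal` is defined in the same file that references it.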

AngheloAlf commented 2 days ago

How do we want to expose this functionality to the user? Maybe add a new segment type for TUs?

So if we currently have something like this:

# Each section is written to its own file
- [0x1000, asm, file_a]
- [0x1100, asm, file_b]
- [0x1400, data, file_a]
- [0x1800, data, file_b]
- [0x1B00, rodata, file_a]

Then with this new segment it would look like this:

# text, data and rodata will be migrated to `file_a` and `file_b` respectively.
- [0x1000, tuasm, file_a] # or `asmtu`, `tu`, etc
- [0x1100, tuasm, file_b]
- [0x1400, .data, file_a] # Note the `.data` instead of `data`
- [0x1800, .data, file_b]
- [0x1B00, .rodata, file_a]
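One way to read this proposal is sketched below in Python. It is not splat code: `group_tu_segments` is a hypothetical helper, `tuasm` is only the tentative type name from the list above, and treating the `tuasm` entry as the TU's `.text` section is an assumption.

```python
def group_tu_segments(entries):
    """Map each TU name to the (offset, section) pairs that belong to it.

    `entries` are (offset, segment_type, name) triples. A `tuasm` entry
    opens a TU; dot-prefixed entries with the same name are attached to it.
    """
    # Assumption: the `tuasm` entry itself carries the TU's .text section.
    tus = {
        name: [(offset, ".text")]
        for offset, seg_type, name in entries
        if seg_type == "tuasm"
    }
    for offset, seg_type, name in entries:
        if seg_type.startswith("."):
            if name not in tus:
                raise ValueError(
                    f"{seg_type} entry at {offset:#x} has no tuasm segment named {name!r}"
                )
            tus[name].append((offset, seg_type))
    return tus


entries = [
    (0x1000, "tuasm", "file_a"),
    (0x1100, "tuasm", "file_b"),
    (0x1400, ".data", "file_a"),
    (0x1800, ".data", "file_b"),
    (0x1B00, ".rodata", "file_a"),
]
tus = group_tu_segments(entries)
for name, parts in tus.items():
    print(name, [(hex(offset), section) for offset, section in parts])
```

Here `file_a` collects its text, data and rodata, and `file_b` its text and data, so each name would map to a single output file.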

I think this would match how `c` and `cpp` segments work.

What do you think?