samunders-core / le_disasm

libopcodes-based (AT&T syntax) linear executable (MZ/LE/LX DOS EXEs) disassembler modified from http://swars.vexillium.org/files/swdisasm-1.0.tar.bz2
GNU General Public License v3.0

Help with partial disassemble. #1

Closed Roflincopter closed 3 years ago

Roflincopter commented 5 years ago

Hi, I'm using this LE disassembler to try to disassemble a DOS/4GW-based project. I unbound the LE executable and ran it through the disassembler, but I ended up with a couple of warnings, which led to undefined references in the assembly code.

The main one being,

Warning: Tried to trace code at an unmapped address: 0x00000f

but similar warnings follow. I was wondering whether this happens with other projects you have tried to disassemble, and what you did to resolve it.

Could it be that DOS/4GW maps a function in lower memory? Or is this disassembled wrong?

In other cases the unmapped address is far outside the binary, which would suggest an error in the disassembly.
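To sort such warnings into "low memory" and "far outside the binary" cases, the traced address can be checked against the mapped regions. The following is only a minimal sketch: the `Region` type, the `find_region` helper and the example region values are hypothetical and not taken from le_disasm; the real start addresses and sizes would have to be copied from the tool's own output.

```python
from typing import Iterable, NamedTuple, Optional


class Region(NamedTuple):
    """One mapped range of the loaded image (hypothetical layout)."""
    name: str
    start: int  # virtual address of the first byte
    size: int   # length in bytes


def find_region(regions: Iterable[Region], address: int) -> Optional[Region]:
    """Return the region containing `address`, or None if it is unmapped."""
    for region in regions:
        if region.start <= address < region.start + region.size:
            return region
    return None


# Made-up example values, for illustration only.
example_regions = [Region("object 1", 0x10000, 0x8000)]
```

An address for which `find_region` returns `None` is one the disassembler cannot follow, whether it is small (like `0x00000f`) or far beyond the end of the image.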

samunders-core commented 5 years ago

Hello. Honestly, I don't know. Most likely it's a bug. I consider myself lucky that all the occurrences I came across ended up in functions I could skip over, as my binary had debugging info present.

klei1984 commented 5 years ago

Hello @Roflincopter, I have a mixed 16-bit/32-bit unbound linear executable in which the 16-bit code segment uses short near offset addressing for call, jmp and similar instructions. I am not entirely sure, but it seems to me that le_disasm tries to disassemble such 16-bit segments in 32-bit mode, which would simply be wrong.

In PR #10 there is a command line option to dump the flat loaded image of the unbound LE executable into a binary file that can be loaded by IDA 7.0 Freeware in binary mode. The region information that le_disasm prints to standard error lets you create the segments in IDA manually, where you can try setting the segment that contains your offending references to 16-bit. The flat image disassembled by the Freeware version of IDA uses the same virtual addresses that le_disasm uses in function names, labels and other log messages. This may help you identify whether le_disasm disassembles 16-bit code as 32-bit.
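To illustrate why decoding a 16-bit segment in 32-bit mode produces bogus targets, here is a minimal sketch that decodes the same near relative call (opcode `E8`) both ways. The function name `near_call_target` and the sample bytes are made up for this example and are not part of le_disasm; operand-size prefixes are ignored.

```python
import struct


def near_call_target(code: bytes, address: int, bits: int) -> int:
    """Compute the target of an E8 near relative call located at `address`.

    `bits` selects the assumed operand size of the code segment (16 or 32).
    """
    if code[0] != 0xE8:
        raise ValueError("not a near relative call")
    if bits == 16:
        # 16-bit code: 2-byte signed displacement, 3-byte instruction.
        (disp,) = struct.unpack_from("<h", code, 1)
        return (address + 3 + disp) & 0xFFFF
    if bits == 32:
        # 32-bit code: 4-byte signed displacement, 5-byte instruction.
        (disp,) = struct.unpack_from("<i", code, 1)
        return (address + 5 + disp) & 0xFFFFFFFF
    raise ValueError("bits must be 16 or 32")


# "call +0x0A" in 16-bit code, followed by two unrelated bytes (ret, nop).
sample = bytes([0xE8, 0x0A, 0x00, 0xC3, 0x90])
```

Decoded as 16-bit code at offset `0x10`, the call goes to `0x1D`. Decoded as 32-bit code, the two following bytes are swallowed into the displacement and the target becomes `0x90C3001F`, which is the kind of address that would show up far outside the binary.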

Recently I also had issues with misinterpreted displacement constants leading to undefined references to invalid CASE labels. For me, pull request #9 resolved these issues.

I also use wxHexEditor, which uses the same disassembler library as le_disasm. This way I could compare the disassembly listing of le_disasm against a reference. le_disasm post-processes certain opcodes and applies hacks to a number of others. Try to find and analyze the offending opcode via debugging.

Which game or program are you trying to analyze?

fonic commented 4 years ago

> Hello. Honestly, I don't know. Most likely it's a bug, I consider myself lucky all occurrences I came upon ended up in functions I could skip over as mine binary had debugging info present.

Would you mind telling me which binary you are working on? I'm looking for linear executables with debug symbols for a project of mine.

Roflincopter commented 3 years ago

My apologies for the very long radio silence. I had lost interest, but for some reason decided today to check this out once more.

I can say the current version (8052a7e1) does not have the issue described. I can bisect the commits if you are interested in which change fixed it for me. It was indeed a mixed 16/32-bit binary. I'm a bit hesitant to disclose which binary I'm disassembling because of copyright issues with regard to reverse engineering.

The disassembly now emits one "fdisi(8087 only)" line, which the assembler rejects but which is easy to fix by hand. If I understand correctly, this comes from a library and is not really under the control of le_disasm without resorting to some hack.
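For anyone who would rather script that manual fix, a minimal post-processing sketch follows. It only strips a trailing "(NNN only)" annotation from a mnemonic. The helper name `strip_cpu_annotations` is made up, and it is an assumption on my side that the assembler accepts the bare mnemonic afterwards; if it does not, the instruction would have to be emitted as raw bytes instead.

```python
import re

# Matches a mnemonic at the start of a line that is immediately followed
# by an annotation such as "(8087 only)".
_ANNOTATION = re.compile(r"^(\s*\w+)\(\d+ only\)", re.MULTILINE)


def strip_cpu_annotations(asm_text: str) -> str:
    """Remove "(NNN only)" annotations glued to instruction mnemonics."""
    return _ANNOTATION.sub(r"\1", asm_text)
```

Lines without such an annotation pass through unchanged, so the helper can be run over the whole generated assembly file.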