fonic / wcdatool

Watcom Disassembly Tool (wcdatool) - Tool to aid disassembling DOS applications created with the Watcom toolchain

Generated output in modules subfolder is corrupted/misaligned #16

Closed: Anub1sR0cks closed this issue 3 weeks ago

Anub1sR0cks commented 3 weeks ago

Hey there, thanks so much for developing this tool. It's been super helpful with reverse engineering an older game title that I've been trying to (slowly) port to modern platforms. However, I've always had an issue with the generated .cpp.asm output under the /modules directory. Nearly all the files either are missing information or contain code & subroutines/functions from other modules.

The game in question is Harvester, which is a late-era 32-bit DOS title from the mid-90s. It appears to have been compiled & linked with Watcom 10.x w/ a DOS extender baked in. The good news is that the executable contains debugging information with at least its original source file and function names intact, and maybe more (I'm not an expert on MZ layout). I'm not 100% sure why the output is mangled, but I've included the binary, log and output directory if you'd like to take a look: https://anubis.rocks/temp/HarvesterOutput.zip

fonic commented 3 weeks ago

Hi there, thanks for getting in touch. I'll gladly take a look. Always nice to discover more executables with debug info.

fonic commented 3 weeks ago

My bad. This is a regression that was introduced in v3.2 by the new deduplication feature. I actually already fixed this back in 11/2023, but haven't gotten around to releasing a new version yet (I had actually forgotten about those changes):

## Changelog for v3.3 release

- fixed regression regarding *deduplication of consecutive data lines* (added in v3.2) messing up disassembly split into separate files (i.e. reconstructed source files)
[...]

##

_Last updated: 11/02/23_

For now, here's a quickly assembled development package for you containing the fixed version (and already generated output for HARVEST.EXE, which seems to be in perfect order): wcdatool-v3.3-devel (issue #16).zip (fixed in release v3.3)

(sorry for the weird nested compression, only 7-Zip was able to stay within GitHub's 25MB limit and only .zip files may be uploaded on GitHub)
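
For readers wondering how a deduplication pass can break per-module output: the sketch below is a hypothetical illustration, not wcdatool's actual code. It assumes module boundaries are stored as line indices into the full listing; the function names, the sample listing and the `; repeated Nx` note are all made up. Under that assumption, deduplicating before splitting shifts every later boundary, which produces the symptoms reported above (files missing content or containing code from other modules).

```python
def dedup_consecutive(lines):
    """Collapse each run of identical consecutive lines into one annotated line."""
    out = []
    i = 0
    while i < len(lines):
        j = i
        while j + 1 < len(lines) and lines[j + 1] == lines[i]:
            j += 1
        count = j - i + 1
        out.append(lines[i] if count == 1 else f"{lines[i]}  ; repeated {count}x")
        i = j + 1
    return out


def split_by_index(lines, boundaries):
    """Split lines into modules using (name, start_index) pairs sorted by start."""
    modules = {}
    for k, (name, start) in enumerate(boundaries):
        end = boundaries[k + 1][1] if k + 1 < len(boundaries) else len(lines)
        modules[name] = lines[start:end]
    return modules


# Hypothetical listing: a.cpp owns lines 0..4, b.cpp owns lines 5..7.
listing = [
    "a_func:", "  db 0x00", "  db 0x00", "  db 0x00", "  ret",
    "b_func:", "  nop", "  ret",
]
boundaries = [("a.cpp", 0), ("b.cpp", 5)]

# Splitting the untouched listing gives each module its own code.
correct = split_by_index(listing, boundaries)

# Deduplicating first shortens the listing, so the stored indices go stale:
# a.cpp now swallows b_func and b.cpp is left with a fragment.
broken = split_by_index(dedup_consecutive(listing), boundaries)

# One possible remedy (an assumption, not necessarily the v3.3 fix):
# split first, then deduplicate within each module.
fixed = {name: dedup_consecutive(body) for name, body in correct.items()}
```

Whether v3.3 reorders the two steps or adjusts the boundaries instead is not stated in this thread.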

fonic commented 3 weeks ago

Let me know if that version works for you.

Should you continue your analysis and also happen to create object hints to further refine the output, feel free to create a PR for the hint file. I'd be happy to include it in the repo and future releases (as an example for new users).

Anub1sR0cks commented 3 weeks ago

Thanks, it's looking a lot better now. If I produce any hints I'll make a PR to upload them. Hopefully the executable's debugging info helps out the project too.

fonic commented 3 weeks ago

Great. In my experience it really does, as the debug info tremendously helps with navigating the sources and finding relevant parts (e.g. code related to asset loading/decoding).

It's also a lot of fun to just browse and look for interesting stuff, e.g. in harvest.cpp.asm, at label cheats:, you can see how they used keyboard scan codes to encode the cheat strings.
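
To read such strings, the scan codes need to be mapped back to characters. Below is a minimal sketch, assuming the data uses standard PC/XT (set 1) make codes for a US QWERTY layout; the sample bytes are made up for illustration and are not taken from harvest.cpp.asm, and `decode_scancodes` is a hypothetical helper.

```python
# Set 1 make codes run consecutively along each keyboard row,
# so each row is described by its first scan code and its characters.
SCANCODE_ROWS = {
    0x02: "1234567890",
    0x10: "qwertyuiop",
    0x1E: "asdfghjkl",
    0x2C: "zxcvbnm",
}

SCANCODE_TO_CHAR = {
    base + offset: char
    for base, row in SCANCODE_ROWS.items()
    for offset, char in enumerate(row)
}


def decode_scancodes(codes):
    """Translate a sequence of set 1 make codes into text ('?' if unknown)."""
    return "".join(SCANCODE_TO_CHAR.get(code, "?") for code in codes)


# Hypothetical sample bytes, as they might appear in a 'db' data line.
sample = bytes([0x23, 0x12, 0x26, 0x26, 0x18])
decoded = decode_scancodes(sample)
print(decoded)
```

If the game stores terminators, break codes or a different encoding alongside the make codes, the table would need to be extended accordingly.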

fonic commented 2 weeks ago

Fixed in release v3.3