klei1984 / max

M.A.X. Port
https://klei1984.github.io/max/
MIT License

[Appreciation] Amazing job! #8

Closed idubrov closed 2 years ago

idubrov commented 2 years ago

Hi!

First of all, amazing work! Really exciting project, I love this game, used to play it a lot (and in case your github username is your birth date -- love for this game might be a generation thing lol!).

As a matter of fact, I started a very similar project just a few days ago. Same game. Seems like exactly the same binary. Good thing I found your project so I don't duplicate the work (well, I guess now I need another game to reverse engineer).

This document is truly a gold mine. In my one-week attempt I used a rather crude approach where I would open the binary in Ghidra and just try to tag as many things as possible manually.

Analyzing functions like __wcpp_2_dtor_array_store__ was as exciting as you can guess (I was too lazy or too dumb to look for the matching compiler version lol). The C pseudocode given by Ghidra helps, though. Spent pretty much all of yesterday on it 😅

(I'm still kind of proud I was able to decipher logic behind __wcpp_2_dtor_array_store__ just from looking at the assembly / decompiled code; I called it vec_deallocate, but hey, close enough!).

Well, anyways, good luck, and, again, amazing work you've done so far!

klei1984 commented 2 years ago

Hello @idubrov , thanks for your kind words, I really appreciate it :blush:

I intend to continue that document, but currently I spend most of my time coding. I learnt a lot of Watcom C++ specific code patterns that I still want to document in that article. I actually set up an MS-DOS development environment with the original compiler to try out various C++ constructs and study what their disassembly looks like. One particular pattern I would highlight is the RTTI data layout and virtual tables in general. I am sure it would be useful for anyone who wants to work with MS-DOS games that were written in Watcom C++.

A couple of months ago, when I messed around with M.A.X. 2, I accidentally found out that its MS Visual C/C++ compiler emitted RTTI data including class names. Amazingly, M.A.X. 2 reused more than 112 classes from M.A.X. 1. The entire AI, building management and most GUI related stuff were copied and pasted from the first game, which gave a big boost to my work. Another good source of information was Fallout 2's official Mapper tool, which was released with full debug information included (by accident?). Yet another useful source was Fallout 1 & 2 themselves, as their Macintosh releases embedded full symbol tables for every function :laughing:

M.A.X. 1 & 2 and Fallout 1 & 2 use the same game engine, called GNW. So basically the work I did here is fully reusable for those other games as well. Interestingly Fallout 1 & 2 did not use any C++.

One interesting observation I made is that, at least for Watcom’s old compiler, reversing C++ is easier at the assembly level than at the pseudocode level. The reason I say this is that C++ does a lot of magic in the background which is part of the language and should not be reproduced in the rewritten C++ code itself.

As a basic example, the C++ standard says that it is fine to apply the delete operator to a null pointer, so no explicit null pointer check is needed. How does this work in practice? The Watcom compiler simply wraps every delete operator call in an implicit null pointer check. So what happens if the code author also adds an explicit check out of paranoia or ignorance? There will be two null checks, and both will be reproduced in the pseudocode as well. Constructor and destructor calls, returning classes by value and similar constructs are just super confusing at the pseudocode level, while at the assembly level the patterns can be identified with high clarity.
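To make the double null check concrete, here is a minimal sketch in modern C++. The `Unit` class and both function names are made up for illustration and are not taken from the M.A.X. sources. Both functions behave identically; the first one just carries the redundant explicit check that would show up a second time in the decompiled output.

```cpp
// Hypothetical class, used only to count destructor calls.
struct Unit {
    static int destroyed;
    ~Unit() { ++destroyed; }
};
int Unit::destroyed = 0;

// What a cautious author might write: an explicit check around delete.
// A compiler that guards delete internally would then test the pointer twice.
void release_with_check(Unit* unit) {
    if (unit != nullptr) {  // explicit check, redundant per the C++ standard
        delete unit;
    }
}

// What the rewritten code can look like: the language guarantees that
// deleting a null pointer does nothing.
void release(Unit* unit) {
    delete unit;
}
```

Calling either function with a null pointer is harmless, and calling either with a valid object runs the destructor exactly once, so only the second form needs to survive in a rewrite.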

If you go for another MS-DOS game, then I recommend using a similar approach to what I did for M.A.X. The ability to always have a "functional" version of the game, to be able to rewrite stuff incrementally and integrate the changes right away is incredibly useful. It's rather difficult when the game uses C++ though, as we cannot control the space allocated for C++ objects and this could break the ABI quite fast... so another recommendation I could give is to select a game that was not written in C++.
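One way to catch this kind of ABI drift early is to pin the size and field offsets of a rewritten type at compile time. The sketch below is only an assumption about how such a guard could look; `LegacyPoint` and its layout are hypothetical and not taken from the game.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical structure shared between original and rewritten code.
// The expected size and offsets would come from studying the disassembly.
struct LegacyPoint {
    std::int16_t x;
    std::int16_t y;
};

// Compilation fails if the rewritten type no longer matches the layout
// that the remaining original code expects.
static_assert(sizeof(LegacyPoint) == 4, "LegacyPoint must stay 4 bytes");
static_assert(offsetof(LegacyPoint, x) == 0, "x must be at offset 0");
static_assert(offsetof(LegacyPoint, y) == 2, "y must be at offset 2");
```

This works well for plain C structures. For classes with virtual functions or compiler-managed members the layout is decided by the compiler, which is exactly why incremental replacement gets harder with C++.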

Actually, when I started this project I did not know that most of the game was written in C++ :neutral_face: The old Watcom compiler emits a ridiculous amount of helper and wrapper functions for C++. M.A.X. 1 has 5704 functions and about one third of them are just auto-generated stuff to keep up with the language characteristics required by class templates. Most C++ classes are deeply intertwined too, so now I need to follow a big bang process where I cannot incrementally integrate stuff back into the game. First I need to implement a lot of dependent stuff and then integrate all of it back only when most of it is ready.
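A small sketch of why class templates inflate the function count: every instantiation gets its own copy of each member function, plus whatever helpers the compiler needs for array construction and destruction. `SmartArray`, `UnitInfo` and `Path` are hypothetical names chosen for illustration, and the code is modern C++ rather than something the old Watcom compiler would accept.

```cpp
#include <cstddef>

// Hypothetical container; not taken from the game.
template <typename T>
class SmartArray {
public:
    explicit SmartArray(std::size_t count)
        : data_(new T[count]()), count_(count) {}
    // delete[] has to run ~T() for every element, which compilers
    // typically implement with a runtime helper routine.
    ~SmartArray() { delete[] data_; }

    SmartArray(const SmartArray&) = delete;
    SmartArray& operator=(const SmartArray&) = delete;

    std::size_t size() const { return count_; }
    T& operator[](std::size_t index) { return data_[index]; }

private:
    T* data_;
    std::size_t count_;
};

// Two hypothetical element types with non-trivial destructors.
struct UnitInfo { int id = 0; ~UnitInfo() {} };
struct Path { int length = 0; ~Path() {} };

// Each explicit instantiation produces a separate constructor, destructor,
// size() and operator[] in the binary, even though the source is shared.
template class SmartArray<UnitInfo>;
template class SmartArray<Path>;
```

With dozens of element types, the same few lines of template source turn into hundreds of near-identical functions in the disassembly.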

I wish you good luck with your project too and thanks!