RFC: Use Frida for the future of EEex

suy commented 3 years ago

Hello.

I'm Alex, "suy" or "disperso" on the internet and the like, and I have a proposal and request for comments about the future of EEex for anyone interested, especially now that the 2.6 release of the Infinity Engine games is known to have changed to 64-bit builds on Windows.

Some of the explanations are probably a bit redundant for Bubb, but I wrote them anyway in case I got something wrong that needs to be corrected, or in case someone wants to join the project and wants some context. I've seen him online on some discord servers, and we've talked a bit about things here and there.

But first, let me thank Bubb and the other contributors to EEex, and congratulate you on your incredible work! I'm a software developer, and used as I am to having the source of what I need to modify, I could not imagine that so much (and so quickly) could be accomplished with reverse engineering alone. The fact that EEex enables or implements so many features that I've wanted as a player, combined with my being unable to easily have them (as I rarely use a Windows computer), made me frustrated, but motivated enough to learn about reverse engineering. It took me a while, but I've learnt a few things in the last year or two, and I'm thankful for what I've learnt as well! It's even gonna be useful as part of my day-to-day job.

Now no more digression, let's propose.

2.6 needs an EEex rewrite

The 2.6 releases make the engine 64-bit only. Linux had 32-bit and 64-bit builds, but now it's only 64-bit. Windows replaced its 32-bit build with a 64-bit one. While I don't know enough about EEex, from what we've talked about I understand that it needs almost a full rewrite. The pattern matches used to locate the functions no longer work, and the code that extends them also has to be entirely different.

Enter Frida

My research into ways to do on Linux what you accomplished with EEex (hopefully in a cross-platform way, so I would not have to work alone) led me to several frameworks, but the obvious winner, hands down, has been Frida.

"It’s Greasemonkey for native apps" they say on the website, and they may very well be right.

It can attach to a running process, or inject itself at startup and defer the start of the main thread until it has installed the hooks and instrumentation that one decides on. I think the latter is the EEex approach. The project is very, very portable, and works on Windows, Mac, Linux, and even phones. Imagine EEex on Android and iOS? I think it should be possible. The 32/64-bit transition is not too relevant, as the code to write is portable across OSes and CPUs. The size of pointers is an issue, of course: 4 bytes on 32-bit CPUs and 8 on 64-bit ones.
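To make the pointer-size point concrete, here is a minimal sketch in plain JavaScript (it does not need Frida to run). The helper name `readPointerValue` and the sample bytes are made up for illustration. Inside a Frida script you would normally not write this yourself, since `NativePointer.readPointer()` reads a pointer of the process's own width and `Process.pointerSize` tells you what that width is.

```javascript
// Read a little-endian, pointer-sized unsigned value from a DataView.
// pointerSize is 4 for a 32-bit process and 8 for a 64-bit one.
function readPointerValue(view, offset, pointerSize) {
    if (pointerSize === 4) {
        return BigInt(view.getUint32(offset, true));
    }
    if (pointerSize === 8) {
        return view.getBigUint64(offset, true);
    }
    throw new RangeError("unsupported pointer size: " + pointerSize);
}

// Eight made-up bytes standing in for process memory.
const sampleBytes = new Uint8Array([0x78, 0x56, 0x34, 0x12, 0xf0, 0xde, 0xbc, 0x9a]);
const sampleView = new DataView(sampleBytes.buffer);

// The same bytes hold one 8-byte pointer, or two 4-byte pointers.
console.log(readPointerValue(sampleView, 0, 8).toString(16));
console.log(readPointerValue(sampleView, 0, 4).toString(16));
console.log(readPointerValue(sampleView, 4, 4).toString(16));
```

The practical consequence is that any hard-coded offset that comes after a pointer field shifts when moving from 32 to 64 bits.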

The core of the project is written in C, using some higher level library like glib and other dependencies. However, and while it's possible to use only C, the usual approach is to let Frida inject a JavaScript engine into the process to script the hooking, instrumentation, etc.

The default JS engine is QuickJS, a small JS engine that is still quite featureful and fast. It used to default to V8, which is even faster, but given that it uses JIT compilation, that's an issue on some mobile/embedded architectures. It's still an option if more speed is needed, as it won't matter on a desktop/laptop computer.

So, is JavaScript needed? Would EEex using Frida be written in JS? Well, not necessarily, or at least not that much. As I understand it, the recommended approach is to write a minimal amount of JavaScript that just performs the first step of the low-level instrumentation. It could intercept calls (to add logging, for example) or replace arguments and/or return values. But Frida establishes a bidirectional channel between the JS engine in the application being instrumented and a second process, which spawns the game (or attaches to it if it is already running). So the idea seems to be that one chooses a binding in one of the languages supported by frida-core. That binding can launch the game and do the heavy lifting of the logic. It can communicate with the script to tell it what to do, when needed, or if needed at all.
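As a rough idea of what the script side of that channel could look like, here is a small sketch. The `settings` object, the `"set"` message type and the `applyCommand` helper are all hypothetical names of mine. Only `recv()` and `send()` are Frida's own globals, and the guard lets the file load outside Frida too.

```javascript
// Agent-side state that the controlling process may change at runtime.
// (Hypothetical settings, for illustration only.)
const settings = { forcedVolume: 100, enabled: true };

// Apply one command message; returns true when the message was understood.
function applyCommand(target, message) {
    const known = Object.prototype.hasOwnProperty.call(target, message.key);
    if (message.type !== "set" || !known) {
        return false;
    }
    target[message.key] = message.value;
    return true;
}

// Only wire up the channel when running inside Frida,
// where recv() and send() exist as globals.
if (typeof recv === "function" && typeof send === "function") {
    const listen = () => recv("set", (message) => {
        const ok = applyCommand(settings, message);
        send({ type: "ack", key: message.key, ok: ok });
        listen(); // recv() delivers a single message, so re-arm it.
    });
    listen();
}
```

The controlling process (Python, Node.js, C++...) would then post messages such as `{ type: "set", key: "forcedVolume", value: 50 }` to the script and listen for the `ack` replies.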

The examples that one often sees for learning Frida are written in Python, which then injects a bit of JavaScript, but it's not the only option. I think it's probably a good option for a developer to start doing something, and maybe to share with other developers, but not necessarily something to distribute to non-developers to use unmodified. Frida has a bunch of helper programs, and those are written in Python; that's why it's a common "entry point" when starting to learn how to use Frida.

JavaScript only examples

Let me show some simple scripts that I've written these months. Some might be familiar. I launch the game via frida -f executable --no-pause -l script.js.

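// Resolve CSoundMixer::SetGlobalVolume(long) by its mangled name (Linux build).
const volumePointer = Module.findExportByName(null, "_ZN11CSoundMixer15SetGlobalVolumeEl")
Interceptor.attach(volumePointer, {
    onEnter: function(args) {
        // args[0] is the implicit "this"; args[1] is the requested volume.
        args[1] = ptr("0x64"); // 0x64 == 100
    },
});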

This is valuable as it is for me! I wanted to have the volume keep constant when I unfocus the window sometimes (e.g. recording the game), so this just replaces the argument to be always 100.

No need to have any external thing communicating with it. If we wanted to let the user configure it to do something only some of the time, we could use a setTimeout call to periodically read a configuration file.
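A minimal sketch of that polling idea follows. The `key=value` file format, the function names, and the idea of passing in a `readText` function are all assumptions of mine. How the file is actually read depends on the environment (Frida has its own file API), so the sketch leaves that part to the caller.

```javascript
// Parse "key=value" lines; blank lines and lines starting with '#' are ignored.
// (Hypothetical format, chosen only to keep the example short.)
function parseConfig(text) {
    const config = {};
    for (const rawLine of text.split("\n")) {
        const line = rawLine.trim();
        if (line === "" || line.startsWith("#")) continue;
        const separator = line.indexOf("=");
        if (separator < 0) continue;
        config[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
    }
    return config;
}

// Read the configuration now, then again every intervalMs milliseconds.
// readText: caller-supplied function returning the file contents as a string.
// schedule: defaults to setTimeout; replaceable to make the loop testable.
function pollConfig(readText, onConfig, intervalMs, schedule = setTimeout) {
    const tick = () => {
        try {
            onConfig(parseConfig(readText()));
        } catch (error) {
            // A missing or half-written file should not stop the polling loop.
        }
        schedule(tick, intervalMs);
    };
    tick();
}

// Usage idea (not executed here, as it would keep polling forever):
// pollConfig(readMyFile, (config) => { forcedVolume = Number(config.volume); }, 2000);
```

Re-arming setTimeout at the end of each tick, instead of using setInterval, means a slow read never overlaps with the next one.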

Another example:

const versionStringPointer = Module.findExportByName(null, "_ZN7CChitin16GetVersionStringEv")
// const versionString = new NativeFunction(versionStringPointer, 'pointer', []);
Interceptor.attach(versionStringPointer, {
    onLeave(retval) {
        console.log("[*] Intercepted by Frida");
        // The return value points to a string object whose first field
        // is a pointer to the heap-allocated characters.
        const heap = retval.readPointer();
        console.log("[*] retval ...=", heap.readCString());
        // Overwrites in place: the new text must fit in the existing buffer.
        heap.writeUtf8String("v2.5 Injected by Frida :)");
    }
});

Similar to the previous one, but this time it just writes to the heap-allocated string. This was the first successful hook that I wrote.

I have a much fancier script which I won't paste inline here, but has a more structured approach to reading data. It creates a combat log by intercepting the calls to CMessageHandler::AddMessage. I still don't understand everything about the CMessage types, but I did some simple tricks to find out the type and create objects in JS that represent the same type as in the engine's C++.

Here it is: https://gitlab.com/moebiusproject/proprietary-binary-modification/-/blob/0f3aee0c3aec77d20eb4580df87af5d5b1a62c95/frida/intercept-messagehandler-addmessage.js

All of these scripts are Linux only, but it will be trivial to make them work on Windows if we require the debug symbols to be around, as Frida can resolve functions that way as well. Or we could supply an address list or a matching mechanism like EEex does so far. I'll make those work on Windows ASAP, as I want to ask users to try the scripts to see whether they find installing Python on Windows a big deal.
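One way the "symbol name where available, address list otherwise" idea could be organised is sketched below. The table layout, the `resolveFunction` helper and the `backend` object are hypothetical, and the Windows offset is a placeholder, not a real address.

```javascript
// Hypothetical table: a symbol name where the binary exports one,
// an offset from the module base where it does not.
const knownFunctions = {
    "CSoundMixer::SetGlobalVolume": {
        linux: { symbol: "_ZN11CSoundMixer15SetGlobalVolumeEl" },
        windows: { offset: 0x1000 }, // placeholder value, NOT a real offset
    },
};

// Resolve one function for one platform. The backend does the real lookup,
// so this logic stays independent of Frida.
function resolveFunction(name, platform, table, backend) {
    const entry = table[name] && table[name][platform];
    if (!entry) return null;
    if (entry.symbol !== undefined) return backend.bySymbol(entry.symbol);
    if (entry.offset !== undefined) return backend.byOffset(entry.offset);
    return null;
}

// Inside a Frida script, a backend might look like this (untested assumption
// that the first enumerated module is the game executable):
//   const base = Process.enumerateModules()[0].base;
//   const backend = {
//       bySymbol: (symbol) => Module.findExportByName(null, symbol),
//       byOffset: (offset) => base.add(offset),
//   };
//   const target = resolveFunction("CSoundMixer::SetGlobalVolume",
//                                  Process.platform, knownFunctions, backend);
```

With something like this, the hooks themselves only ever mention the C++ name, and per-platform details live in a single table that could be generated or shared.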

Sprinkling C++

But I'm not sure if doing that much in the JavaScript side as in the previous script is the suggested way to do it by Frida developers. One simple way to move the logic to some other language is to replace functions in the native code with some other native code.

For example, the random number generator used in the engine is very poor. There is a famous talk calling that practice harmful. I don't think it's a big deal for a human player, but hey, let's be pedantic, and use a modern PRNG. Write this C++:

#include <random>

extern "C" long BetterRandInt(long value, long luck)
{
    static std::mt19937 engine = [](){
        std::random_device device;
        std::seed_seq seed{device(), device(), device(), device(),
                    device(), device(), device(), device()};
        return std::mt19937(seed);
    }();

    std::uniform_int_distribution<long> distribution(0, value-1);
    long result = distribution(engine);
    result += luck;
    if (result >= value)
        result = value - 1;
    return result;
}

Compile it to a library, then we can do this:

// Resolve the original address.
var originalPointer = Module.findExportByName(null, "_ZN5CUtil11UtilRandIntEll");
// If not loaded already, inject the library with our hooks in native code.
var eengex = Module.load("/path/to/libeengex.so.1.0.0");
// Find the address of the hook and swap implementations.
var replacementPointer = eengex.findExportByName("BetterRandInt");
Interceptor.replace(originalPointer, replacementPointer);

This replaced function is simple enough (only long as parameters or return value), so it could be implemented in C, C++ or even Rust. For fancier functions with C++ objects I think we should do it in C++. I've not done anything fancy yet, but I think it would be the obvious way to go.

TobEx/TobExEE probably operates similarly, as I see definitions like this: https://github.com/Ascension64/TobEx/blob/master/TobExEE/src/lib/infgame.h

Maybe we can grab some definitions from GemRB as well. I started making my own parsers for some Infinity Engine file structures, but that code is very incomplete (though it's thoroughly unit tested).

So all or most in C++ then?

I certainly would like to, but I've not yet wrapped my head around how to do it with respect to the Frida-specific parts. C++ is what I've used at work for many years, so I can help with that greatly. It's been a while since the last project in which I had to make an installer for Windows, or something like that, but I don't see that EEex so far has had too many issues with one: it seems that it just ships a library the same way that other mods distribute a setup-foo.exe. The user puts it in the directory, and it's fine.

Writing the hooks and the replacing logic in C++ would be easy once we know how it works and what we want to do, as I've shown above.

The part that I don't get so well is how to use the Frida APIs in C++. I've linked above some examples in C. But there is no real documentation that I've found. I have no trouble working with C, but this kind of API is not that simple: it often needs transactions, locks and unlocks, allocations and deletions, and I get chills at writing so much manual code without docs. There are 3 Frida-using projects which use that API in C++, so one can at least see working abstractions, which is sort of comparable to having documentation. Those are:

Mocxx. It's only for replacing functions, so it would not be complete. But it has a nice API. It's a single header and it's easy to read if you skip the metaprogramming magic done to do things at compile time. The use of the frida_* functions is what you need to look at.
gumpp. This is a bit of C++ code wrapping the "frida gum" code, which is the magic in C/ASM to hook functions and the like. I've not looked much at this.
frida-qml. This is an official Frida project written by Frida's author, so it's supposed to be as good as it gets. It's one of the "bindings" mentioned in the docs that frida-core interacts with (the others being Python, NodeJS, C#, Swift...).

One good thing about the last one: it's written using Qt. Qt is a C++ framework which helps a lot with many things. My job for the last decade has been more that of a Qt developer than a C++ developer, so it's something that I'm also fairly familiar with. It's good stuff.

Now the bad part: it's "forcing" on us a special Qt library that provides an engine for a language called QML which so far is heavily focused on UIs. That could be a good thing: I could make a very simple UI that launches the game and acts as a control panel for it. It could probably interact with the script while playing, so you could change settings live via the UI. But so far, it's bad because it also hides the C++ API. I'm fiddling with it, and I think we can modify the project to expose the C++ API as well and do just a non-GUI app if we want that. We maybe just need to copy the headers that we need, or just install them. Or create a C++ wrapper ourselves if we want this. The GUIs can be done without this QML engine and language (and I personally prefer it at this point and for our use case).

That's why I mentioned that I'm still wrapping my head around this. There are many options, and the straight C++ API with Qt should be easy to end up with and would be my favorite, but it's not there yet. I would have to do some work upfront, but maybe that's not an issue given that we don't have that much to release this month, right? :-D

I will definitely work on this last issue, and will probably submit patches or make an experiment, as I'm interested in this even outside Infinity Engine modding.

Final words and sorry for the wall of text :-)

As you can see, I've been looking into this quite heavily. Frida is very powerful and has tons of features. For those who have not done so yet, check the JavaScript API. It contains lots of useful things.

Since EEex is a lot of Lua so far, I would love to say that I can think of an obvious way to make things in Lua the way they were. I know of a great library called Sol2/Sol3 which has very nice Lua integration with C++. We can use that if you think it would lower the bar for future contributions. No experience with that myself though. If we can do it extending the built-in Lua engine that ships with the game, then that would be nice as well. I've never got how EEex works for real, so I can't comment on that too much.

Oh, the last thought. I think we can start toying with 2.5 instead of 2.6. We have the function names (Linux/Mac) or full debug info (Windows). Some things can't be fully smoothly done cross-platform as some code is inlined in one version and not on the other, for example. But we can start making things portable across CPUs with different address size.

Thanks if you read all of this. :-P

OlvynChuru commented 3 years ago

Well, I seriously hope we don't have to switch from Lua to another language. M__EEex.lua, the biggest Lua file in EEex, has over 7000 lines of code, and my own Lua file full of custom EEex functions I use in my mods is another 14000 lines of code. Changing all that code to a different language would be an enormous amount of work.

suy commented 3 years ago

Yes, I get this. I know it's a lot of code, and I would not want it to go to waste either. I can just provide advice if you ask me "how can we do this?" regarding this or that. Nothing that you can't do yourself by reading the docs and searching online; it's just that I've probably been toying with this for a bit longer.

The problem is that I don't see how to do some things in EEex. I don't know enough about how it works to comment on that. The initial idea of hooking the Lua functions already in the engine to do something else or something more is perfectly possible. But I don't get how the existing Lua code would fit if we want to make it work on all platforms/CPUs. Here is a self-contained example from Bubb that shows how to print the combat log:

function B3PrintMessage(CMessageDisplayText)
    local name = EEex_ReadString(EEex_ReadDword(CMessageDisplayText + 0xC))
    local text = EEex_ReadString(EEex_ReadDword(CMessageDisplayText + 0x10))
    print(name..": "..text)
end

EEex_Once("B3InstallMessageRedirect", function()
    EEex_DisableCodeProtection()
    EEex_HookRestore(EEex_Label("CMessageDisplayText::Run"), 0, 6, EEex_FlattenTable({
        {[[
            !push(ebx)
            !push(ecx)
        ]]},
        EEex_GenLuaCall("B3PrintMessage", {
            ["luaState"] = {[[
                !mov_ebx_[dword] *_g_lua
            ]]},
            ["args"] = {
                {"!push(ecx)"},
            },
        }),
        {[[
            @call_error
            !pop(ecx)
            !pop(ebx)
        ]]},
    }))
    EEex_EnableCodeProtection()
end)

Some comments.

The size of objects should be the same on Win/Mac/Linux if all are 64-bit, so I suppose that is not a big issue. But a bit of work needs to be done to support 32 and 64 bits (so just Windows, but 2.5 and 2.6 at once), because the size of pointers doubles. For this easy example, the offsets used to read the CMessage need to be fixed. Check out how I did it in my script. My approach (classes, inheritance) may be unnecessary for something so simple, but I was trying to think ahead to more complicated things.
The rest of the code I can't comment on, but it seems to me like it's assembly described in Lua. That certainly can't be easy to make cross-platform and cross-CPU. :-(
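To illustrate the first point, here is a sketch of how the 0xC and 0x10 offsets from Bubb's example could be derived instead of hard-coded. The field list is my guess (a vtable pointer, two 32-bit ids, then the two string pointers), chosen only because it reproduces the 32-bit offsets. The names `sourceId` and `targetId` are hypothetical, and the resulting 64-bit offsets would have to be checked against the real binary.

```javascript
// Compute field offsets for a struct described as a list of [name, kind].
// Assumption: natural alignment, i.e. each field is aligned to its own size.
function layoutOffsets(fields, pointerSize) {
    const sizes = { pointer: pointerSize, int32: 4 };
    const offsets = {};
    let offset = 0;
    for (const [name, kind] of fields) {
        const size = sizes[kind];
        // Round up to the next multiple of the field's size.
        offset = Math.ceil(offset / size) * size;
        offsets[name] = offset;
        offset += size;
    }
    return offsets;
}

// Guessed layout of CMessageDisplayText, for illustration only.
const messageDisplayTextFields = [
    ["vtable", "pointer"],
    ["sourceId", "int32"],
    ["targetId", "int32"],
    ["name", "pointer"],
    ["text", "pointer"],
];

const offsets32 = layoutOffsets(messageDisplayTextFields, 4);
const offsets64 = layoutOffsets(messageDisplayTextFields, 8);
console.log("32-bit:", offsets32.name.toString(16), offsets32.text.toString(16));
console.log("64-bit:", offsets64.name.toString(16), offsets64.text.toString(16));
```

In a Frida script the second argument would be `Process.pointerSize`, so the same description would serve both the 2.5 and the 2.6 builds, as long as the guessed field list is right.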

I've looked at other things, and much more can be saved in those. For example, the EXDAMAGE function for opcode 402. That probably just requires fixing the offsets for the EFF and the CRE, and fixing the functions used, which would be a shared effort.

One thing I can try to do is to go to the first EEex commit and look at the simplest possible EEex that just hooks into Lua and then just does... whatever it did in the first iteration. :) Or something simple like that. I can look into it if you want, but probably Bubb will be ahead of me in that. :-D

4Luke4 commented 3 years ago

"Imagine EEex on Android and iOS? I think it should be possible."

If that's indeed possible, then I'm definitely interested...

ahungry commented 2 years ago

I would love EEex on GNU/Linux. I wanted to try Bubb's spell menu, but alas, it's apparently Windows only.

suy commented 2 years ago

I was able to seduce Bubb with my praises of Frida into giving it a try, but there were some difficulties. The latest release of EEex features a new architecture, and if I am not mistaken, a good chunk of the new Lua code is portable (or can be). I've not been able to dig into it to see if I could help in starting some Linux/UNIX support, but I know that Bubb has put some effort into satisfying Linux users. :) So there is hope for the future (especially if we can help).
