atom0s / Steamless

Steamless is a DRM remover of the SteamStub variants. The goal of Steamless is to make a single solution for unpacking all Steam DRM-packed files. Steamless aims to support as many games as possible.

Fun fact: what is SteamStub32Var31Header::Unknown0001 #118

Open TsXor opened 4 months ago

TsXor commented 4 months ago

Note: this does not help in decrypting, so it is only a "fun fact" ;)

I used Ghidra to analyze a wrapped program and found out what it is. In short, it is an offset to a block of steam-xor-encrypted string data. https://github.com/atom0s/Steamless/blob/cd770bf9749d3e4f438d23ac643917ad1a804257/Steamless.Unpacker.Variant31.x86/Classes/SteamStubHeader.cs#L43

The position of string data can be represented as below:

string_data_offset = image_entry_address - SteamStub32Var31Header::BindSectionOffset + SteamStub32Var31Header::Unknown0001
string_data_length = align(0x10, SteamStub32Var31Header::PayloadSize)
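The two formulas above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, `align` is assumed to round up to the next multiple of the alignment (the post does not say which direction), and all inputs are assumed to be in the same address space as the entry address.

```python
def align(alignment: int, value: int) -> int:
    """Round value up to the next multiple of alignment (assumed to be a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def string_data_span(entry_address: int, bind_section_offset: int,
                     unknown0001: int, payload_size: int) -> tuple:
    """Return (offset, length) of the encrypted string data, per the formulas above."""
    offset = entry_address - bind_section_offset + unknown0001
    length = align(0x10, payload_size)
    return offset, length


# Made-up header values, purely for demonstration.
offset, length = string_data_span(0x5000, 0x100, 0x40, 0x3F1)
print(hex(offset), hex(length))
```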

After obtaining the data, we do a self steam-xor, represented as this code:

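def steam_xor(arr):
    # arr is the string data viewed as a uint32 array; it is modified in place.
    # Walk from the last word down to the second one, XORing each word with
    # the (still encrypted) word that precedes it.
    for i in range(1, len(arr)):
        arr[-i] ^= arr[-i - 1]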

We also need to XOR the first 4 bytes of the string data with the original (pre-XOR) last 4 bytes of the 0xf0-byte header. Splitting the result on '\0' then yields many parts, but the number of strings is fixed at 34:

strings = strings_data.tobytes().split(b'\0')[:34]

So what do we get? Let's see...

[b'calloc',
 b'free',
 b'vsprintf',
 b'TerminateProcess',
 b'GetLastError',
 b'OpenEventA',
 b'OpenFileMappingA',
 b'MapViewOfFile',
 b'WaitForSingleObject',
 b'CreateEventA',
 b'UnmapViewOfFile',
 b'CloseHandle',
 b'GetCurrentProcessId',
 b'VirtualAlloc',
 b'VirtualFree',
 b'VirtualProtect',
 b'IsBadReadPtr',
 b'OutputDebugStringA',
 b'FreeLibrary',
 b'afterimports',
 b'kernel32.dll',
 b'msvcrt.dll',
 b'user32.dll',
 b'GetProcAddress',
 b'GetModuleHandleA',
 b'LoadLibraryA',
 b'MessageBoxA',
 b'Local\\SteamStart_SharedMemFile',
 b'Local\\SteamStart_SharedMemLock',
 b'Steam Error',
 b'Application load error X:XXXXXXXXXX',
 b"Payload routine failed with %u ('%c')\n",
 b'Unpack step %u\n',
 b'_steam@12']

It's composed of 5 parts:

Another fun fact is how these strings are used: they drive the wrapper's dynamic function loading. https://github.com/atom0s/Steamless/blob/cd770bf9749d3e4f438d23ac643917ad1a804257/Steamless.Unpacker.Variant31.x86/Classes/SteamStubHeader.cs#L69-L73

The wrapper checks the first 4 function pointers and uses the first non-null one to get the handle of kernel32.dll. The "kernel32.dll" string comes from the string data above, while the L"kernel32.dll" string is not encrypted. After getting the handle, it checks whether GetProcAddress is available in the header. If not, it manually reads the PE structure of kernel32.dll and gets the pointer to GetProcAddress from its export table. Then it ensures that GetModuleHandleA and LoadLibraryA are available and loads msvcrt.dll with LoadLibraryA; the "msvcrt.dll" string also comes from the string data above. Finally, it loads the C stdlib functions and the Win32 kernel32.dll functions listed above.

Wow, it is trying its best to avoid directly importing functions so that we cannot easily decompile it. I applaud its developers.
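The fallback order described above can be modelled in a few lines of Python. This is a toy model of the decision flow only, not the stub's actual code: the function names are hypothetical, the addresses are made up, and the manual export-table walk is represented by a callback instead of real PE parsing.

```python
from typing import Callable, Optional, Sequence


def first_non_null(pointers: Sequence[int]) -> Optional[int]:
    """Return the first non-zero entry, or None if every entry is zero."""
    return next((p for p in pointers if p != 0), None)


def resolve_get_proc_address(header_pointer: int,
                             manual_export_lookup: Callable[[], int]) -> int:
    """Prefer the pointer stored in the header; otherwise fall back to the export table."""
    if header_pointer != 0:
        return header_pointer
    return manual_export_lookup()


# Made-up values: the first two slots are empty, so the third is chosen.
module_slots = [0, 0, 0x76A01230, 0x76A04560]
print(hex(first_non_null(module_slots)))

# The header has no GetProcAddress pointer here, so the fallback is used.
print(hex(resolve_get_proc_address(0, lambda: 0x76A0BEEF)))
```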

atom0s commented 4 months ago

Hello there, most of the Unknown fields in each of the header structures are actually 'known' but have not been filled in because they come with caveats. I wrote most of the initial headers when I only had a small sample size of games to compare against, then over time found that there are several revisions of SteamStub that alter the header in various ways. Because of that, some of the unknown fields don't always represent the same thing on every title (and other parts of the header even shift around a bit), so they cannot reliably be given a single unified name. It's not too common to see since most games use the main revisions, but I've seen enough samples at this point that I'd rather not give a field a name that 'locks' in its purpose when it's not always that thing.

Over time I plan to rewrite Steamless to better handle all of the variants that exist and to better decide which variant is being used by the title it's being asked to unpack. But this is not a project I dedicate much time to currently, so it is on the back burner along with a developer mode I had started, which goes into a lot more detail about the file, the header/stub and other information that is useful for someone wanting to learn more about SteamStub and what it has done to the file.

There are other edge cases on some sub-variants that modify the header or populate it in a different manner as well. The most common part of the header affected by this kind of thing is the RVA handling. Some games will only use the ANSI variants of the APIs they need (i.e. LoadLibraryA, GetModuleHandleA, etc.) while other titles only use the Unicode variants. In some cases the header will not include entries at all for the unused type. There are also some instances where the RVAs won't be populated at all and instead the stub will do the manual export lookups by reimplementing GetProcAddress itself to pull the needed API calls.

The string table handling you mentioned is also known and already reversed, but it is not included in Steamless since it is not important to the unpacking process done by Steamless itself. I trimmed out extra nonsense that wasn't needed but left a few extra features (such as dumping the SteamDRMP.dll) for debugging purposes, for when a potentially broken revision of the stub needs manual review.

To expand on one of your comments regarding the string table as well:

SteamDrmp.dll main function name: _steam@12 (looks like __stdcall mangled). It is later called by the wrapper with the parameters (entry_address, pointer_to_header, 0xf0).

This function is the main function exported by the SteamDRMP.dll and is used to do several tasks related to the unpacking process (i.e. additional anti-debug/anti-tamper, AES decryption of the main code section, etc.). The name of this function, and the name mangling, are not guaranteed and can change between variants of SteamStub. The two most commonly used names are start and steam. (Studios can modify the DRM to change this if they desire, but in pretty much every case that doesn't happen.) The name does not matter and is simply used to look up the export; a lookup by ordinal works as well, since in every sample I've observed the function has been exported as ordinal 1.

The mangling, however, does vary and so does the function prototype in general depending on the SteamStub variant/revision being used. It is not always the same call in regards to arguments and such. The mangling is also optional and will depend on the variant and how it was compiled. For example, the following are all observed:

The mangling also shows that the compiling has changed over time in how the function is coded/exported.

You can browse around the web to find more information in regards to how the mangling works, but as an example:

?start@@YGKKK@Z is converted to:

unsigned long __stdcall start(unsigned long, unsigned long);
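To make the decoding concrete, here is a deliberately tiny decoder for this one shape of name (a free function with primitive argument types). It is a sketch, not a full demangler: `demangle_simple` is a hypothetical helper and the tables cover only a handful of the MSVC codes, in which `G` is the code for `__stdcall` and `K` is `unsigned long`.

```python
# Small subset of the MSVC decoration codes; anything else is rejected.
CALLING_CONVENTIONS = {"A": "__cdecl", "G": "__stdcall", "I": "__fastcall"}
PRIMITIVE_TYPES = {
    "X": "void", "D": "char", "H": "int",
    "I": "unsigned int", "J": "long", "K": "unsigned long",
}


def demangle_simple(symbol: str) -> str:
    """Decode names of the form ?name@@Y<convention><return><args>@Z."""
    if not symbol.startswith("?"):
        raise ValueError("not an MSVC C++ decorated name")
    name, separator, rest = symbol[1:].partition("@@Y")
    if not separator or len(rest) < 4:
        raise ValueError("only simple free functions are supported")
    try:
        convention = CALLING_CONVENTIONS[rest[0]]
        return_type = PRIMITIVE_TYPES[rest[1]]
        parameters = rest[2:]
        if parameters == "XZ":            # empty parameter list
            arguments = ["void"]
        elif parameters.endswith("@Z"):   # '@' closes the list, 'Z' ends the name
            arguments = [PRIMITIVE_TYPES[code] for code in parameters[:-2]]
        else:
            raise ValueError("unsupported parameter encoding")
    except KeyError as error:
        raise ValueError(f"unsupported code {error}") from None
    return f"{return_type} {convention} {name}({', '.join(arguments)})"


print(demangle_simple("?start@@YGKKK@Z"))
```

Real tooling (for example `undname` from the Visual Studio tools) handles the full grammar; the table-driven approach here is only meant to show how little information is needed for this particular export.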

For the C-style function _steam@12 the mangling is much less useful. It simply states that the function's arguments occupy 12 bytes in total. It does not explicitly give any other useful information, such as the number of arguments, their types, the return type, the calling convention, etc. This means additional manual reversing is needed to validate that information for each of the various variants.
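As a small illustration of how little the suffix carries, the sketch below extracts the name and byte count from a `_name@N` export. `parse_at_suffix` is a hypothetical helper; dividing by 4 assumes every argument takes one 32-bit stack slot, which the name itself cannot confirm.

```python
import re

# Matches C-style decorated names such as _steam@12.
_DECORATED = re.compile(r"^_(?P<name>[A-Za-z_]\w*)@(?P<size>\d+)$")


def parse_at_suffix(symbol: str):
    """Return (name, argument_bytes), or None if the symbol has a different shape."""
    match = _DECORATED.match(symbol)
    if match is None:
        return None
    return match["name"], int(match["size"])


name, argument_bytes = parse_at_suffix("_steam@12")
# 12 bytes could be three 4-byte arguments, which would fit the
# (entry_address, pointer_to_header, 0xf0) call quoted above, but the
# decoration alone cannot rule out other layouts.
print(name, argument_bytes, argument_bytes // 4)
```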

In most cases though, the function is the same across an entire variant of SteamStub and has only changed between actual variants, generally not between revisions within a variant, so the layout of the function stays the same and is easily determined.