PrismaticFlower / swbf-unmunge

A totally groovy tool for getting access to game files for a game from 2005.
MIT License

Add support for Lua scripts #10

Open marth8880 opened 7 years ago

marth8880 commented 7 years ago

Currently scripts are simply extracted into munged .SCRIPT files. It would be great to be able to decompile these into raw Lua scripts.

styinx commented 2 years ago

Hey, I hope you guys are still partially active. I made an attempt to decode the binary contents of the scr_ block back into the Lua script. Unfortunately, I have not succeeded (yet), but I wanted to share my progress with you. Maybe you have an idea what the results could mean, or whether the scr_ block is similar to other blocks.

Here goes what I found out so far (using the contents of the *.script files). Let's call it BLOCK1:

"ucfb" <uint32>                 // block size
"scr_" <uint32>                 // block size
"INFO" <2x uint32>              // seem to be always 2 uint32 variables with the value of 0x1
"BODY" <uint32>                 // block size
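
For anyone who wants to poke at this themselves, here is a minimal Python sketch that walks the BLOCK1 layout exactly as listed above. It assumes little-endian uint32 values (not confirmed in this thread), and `read_header` is a hypothetical helper name, not part of swbf-unmunge. If a file carries tags other than the four listed, the function simply raises.

```python
import struct


def read_header(data: bytes) -> dict:
    """Read the four leading tags of BLOCK1 as described in the post."""
    offset = 0
    fields = {}

    def take_tag(expected: bytes) -> None:
        # Each tag is four ASCII bytes; fail loudly on anything unexpected.
        nonlocal offset
        tag = data[offset:offset + 4]
        if tag != expected:
            raise ValueError(f"expected {expected!r} at offset {offset}, found {tag!r}")
        offset += 4

    def take_u32() -> int:
        # Assumption: sizes are stored little-endian.
        nonlocal offset
        (value,) = struct.unpack_from("<I", data, offset)
        offset += 4
        return value

    take_tag(b"ucfb")
    fields["ucfb_size"] = take_u32()
    take_tag(b"scr_")
    fields["scr_size"] = take_u32()
    take_tag(b"INFO")
    fields["info"] = (take_u32(), take_u32())
    take_tag(b"BODY")
    fields["body_size"] = take_u32()
    fields["body_offset"] = offset  # where BLOCK2 would start
    return fields
```

The returned `body_offset` is the position right after the BODY size, which is where the 110-byte part described next would begin.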

So far, no surprises. Now comes a magic part. Let's call it BLOCK2:

<110 bytes>                     // contain 3 strings "=(none)", "ScriptInit", and "=(none)" with their respective
                                // sizes which seems to be the function definition 
                                // "=(none)" may refer to the function parameters and the return values ??? IDK ...

I assume that now comes a lookup table for the strings contained in the lua script. Let's call it BLOCK3:

<uint32>                        // number of unique! strings in the script (probably a lookup table)
<uint32 (SIZE_1)>               // size of the first string 
<SIZE_1 bytes>                  // the first string
<uint32 (SIZE_2)>               // size of the second string 
<SIZE_2 bytes>                  // the second string
...                             // and so on
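
A rough Python sketch of reading BLOCK3 as described above: a count followed by size-prefixed strings. It assumes little-endian sizes and that a stored string may end in a NUL byte that is counted in its size; both are guesses, and `read_string_table` is a hypothetical name.

```python
import struct


def read_string_table(data: bytes, offset: int = 0):
    """Return (strings, next_offset) for a count-prefixed string table."""
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    strings = []
    for _ in range(count):
        (size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        raw = data[offset:offset + size]
        if len(raw) != size:
            raise ValueError("string runs past the end of the data")
        offset += size
        # Assumption: drop a trailing NUL if the size includes one.
        strings.append(raw.rstrip(b"\x00").decode("latin-1"))
    return strings, offset
```

The second return value is the offset of whatever follows the table, so it can be fed straight into a reader for BLOCK4.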

Now comes a list of 4-byte numbers. I assume that these are floats defining the camera shots. However, the count is not a multiple of 7 (AddCameraShot takes 7 float parameters). Let's call it BLOCK4:

<uint32>                        // the number of 4 byte values that are following (I assume these are float values)
<4 byte number>                 // first 4 byte number
<4 byte number>                 // second 4 byte number
...                             // and so on 
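
Since it is not yet clear whether the BLOCK4 values are floats, one way to check is to print each 4-byte value both ways and see which reading looks plausible. A small sketch, again assuming little-endian data; `read_values` is a hypothetical name.

```python
import struct


def read_values(data: bytes, offset: int = 0):
    """Return ([(as_uint32, as_float32), ...], next_offset)."""
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    rows = []
    for _ in range(count):
        raw = data[offset:offset + 4]
        # Interpret the same four bytes as an integer and as a float.
        (as_uint,) = struct.unpack("<I", raw)
        (as_float,) = struct.unpack("<f", raw)
        rows.append((as_uint, as_float))
        offset += 4
    return rows, offset
```

Values that come out as round numbers in the float column (for example 1.5 or -65.0) but as huge integers in the other column are most likely floats.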

Now comes the rest. I assume that the actual function calls are encoded here. But I don't think that the number of parameters for each function is encoded here; I rather believe that this is encoded in the C code. Let's call it BLOCK5:

<4 bytes>                       // seem to be always 4 bytes with the value 0
<uint32>                        // the number of 4 byte values that are following (I assume these are uint32 values)
<4 byte number>                 // first 4 byte number 
<4 byte number>                 // second 4 byte number 
...                             // and so on

And lastly the end of the file: Let's call it BLOCK6:

<4 byte number>                 // seems to be always 3
<4 byte number>                 // seems to be always 3
<1 byte>                        // representing EOL

Further notes:

EDIT: I just noticed that there is a branch where you also attempted to decode the script. I will keep digging :)

BAD-AL commented 2 years ago

Getting some of the Lua to 'decompile' is possible, and I did write a tool that attempts this task (Lua 5.0.2). Likewise, phantom567459 had started one for Lua 4 that can decompile, or partially decompile, simple mission scripts.

There is this document that describes the bytecode.

And I gave this presentation on decompiling Lua last year (if you are interested).

It's a challenging problem and if you are interested I recommend that you decompile some scripts by hand and come up with your own ideas about how to decompile it programmatically (so that you do not pollute your mind with approaches that others have tried in the past).

styinx commented 2 years ago

I wasn't aware that this is actually Lua bytecode. That makes so much more sense now. I thought that the SWBF developers had created their own binary format.

I will definitely check out your talk and the documentation.

Thanks for the hints!

BAD-AL commented 2 years ago

If you're still new to SWBF2 modding, I suggest the following page as a starting point: https://github.com/Gametoast/Documentation/wiki

styinx commented 2 years ago

Hi again, I managed to extract the Lua bytecode (for version 4.0) and interpret the opcodes in the instructions. I haven't gotten around to building proper expressions from the instructions yet, but the current result for a script looks like this:

...
ReadDataFile,"GEO\geo1.lvl",()
SetDenseEnvironment,"false",()
SetMinFlyHeight,33554366,()
SetMaxFlyHeight,33554481,()
SetMaxPlayerFlyHeight,33554481,()
OpenAudioStream,"sound\geo.lvl","geocw_music",()
OpenAudioStream,"sound\cw.lvl","cw_vo",()
OpenAudioStream,"sound\cw.lvl","cw_tac_vo",()
OpenAudioStream,"sound\geo.lvl","geo1cw",()
OpenAudioStream,"sound\geo.lvl","geo1cw",()
SetBleedingVoiceOver,LOCAL,LOCAL,"rep_off_com_report_us_overwhelmed",33554432,()
SetBleedingVoiceOver,LOCAL,LOCAL,"rep_off_com_report_enemy_losing",33554432,()
SetBleedingVoiceOver,LOCAL,LOCAL,"cis_off_com_report_enemy_losing",33554432,()
SetBleedingVoiceOver,LOCAL,LOCAL,"cis_off_com_report_us_overwhelmed",33554432,()
SetOutOfBoundsVoiceOver,33554432,"repleaving",()
...

As you can see, some local variables are not resolved properly, and some numbers are converted to ints instead of floats. Next, I'll try to convert the opcodes into proper statements and expressions.
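
One possible explanation for the large integers in the listing, offered as a hypothesis rather than a confirmed fix: in the stock Lua 4.0 instruction layout, the opcode sits in the low 6 bits, the unsigned argument U in the remaining 26 bits, and the signed argument S is stored with a bias, so that S = U - MAXARG_S with MAXARG_S = 33554431. The sketch below assumes the game uses that stock layout; `decode` is a hypothetical helper.

```python
# Field widths as in stock Lua 4.0 (assumed to apply to these scripts).
SIZE_OP = 6
SIZE_B = 9
SIZE_U = 32 - SIZE_OP            # 26 bits
MAXARG_U = (1 << SIZE_U) - 1
MAXARG_S = MAXARG_U >> 1         # 33554431, the bias for signed arguments


def decode(instruction: int) -> dict:
    """Split one 32-bit instruction into its opcode and argument views."""
    u = instruction >> SIZE_OP
    return {
        "opcode": instruction & ((1 << SIZE_OP) - 1),
        "U": u,                                  # unsigned argument
        "S": u - MAXARG_S,                       # signed argument, bias removed
        "A": instruction >> (SIZE_OP + SIZE_B),  # upper 17 bits
        "B": u & ((1 << SIZE_B) - 1),            # lower 9 bits of U
    }
```

Under that assumption, the 33554366 after SetMinFlyHeight would read as -65, the 33554481 after SetMaxFlyHeight as 50, and 33554432 as 1, which look like plausible script values.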

I was wondering whether you would consider adding Lua as a dependency, either via vcpkg or as a submodule?