ReGlacier / ReHitmanTools

Tools for ReHitman project

In-game formats status #3

Open DronCode opened 4 years ago

DronCode commented 4 years ago

All formats list:

DronCode commented 3 years ago

PRP Format:

Header

Header. Flags field description

In most cases the value 0xD is used. In binary it is 0b1101.

| Bit nr | What it means |
|---|---|
| 0 | ??? |
| 1 | is save? |
| 2 | behaviour of TAG_StringArray, TAG_StringOrArray_E and TAG_StringOrArray_8E |
| 3 | is tokens table present |
| 4 | Not used |
| 5 | Not used |
| 6 | Not used |
| 7 | Not used |

So, in the usual case, the value 0xD means that bits 0, 2 and 3 are set and bit 1 is clear.

Bits 4 to 7 aren't used.

A small note: the game loads 4 bytes but actually uses only a few bits. Maybe I missed something, but it looks weird.
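
To illustrate the bit table, here is a minimal Python sketch that splits a flags value into its low bits. The dictionary labels are made-up names for the table rows (not identifiers from the game), and the meaning of bits 0 and 1 is still uncertain.

```python
from typing import Dict

# Labels are illustrative only; they paraphrase the rows of the bit table.
PRP_FLAG_BITS = {
    0: "unknown_bit0",
    1: "is_save",
    2: "string_array_behaviour",
    3: "has_tokens_table",
}


def decode_prp_flags(flags: int) -> Dict[str, bool]:
    """Return the state of each documented flag bit (bits 4 to 7 are unused)."""
    return {name: bool((flags >> bit) & 1) for bit, name in PRP_FLAG_BITS.items()}


usual = decode_prp_flags(0xD)  # 0xD == 0b1101
```

For 0xD this reports bits 0, 2 and 3 as set and bit 1 as clear.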

Tokens

    struct ZToken
    {
        static constexpr PRPToken Void    = 0x0FFFFFFFF;
        static constexpr PRPToken Unknown = 0x0FFFFFFFE;
        static constexpr PRPToken Joker   = 0x0FFFFFFFD;
    };

Tags

    enum PRP_ETag : uint8_t
    {
        TAG_Array               = 0x1,
        TAG_BeginObject         = 0x2,
        TAG_Reference           = 0x3,
        TAG_Container           = 0x4,
        TAG_Char                = 0x5,
        TAG_Bool                = 0x6,
        TAG_Int8                = 0x7,
        TAG_Int16               = 0x8,
        TAG_Int32               = 0x9,
        TAG_Float32             = 0xA,
        TAG_Float64             = 0xB,
        TAG_String              = 0xC,
        TAG_RawData             = 0xD,
        TAG_Bitfield            = 0x10,
        TAG_EndArray            = 0x7C,
        TAG_SkipMark            = 0x7D,
        TAG_EndObject           = 0x7E,
        TAG_EndOfStream         = 0x7F,
        TAG_NamedArray          = 0x81,
        TAG_BeginNamedObject    = 0x82,
        TAG_NamedReference      = 0x83,
        TAG_NamedContainer      = 0x84,
        TAG_NamedChar           = 0x85,
        TAG_NamedBool           = 0x86,
        TAG_NamedInt8           = 0x87,
        TAG_NamedInt16          = 0x88,
        TAG_NamedInt32          = 0x89,
        TAG_NamedFloat32        = 0x8A,
        TAG_NamedFloat64        = 0x8B,
        TAG_NamedString         = 0x8C,
        TAG_NamedRawData        = 0x8D,
        TAG_NameBitfield        = 0x8F,
        TAG_UNKNOWN, //14,16-123,128,142
        NO_TAG,

        /// My tags
        TAG_StringOrArray_E = 0xE,
        TAG_StringOrArray_8E = 0x8E,

        TAG_StringArray = 0xF,
    };
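
Looking at the listed values, most named tags equal the unnamed tag with bit 7 (0x80) set, e.g. TAG_String = 0xC and TAG_NamedString = 0x8C. The sketch below relies on that observation only; it is an assumption inferred from the enum, not something confirmed from the game code, and the listed TAG_Bitfield (0x10) / TAG_NameBitfield (0x8F) pair does not follow it. The helper names are hypothetical.

```python
# Assumption: bit 7 marks the "named" variant of a tag (inferred from the enum values).
NAMED_BIT = 0x80


def is_named_tag(tag: int) -> bool:
    """True when the tag byte has the assumed 'named' bit set."""
    return bool(tag & NAMED_BIT)


def unnamed_variant(tag: int) -> int:
    """Clear the assumed 'named' bit, e.g. 0x8C (TAG_NamedString) -> 0xC (TAG_String)."""
    return tag & ~NAMED_BIT & 0xFF
```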

PRP File Map

Each PRP file contains these sections:

Be aware that the order of entries in the "Properties" section is important for the game loader. The entries in GMS and their properties in PRP must be in the same order.

ZDefines

Each block in ZDefine has these sections:

There are 6 kinds of type id:

Properties and geoms

Each geom in the GMS file is stored in the same order as in the PRP. But geoms are represented as a hierarchical tree structure, and IOI holds child nodes in separate containers at the end of the current object, as members of the container. So, each object is represented as:

DronCode commented 3 years ago

GMS Format notes:

There are 2 kinds of GMS format:

Every GMS file starts with the same header:

Raw header

#pragma pack(push, 1)
struct SGMSHeader_t
{
    unsigned int iUncompressedSize { 0 };
    unsigned int iBufferSize { 0 };
    bool bIsUncompressed : 1 { false };
};
#pragma pack(pop)

| Offset | Size | Field | Description |
|---|---|---|---|
| 0x0 | 4 | iUncompressedSize | The size of uncompressed body |
| 0x4 | 4 | iBufferSize | The size of compressed body |
| 0x8 | 1 | bIsUncompressed | This flag means that body is uncompressed |

The total size of the structure is 9 bytes. #pragma pack(push, 1) is required. Note: the game treats the bIsUncompressed flag as an instruction to copy the raw body instead of decompressing it. This could be useful when we need to debug our binary structure in memory.
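
For reference, a minimal Python sketch of reading this 9-byte header. It assumes little-endian fields (as in the PoC script in this thread) and that the bIsUncompressed bit-field occupies the lowest bit of the byte at 0x8; the class and function names are made up for the example.

```python
import struct
from typing import NamedTuple

GMS_HEADER_FORMAT = "<IIB"  # two u32 fields and one flag byte, no padding
GMS_HEADER_SIZE = struct.calcsize(GMS_HEADER_FORMAT)  # 9 bytes


class GMSHeader(NamedTuple):
    uncompressed_size: int
    buffer_size: int
    is_uncompressed: bool


def parse_gms_header(data: bytes) -> GMSHeader:
    """Parse SGMSHeader_t from the start of a GMS file."""
    if len(data) < GMS_HEADER_SIZE:
        raise ValueError("GMS header needs at least 9 bytes")
    uncompressed_size, buffer_size, flags = struct.unpack_from(GMS_HEADER_FORMAT, data, 0)
    # Assumption: the 1-bit field is stored in bit 0 of the last byte.
    return GMSHeader(uncompressed_size, buffer_size, bool(flags & 1))
```

The body starts right after these 9 bytes; when is_uncompressed is set, it can be used as-is.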

Contents map (0x90 bytes)

| Offset | Name | Used in | Description |
|---|---|---|---|
| 0x0 | Entities | ZEngineDataBase::CreateGeoms | Table holds basic information about entities on scene |
| 0x4 | Expected 0 | | |
| 0x8 | Expected 0 | | |
| 0xC | Expected 4 | | |
| 0x10 | Geom stats | ZEngineDataBase::CountGeomsNr | Basic information about usage of entities on scene |
| 0x14 | Groups hierarchy info | ? | Some sort of group infos & clusterization |
| 0x18 | ? | ZEngineDataBase::CreateGeoms, CListUser::ConvertOffsetsToRefs | Information about lighting on scene (stored inside CListUser o_0) |
| 0x1C | Events data | ZEngineDataBase::CreateGeoms | Count of ZEventBase things on scene |
| 0x20 | Weird data, looks related to materials | ZGeomBuffer::InitResourceGeoms | ?(Pointer) |
| 0x24 | Unknown list of something | ZGeomBuffer::InitResourceGeoms | ?(Pointer) |
| 0x28 | ? | ? | ?(Pointer) |
| 0x2C | ? | ? | ?(Zeroed) |
| 0x30 | Materials | ZEngineDataBase::AllocSequence | ? |
| 0x34 | Path finder data | ZEngineDataBase::InitPathfinder4Data | Some kind of PF4 data |
| 0x38 | Physics data | ZEngineDataBase::AllocSequence | UNUSED; Pointer to physics data, always 0xFFFFFFFF |
| 0x3C | Base geoms pool size | ZEngineDataBase::CreateGeoms | Contains uint32_t value: how many base geoms should be allocated on stack |
| 0x40 | Weapon handles | ? | Array of u32 hashes of used weapons |
| 0x44 | Excluded animations list | ? | Length & array of animation names which were excluded from scene |
| 0x48 | Zeroed | ? | Idk |
| 0x4C | Zeroed | ? | Idk |
| 0x50 | Zeroed | ? | Idk |
| 0x54 | Zeroed | ? | Idk |
| 0x58 | Zeroed | ? | Idk |
| 0x5C | Zeroed | ? | Idk |
| 0x60 | Zeroed | ? | Idk |
| 0x64 | Zeroed | ? | Idk |
| 0x68 | Float FP32, usually 1.0 | ? | Idk |
| 0x6C | Zeroed | ? | Idk |
| 0x70 | Float FP32, usually 1.0 | ? | Idk |
| 0x74 | Zeroed | ? | Idk |
| 0x78 | Float FP32, usually 1.0 | ? | Idk |
| 0x7C | Zeroed | ? | Idk |
| 0x80 | Zeroed | ? | Idk |
| 0x84 | Zeroed | ? | Idk |
| 0x88 | Zeroed | ? | Idk |
| 0x8C | Zeroed | ? | Idk |

More details

Entities table (+0x0):

At +0x0 there is a U32 offset with the address of the entities header table:

struct EntityHeader {
    uint32_t entOffset;  //real offset from top header is pBase + (4 * (entOffset & 0xFFFFFF)); Also, this value contains depth level
    uint32_t alwaysZero; // ?
};

struct EntitiesHeader {
    uint32_t totalEntities;
    EntityHeader headers[totalEntities];
};

Each entity is a structure of 0x40 bytes:

   +0x0 - name (offset in BUF file)
   +0x4 - ?
   +0x8 - ?
   +0xC - primitive id
   +0x10 - ?
   +0x14 - type id
   +0x18 - ?
   +0x1C - control flags
   +0x20 - (offset in PRM file, don't know what it is)
   +0x24 - ?
   +0x28 - ?
   +0x2C - ?
   +0x30 - id
   +0x34 - ?
   +0x38 - ?
   +0x3C - ?
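
A minimal Python sketch of walking the entities header table described above. It assumes little-endian values and that the offset stored at +0x0 is relative to the start of the (uncompressed) GMS body, the same way the stats PoC in this thread treats +0x10. The function name is made up. The upper bits of entOffset (depth level and so on) are simply masked away here.

```python
import struct
from typing import List


def read_entity_offsets(gms_body: bytes) -> List[int]:
    """Return 4 * (entOffset & 0xFFFFFF) for every EntityHeader, relative to pBase."""
    (table_at,) = struct.unpack_from("<I", gms_body, 0x0)  # offset of EntitiesHeader
    (total_entities,) = struct.unpack_from("<I", gms_body, table_at)
    offsets = []
    for index in range(total_entities):
        # Each EntityHeader is 8 bytes: entOffset followed by a zeroed u32.
        ent_offset, _always_zero = struct.unpack_from(
            "<II", gms_body, table_at + 4 + index * 8
        )
        offsets.append(4 * (ent_offset & 0xFFFFFF))
    return offsets
```

Each returned value, added to pBase, should point at one 0x40-byte entity structure.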

Geom stats (+0x10)

This chunk stores basic information about geom usage stats.

It could be represented via this structure:

struct GeomStatsByType {
    u32 typeId;
    u32 entitiesCount;
    u32 unk8;
};

struct GeomStats {
    u32 statsCount;
    GeomStatsByType statsByType[statsCount];
};

In most cases, GeomStatsByType::unk8 is zeroed, but sometimes it contains some positive value. The game code subtracts this value from the total allocation size, so I guess that the value doesn't matter for us. Also, the game adds 0x112 at the end of ZEngineDataBase::CountNrGeoms. It looks like a hack (as always).

See ZEngineDataBase::CountNrGeoms (0x0045F2D0) for details.

Groups & clusters info (+0x14):

IN PROGRESS: This part contains information about groups & their members.

struct GroupClusterInfo
{
    u32 groupsCount;
    u32 clusterInfos[24][groupsCount];
};

Events data (+0x1C):

This stores the count of ZEventBuffer instances per level.

If the events count is < 0x40000, the game will allocate 0x40000 events.

For some "special" levels the game will allocate extra event buffers:

| Any level | M09 | M10 |
|---|---|---|
| 0x57800 | +0x53000 | +0x70800 |

Chunk 20 (+0x20):

Some weird block; the data looks almost like a list of pairs: some type and 2 values. For M13:

+0x00: 02 00 00 00 - length
+0x04: 42 4F 43 5A - name of block "BOCZ"
+0x08: 00 00 00 00 01 00 00 00 - two 4 byte values
+0x10: 4D 45 54 49 - name of block "METI"
+0x14: 01 00 00 00 12 00 00 00 - two 4 byte values

Also, this data block is aligned to 0x10; the last 00 00 00 00 sequence is just alignment padding.
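
A minimal Python sketch of parsing this block, fed with the M13 bytes from the dump. It assumes the leading "length" is the number of entries and that every entry is a 4-byte ASCII name followed by two little-endian u32 values; the function name is made up and the meaning of the values is still unknown.

```python
import struct
from typing import List, Tuple


def parse_chunk20(data: bytes) -> List[Tuple[str, int, int]]:
    """Read (name, value0, value1) entries; assumes 'length' is the entry count."""
    (count,) = struct.unpack_from("<I", data, 0)
    entries = []
    for index in range(count):
        name, value0, value1 = struct.unpack_from("<4sII", data, 4 + index * 0xC)
        entries.append((name.decode("ascii"), value0, value1))
    return entries


# Bytes of the M13 example, plus the trailing 4 bytes of alignment padding.
M13_SAMPLE = bytes.fromhex(
    "02000000"
    "424F435A" "00000000" "01000000"
    "4D455449" "01000000" "12000000"
    "00000000"
)

entries = parse_chunk20(M13_SAMPLE)
```

With the padding included the sample is 0x20 bytes long, which matches the 0x10 alignment.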

Chunk 24 (+0x24):

Dynamically sized array.

Each block is 0x8 bytes in size:

struct SChunk24
{
    uint32_t unk0;
    uint32_t unk4;
};

Aligned to 0x10.

Materials (+0x30):

| Offset | Type | Description |
|---|---|---|
| 0x0 | Pointer | ? |
| 0x4 | Pointer | ? |
| 0x8 | Pointer | Pointer to properties string table |
| 0xC | Int32 | ? |
| 0x10 | Pointer | ? |
| 0x14 | Pointer | ? |

TODO: Complete reverse

Weapon handles (+0x40):

At +0x40 there is a U32 offset with the address of the weapon handles list.

Excluded animations table (+0x44):

At +0x44 there is a U32 offset with an address in the BUF file. It contains a list of strings and their lengths:

+0x0 - length of string
+0x4 - the start of the string

Path Finder data (+0x34)

At +0x34 there is a U32 offset with the address of the PF4RunTime data. The next U32 is the size of the PF4 data buffer. After that, the following structure is stored:

struct SPF4DataBlock
{
    short field0;
    short field2;
    short field4;
    short field6; //x4 - allocate memory, x4 + 0x3E - third alloc
    short field8;
    short fieldA; //x2 - allocate memory
    short fieldC;
    short fieldE;
    short field10;
    short field12;
    short field14;
    short field16;
    short field18;
    short field1C;
    int field20;
};

The read implementation is here: .text:0045A930 ; int __thiscall ZEngineDataBase::InitPathfinder4Data(ZEngineDataBase *this, int data)

Physics data (+0x38, legacy):

At +0x38 there could be an offset to physics data. Usually it's 0xFFFFFFFF, to avoid loading physics data from GMS.

Entities pool size (+0x3C):

This section contains how many ZBaseGeoms should be allocated at runtime + 1

DronCode commented 2 years ago

A few tools for PRP:

Also, a few words about my 'custom' PRP tags:

TAG_StringOrArray_E = 0xE,
TAG_StringOrArray_8E = 0x8E,

I think that TAG_StringOrArray_E and TAG_StringOrArray_8E are actually TAG_Enum and TAG_NamedEnum, but I'm not sure that I'm right; this needs testing.

DronCode commented 2 years ago

GMS: Read entities stats data PoC script:


import struct
from typing import List


class GeomStat:
    """One GeomStatsByType record: type id, entities count and the unknown field."""

    def __init__(self, type_id: int, count: int, unk8: int):
        self._type_id = type_id
        self._count = count
        self._unk8 = unk8

    @property
    def type_id(self) -> int:
        return self._type_id

    @property
    def count(self) -> int:
        return self._count

    @property
    def unk8(self) -> int:
        return self._unk8


class GeomStats:
    """The GeomStats chunk: statsCount followed by statsCount records of 0xC bytes."""

    def __init__(self, stats_data: bytes):
        self._stats: List[GeomStat] = []

        stats_count: int = struct.unpack_from('<i', stats_data, 0)[0]

        # Read every record; record N starts at 4 + N * 0xC.
        for stat_idx in range(stats_count):
            type_id, count, unk8 = struct.unpack_from('<iIi', stats_data, 4 + stat_idx * 0xC)
            self._stats.append(GeomStat(type_id, count, unk8))

    @property
    def stats(self) -> List[GeomStat]:
        return self._stats


def prepare_gms_body(gms_body: bytes) -> GeomStats:
    # The value at +0x10 of the contents map is the offset of the geom stats chunk.
    stats_begin_at: int = struct.unpack_from('<i', gms_body, 0x10)[0]
    return GeomStats(gms_body[stats_begin_at:])  # All stats here

DronCode commented 2 years ago

PRM File Format:

This file contains geometry, lighting, bones, textures and material linking for models (primitives, as "Glacier" calls them). A PRM file contains a few sections:

Header

This structure contains basic information about the contents of the PRM file.

| Name | Offset | Size | Description |
|---|---|---|---|
| Descriptors Offset | 0x0 | 0x4 (u32) | Where the descriptors list is stored |
| Primitives count | 0x4 | 0x4 (u32) | How many primitives (without their variations) are stored in PRM |
| Descriptors Offset 2 | 0x8 | 0x4 (u32) | Usually it's the same as "Descriptors Offset" |
| Zeroed | 0xC | 0x4 (u32) | Zero |

Descriptor

This structure contains information about a primitive chunk (where it's stored and how much memory is used for it)

| Name | Offset | Size | Description |
|---|---|---|---|
| Declaration offset | 0x0 | 0x4 (u32) | Offset (from beginning of the file) where descriptor is stored |
| Declaration size | 0x4 | 0x4 (u32) | Size of chunk, allocated for descriptor |
| Should be loaded | 0x8 | 0x4 (u32, but it's bool) | If this value is false, the primitive will not be loaded in game and will be ignored in the primitives table. It's useful for the primitives injection technique |
| Unknown C | 0xC | 0x4 (u32) | Really don't know what it is, in most cases zeroed |
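
A minimal Python sketch of reading the PRM header and one descriptor, based only on the two tables above. It assumes little-endian u32 fields and that "Descriptors Offset" is counted from the beginning of the file; the class and function names are made up, and how many descriptors follow each other is not covered here.

```python
import struct
from typing import NamedTuple


class PRMHeader(NamedTuple):
    descriptors_offset: int
    primitives_count: int
    descriptors_offset2: int
    zeroed: int


class PRMDescriptor(NamedTuple):
    declaration_offset: int
    declaration_size: int
    should_be_loaded: bool
    unknown_c: int


def parse_prm_header(data: bytes) -> PRMHeader:
    """Read the 0x10-byte header at the start of the file."""
    return PRMHeader(*struct.unpack_from("<4I", data, 0))


def parse_prm_descriptor(data: bytes, offset: int) -> PRMDescriptor:
    """Read one 0x10-byte descriptor at the given file offset."""
    decl_offset, decl_size, load, unknown_c = struct.unpack_from("<4I", data, offset)
    # "Should be loaded" is stored as a u32 but used as a bool.
    return PRMDescriptor(decl_offset, decl_size, load != 0, unknown_c)
```

For example, parse_prm_descriptor(data, parse_prm_header(data).descriptors_offset) reads the first descriptor under these assumptions.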

PrimitiveChunk

This is the hardest thing in the PRM format. This structure has a dynamic size, which is defined by its kind. Known kind values:

| Value | Description |
|---|---|
| 0 | ? |
| 1 | ?, but total structure size is 0xA0 (?). Theory: it's primitive with bones. Used at 0046D680 |
| 4 | ? Usage reference 0x00487480 & 0x00487450 |
| 6 | ? |
| 7 | Generic model? |
| 8 | Usage reference: 00487740 |
| 10 | ? |
| 11 | ? |
| 12 | Usage references: Fn at 0046D250, 0046D220, 0046D1F0, 0046D1C0, 0046CFE0 (!!!), 0046CFA0, 0046E1B0 (!!) |

The kind represents how many bytes will be used for the primitive.

To be continued

BoneInfo

| Name | Offset | Size | Description |
|---|---|---|---|
| Bones count | 0x0 | 0x4 (u32) | How many bones are in the primitive |

To be continued

DronCode commented 1 year ago

GMS Hierarchy storage format

It took a long time to research, but now I know how it works. First of all, we need to discuss the EntityHeader structure.

Previously, I described it as

struct EntityHeader {
    uint32_t entOffset;  //real offset from top header is pBase + (4 * (entOffset & 0xFFFFFF)); Also, this value contains depth level
    uint32_t alwaysZero; // ?
};

but now I know that the second entry really is zeroed, while the first entry is a little bit more complicated:

Another note: when the 25th bit is set, the entity may contain other entities (i.e. the entity is a ROOT for something).

So, depth is a parameter which means that the actual parent is the depth'th object in the current path. What is the path? The path is a list of ROOT objects. The hierarchy parsing algorithm is pretty simple:

  1. Put ROOT object (entity at 0) into path array
  2. Take an object and calculate path length as path.size() - depth
  3. Take a slice of path as [0: path.size() - depth]
  4. Last element of path slice is a parent of our object
  5. Save new slice as a path
  6. If current object is a ROOT - push our object to back of path
  7. Repeat from step 2.
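
The steps above can be sketched in Python roughly like this. The sketch takes already decoded (depth, is_root) pairs in table order; how these are unpacked from entOffset is not covered here, and the function name is made up. It also assumes that "take an object" starts from entity 1, since entity 0 is the initial ROOT.

```python
from typing import List, Optional, Tuple


def resolve_parents(entities: List[Tuple[int, bool]]) -> List[Optional[int]]:
    """Return the parent index of every entity (None for the ROOT at index 0)."""
    parents: List[Optional[int]] = [None] * len(entities)
    if not entities:
        return parents

    path = [0]  # step 1: the ROOT object (entity at 0) starts the path
    for index in range(1, len(entities)):
        depth, is_root = entities[index]
        new_length = len(path) - depth  # step 2
        if new_length < 1:
            raise ValueError(f"entity {index}: depth {depth} exceeds the current path")
        path = path[:new_length]  # steps 3 and 5: slice the path and keep it
        parents[index] = path[-1]  # step 4: last element of the slice is the parent
        if is_root:
            path.append(index)  # step 6: ROOT objects extend the path
    return parents


# Entity 1 is a ROOT under 0, entity 2 is its child, entity 3 goes back up to 0.
example = resolve_parents([(0, True), (0, True), (0, False), (1, False)])
```

Here example is [None, 0, 1, 0].
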

DronCode commented 1 year ago

SND, STR, WAV and WHD formats are completely reverse engineered by @WSSDude here: https://github.com/WSSDude/Glacier1AudioTool/tree/main/docs/HitmanBloodMoney/Formats