Puxtril / Warframe-Exporter

Converts Warframe's custom file formats into standard formats

Languages.bin Parser #20

Open reitowo opened 10 months ago

reitowo commented 10 months ago

Here's a Languages.bin parser in C# for v35. If you have time, you can convert it to C++.

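using System.Collections.Generic;
using System.IO;
using System.Text;
using ZstdNet; // assumption: Decompressor/DecompressionOptions below come from the ZstdNet package

Dictionary<string, List<string>> ParseLanguageBin(string fileName) {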
    var subtitleDict = new Dictionary<string, List<string>>();

    using var file = File.OpenRead(fileName);
    using var reader = new BinaryReader(file);

    var hash = reader.ReadBytes(16);
    var version = reader.ReadInt32();
    var unkA = reader.ReadInt32();
    var unkB = reader.ReadInt32();

    var langCodes = new List<string>();
    var langCount = reader.ReadInt32();
    for (var i = 0; i < langCount; i++) {
        var lang = Encoding.UTF8.GetString(reader.ReadBytes(reader.ReadInt32()));
        langCodes.Add(lang);
    }

    while (reader.BaseStream.Position != reader.BaseStream.Length) {
        var contentSize = reader.ReadInt32();
        var zstdDict = reader.ReadBytes(contentSize);

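        var unk0 = reader.ReadInt32(); // entry count for this block (drives the loop below)

        // 1000 is ZSTD_d_experimentalParam1 (aliased as ZSTD_d_format in zstd.h);
        // a value of 1 should select the magicless frame format.
        using var options = new DecompressionOptions(zstdDict, new Dictionary<ZSTD_dParameter, int>() {
            { (ZSTD_dParameter)1000, 1 }
        });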
        using var decompressor = new Decompressor(options);

        for (var k = 0; k < unk0; ++k) {
            var s1Len = reader.ReadInt32();
            var s1 = Encoding.UTF8.GetString(reader.ReadBytes(s1Len));

            // Console.WriteLine($"{reader.BaseStream.Position} {reader.BaseStream.Length} {s1}");

            var s2Len = reader.ReadInt32();
            var s2Bytes = reader.ReadBytes(s2Len);

            var unk1 = reader.ReadInt32(); // record count for this entry

            for (var i = 0; i < unk1; ++i) {
                var s3Len = reader.ReadInt32();
                var s3Bytes = reader.ReadBytes(s3Len); // not null-terminated
                var s3 = Encoding.UTF8.GetString(s3Bytes);

                var s2Skip = reader.ReadInt32();
                var s2Take = reader.ReadUInt16();
                var s2Flag = reader.ReadUInt16();

                var s2Slice = s2Bytes.AsSpan(s2Skip, s2Take).ToArray();
                var s2 = Encoding.UTF8.GetString(s2Slice);

                if ((s2Flag & 0x200) != 0) {
                    using var br = new BinaryReader(new MemoryStream(s2Slice));
                    var dstLen = br.Read7BitEncodedInt();
                    var srcBuf = br.ReadBytes((int)(br.BaseStream.Length - br.BaseStream.Position));
                    var dst = new byte[dstLen];
                    decompressor.Unwrap(srcBuf, dst, false);
                    s2 = Encoding.UTF8.GetString(dst);
                }

                // Console.WriteLine($"{s1} {s3} {s2Flag:X4} {s2Take} {s2Skip} {s2Bytes.Length} {zstdDict.Length} {s2}");
                subtitleDict.TryAdd(s3, []);
                subtitleDict[s3].Add(s2);
            }
        }
    }

    return subtitleDict;
}
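
The two framing conventions the parser relies on can be tried in isolation, without a zstd binding. The following is only a sketch: the helper names `ReadLengthPrefixedString`, `IsCompressed`, and `SplitCompressedSlice` are hypothetical, and the only details taken from the parser above are the Int32 length prefix, the 0x200 flag, and the 7-bit-encoded length in front of the compressed bytes. It assumes .NET 5 or later, where `Read7BitEncodedInt` is public.

```csharp
using System;
using System.IO;
using System.Text;

// Reads a string stored as an Int32 byte count followed by that many UTF-8 bytes.
static string ReadLengthPrefixedString(BinaryReader reader)
{
    var length = reader.ReadInt32();
    var bytes = reader.ReadBytes(length);
    if (bytes.Length != length)
        throw new EndOfStreamException($"Expected {length} bytes, got {bytes.Length}.");
    return Encoding.UTF8.GetString(bytes);
}

// True when the per-record flags mark the slice as zstd-compressed (bit 0x200 in the parser above).
static bool IsCompressed(ushort flags) => (flags & 0x200) != 0;

// Splits a compressed slice into the decompressed length it declares and the remaining payload.
static (int DeclaredLength, byte[] Payload) SplitCompressedSlice(byte[] slice)
{
    using var reader = new BinaryReader(new MemoryStream(slice));
    var declaredLength = reader.Read7BitEncodedInt();
    var payload = reader.ReadBytes((int)(reader.BaseStream.Length - reader.BaseStream.Position));
    return (declaredLength, payload);
}

// Usage: 0xAC 0x02 is the 7-bit encoding of 300; the last three bytes stand in for compressed data.
var (declared, body) = SplitCompressedSlice(new byte[] { 0xAC, 0x02, 1, 2, 3 });
Console.WriteLine($"{declared} {body.Length} {IsCompressed(0x0200)}"); // 300 3 True
```

Checking `ReadBytes` against the requested length is a deliberate addition here: `BinaryReader.ReadBytes` returns a shorter array at end of stream instead of throwing, so a truncated file would otherwise go unnoticed.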

Puxtril commented 10 months ago

Thanks for providing this!

I think this could be added behind an extra flag, like --extract-languages.

My only concern is how much this will encourage datamining. I don't want to give players easy access to tools that leak upcoming content, so I'll have to investigate before adding this. That's somewhat hypocritical, because extracting models and textures is technically datamining too, but I think DE cares more about story/dialogue than models. Or at least, this extractor hasn't led to any big reveals in the year+ it's been around (AFAIK).