trigger-segfault / TriggersTools.CatSystem2

A library for extracting from and working with the CatSystem 2 visual novel game engine.
MIT License

any chance of making a cstl decompiler #7

Open The-Math-God opened 2 years ago

The-Math-God commented 2 years ago

Hi, I'm working on Cyanotype Daydream on Steam, but all the English translations are in .cstl files. Any chance of making a decompiler like the one you made for CST files?

lhy-cpu commented 1 year ago

Maybe you can try notepad++

ZillionClay commented 1 year ago

You can easily do this with regular expressions. Here is a Python script that may be helpful.

# Script for rough extraction of translations from CatSystem2 .cstl files.
# The output is written to cstl_out.txt in the current working directory.

import re

filename = "com04B.cstl"

# Separator between narration entries:
# a NUL character followed by one character in the range \x00-\xff
dialogSep = re.compile(r"[\x00][\x00-\xff]")

# Separator between a role name and its dialog, but it may not work
roleSep = re.compile(r"[\x00-\xff]「|「")

# \xNN escapes that repr() produces for characters it cannot print
unknownChar = re.compile(r"\\x[0-9a-f][0-9a-f]")

# Japanese kana; sentences containing them are treated as untranslated
kana = re.compile("[あ-んア-ン]")

# The file is binary, so bytes that are not valid UTF-8 are dropped on read
with open(filename, "r", encoding="utf-8", errors="ignore") as f, \
     open("cstl_out.txt", "w", encoding="utf-8") as fo:
    for line in f:
        line = dialogSep.sub("\n", line)
        line = roleSep.sub(" 「", line)

        # repr() turns control characters into escapes such as \x05 and \n;
        # [1:-1] drops the quotes that repr() puts around the string
        cstlStr = repr(line)[1:-1]

        # Replace any remaining binary unrecognized as text with a line break
        cstlStr = unknownChar.sub("\n", cstlStr)

        for s1 in cstlStr.split("\n"):
            # Line breaks that went through repr() are now the two characters \n
            for s2 in s1.split("\\n"):
                # Filter out sentences with Japanese kana
                # in order to extract translations whenever possible
                if s2 and not kana.search(s2):
                    fo.write('"{}"\n\n'.format(s2))
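
For comparison, the same rough idea can be sketched without going through repr(): decode the raw bytes and keep every run of printable characters, much like the Unix strings tool does. This is a different technique from the script above, not a CSTL parser. The function name extract_strings, the min_len parameter and the kana range are assumptions made for this illustration, and stray prefix bytes that happen to be printable may still leak into the output.

```python
import re

# Hiragana and katakana blocks; text containing them is treated as untranslated.
KANA = re.compile(r"[\u3040-\u30ff]")


def extract_strings(data, min_len=2):
    """Return printable text runs found in raw bytes, skipping runs with kana.

    Hypothetical helper: it knows nothing about the CSTL layout and simply
    scans for text, so the result needs manual review.
    """
    # Bytes that are not valid UTF-8 become U+FFFD, which also ends a run.
    text = data.decode("utf-8", errors="replace")
    # A run is min_len or more characters that are neither control
    # characters nor the replacement character.
    runs = re.findall(r"[^\x00-\x1f\x7f\ufffd]{%d,}" % min_len, text)
    return [run for run in runs if not KANA.search(run)]
```

To try it on a file, read the file in binary mode and pass the bytes in, for example extract_strings(open("com04B.cstl", "rb").read()).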

masagrator commented 1 year ago

This is my Python 3 CSTL extraction script that doesn't use regular expressions. There are no unknowns left in terms of CSTL structure.

https://github.com/masagrator/NXGameScripts/blob/main/Cyanotype%20Daydream/CSTL_Extractor_PC.py

Just provide the folder containing the .cstl files as an argument.

For example:

python CSTL_Extractor_PC.py scenes
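
If you want to handle a whole folder in your own tooling, a minimal sketch of collecting the input files could look like this. It does not reproduce what CSTL_Extractor_PC.py does internally. The helper name find_cstl_files is hypothetical, and matching the extension case-insensitively is an assumption.

```python
from pathlib import Path


def find_cstl_files(folder):
    """Return the .cstl files directly inside folder, sorted by path.

    Hypothetical helper for illustration; subfolders are not searched.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".cstl"
    )
```

Calling find_cstl_files("scenes") would then return one Path per .cstl file in the scenes folder, ready to be passed to whatever extraction step you use.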