satan53x / SExtractor

Extract and import text from GalGame scripts
GNU General Public License v3.0

A request #54

Closed Cosetto closed 7 months ago

Cosetto commented 7 months ago

Can you help me dump the English text from these scripts with SExtractor? I just need to get the text; there's no need to insert it back. image CC.zip

satan53x commented 7 months ago

This doesn't seem to be plain text; it may be compressed with the LZSS algorithm.

Cosetto commented 7 months ago

Then do you know of any way to dump it?

satan53x commented 7 months ago

Try decompressing it first. There may be some info bytes in the file header that should be ignored.

In the tools dir there are some Python scripts that use the lzss or zlib module; they show how to do this.

(I'm on my phone right now; I may check it out tomorrow.)

satan53x commented 7 months ago

cc_decompress.zip

ContentStart = 0x18 in the Python script. The file header has 0x18 bytes of information, which I just discard when decompressing. If you want to compress the file back, you need to keep them.
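The header handling described above can be sketched roughly as follows. This is not the script from cc_decompress.zip, only a minimal illustration: the helper names `split_header`, `unpack`, and `repack` are hypothetical, and the decompressor is passed in as a parameter so the sketch does not depend on any particular LZSS package. Whether the header holds size fields that must be updated when repacking is not stated in the thread, so the sketch simply writes the saved header back unchanged.

```python
CONTENT_START = 0x18  # header size mentioned in the thread


def split_header(data: bytes, content_start: int = CONTENT_START):
    """Split a file into (header, compressed body)."""
    if len(data) < content_start:
        raise ValueError("file is shorter than the expected header")
    return data[:content_start], data[content_start:]


def unpack(data: bytes, decompress, content_start: int = CONTENT_START):
    """Return (header, decompressed body); keep the header for repacking."""
    header, body = split_header(data, content_start)
    return header, decompress(body)


def repack(header: bytes, plain: bytes, compress) -> bytes:
    """Assumption: the original header can be reused as-is."""
    return header + compress(plain)
```

With pylzss installed, `lzss.decompress` (the call shown later in this thread) would presumably be passed as `decompress`; any other bytes-to-bytes function such as `zlib.decompress` fits the same slot.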

satan53x commented 7 months ago

After decompressing, choose the BIN engine to extract.

00_skip=^[\S\s]{0,3}$
10_search=^[\S\s]([\x20-\xFC\r\n]+?)[\x00\x04]
checkJIS=[ -~\r\n]
ignoreDecodeError=1
separate=\xFD

The regex is adapted from _BIN_Violent; you may want to check whether the bytes have more structure.
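To get a feel for what these patterns match, they can be tried on raw bytes with Python's `re` module. The sketch below is an assumption about how the settings interact (split on `separate`, drop chunks matching `00_skip`, capture group 1 of `10_search`); it is not SExtractor's actual code, and it leaves out `checkJIS` and `ignoreDecodeError`. The function name `extract` and the sample bytes are made up for illustration.

```python
import re

SEPARATE = re.compile(rb"\xFD")                               # separate
SKIP = re.compile(rb"^[\S\s]{0,3}$")                          # 00_skip
SEARCH = re.compile(rb"^[\S\s]([\x20-\xFC\r\n]+?)[\x00\x04]")  # 10_search


def extract(data: bytes) -> list:
    """Collect the captured text from each 0xFD-separated chunk."""
    texts = []
    for chunk in SEPARATE.split(data):
        if SKIP.match(chunk):
            continue  # chunks of at most 3 bytes are ignored
        m = SEARCH.match(chunk)
        if m:
            texts.append(m.group(1))
    return texts


sample = b"\x01Hello\x00\xFD\x02ab\xFD\x03World!\x04rest"
print(extract(sample))
```

In words: each chunk is expected to start with one arbitrary byte, followed by the text (bytes 0x20 to 0xFC plus CR/LF), terminated by 0x00 or 0x04.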

Cosetto commented 7 months ago

Wait, do I need to get an lzss.dll or something?

    uncom = lzss.decompress(com)
    SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats

satan53x commented 7 months ago

I forgot to mention:

    pip install pylzss==0.3.4

Cosetto commented 7 months ago

Thanks again bro