Closed sakurahana90 closed 5 months ago
Some Cyberworks games use UTF-16 encoding.
Try choosing regex None to extract.
If that doesn't work either, upload some a0 files and I will check them.
regex none also didn't work unfortunately...
here's some script text.zip
Cannot download the zip, error 404.
I tried to upload but still 404 not found, how about google drive link ?
I checked it out; it's another Cyberworks format, without encryption. You can use VNT to extract.
I also tried extracting with VNT, but when I put the text back into the script and repacked, it didn't work: the game crashes on boot. Is there a limit on the number of characters in a string?
Generally speaking, Cyberworks does not have a character limit, but it's possible that your version of the engine does. You can try extracting it and adding a few words in the first sentence. If it crashes, try modifying the text without increasing the character count to determine if that's the issue. (Both addition and modification are done using Japanese to eliminate any potential interference factors.)
I'm stuck. I tried adding a few characters and also tried changing one character; both gave the same result, the game crashed. I also tried with no change to the script, just unpacking and repacking, and the game plays normally. That confirms nothing is wrong with the unpack and repack tool.
I am using VNT to extract and reinsert the script. Are unencrypted scripts not supported in SExtractor?
Yes, it's not supported. Not only is it unencrypted, but the file structure also differs.
also tried to change one character
What language is it changed to? Will it still be Japanese after the modification? Don't use English for now.
Yes, I used Japanese. To be exact, I just copied one character from the string and pasted it into the same string, or replaced one character with another character from the same string.
I also compared the extracted strings with the strings shown in the game and found that some strings didn't get extracted; perhaps that's one of the reasons too.
Well, I have to be patient then. Thank you for all the answer, I appreciate it really.
That's odd. What's the name of the game?
It's an old game from Tinkerbell https://vndb.org/v1036
In case you want to check it out, here are the game files (this is the official patch, including the scenario archive)
Support for the old version of Cyberworks has been added.
The recommended regex is as follows:
00_skip=^$
10_search=^(?P<name>【.+?】)[\xFE]{0,1}$
11_search=^(?P<name>【.+?】)(.+?)[\xFE]{0,1}$
15_search=^(?P<unfinish>.+?)[\xFE]{0,1}$
extraData=readJIS,noTextLen
structure=paragraph
The main change is the addition of the noTextLen parameter, which also means the text length is not limited.
If you use CSystemArc.exe to repack, pay attention to the file version shown during unpacking. Your game is version 22.
thank you so much for this, it's getting closer.
I tried the recommended regex and the process went smoothly, but there's a problem: each generated file is only 1 KB (I'm extracting multiple files), and it contains only a single line of text.
I can extract it normally. I'm not sure what problem you're encountering.
What does your console print when you extract?
I also tried extracting script using GARBro but same result, maybe my os is corrupt or something...
Can you pack the entire SExtractor folder and upload it?
Finally I found the problem: my AV seems to have deleted Injector Xenos.exe. I found out after re-downloading SExtractor and unzipping it; there was a notification that something had been detected and deleted. After I turned the AV off, I can see the correct generated files.
Thank you so much for the help!
It's imported with GBK encoding in the Cyberworks engine.
If you only need Japanese and English, select Encoding applies to BIN and choose cp932 at the bottom right.
I've tried it and the game still won't boot up. Here's my method:
Is there something wrong with the way I'm using the tools? I also tried to change the administrative locale of my pc to japanese but no changes.
How silly of me. I didn't realize there were also changes in Arc01.dat, so I didn't copy it to the game folder. After I copied it too, the game boots up.
Thank you very much for the help, I'm so happy that I finally can translate this game.
If I may add something I found: some lines didn't get extracted into the JSON. I found this by comparing with the JSON I extracted with VNT, and if I add the missing dialogue to the JSON, nothing changes and the game keeps rendering the original line.
There are also some symbols that turn into a random character. For example, in the dialogue in the picture, 「Huh? Isn't that the same thing?」, the symbol 」 changed into a dot, but I don't think this is much of a problem, at least for me.
Edited: I found a second problem. The first and last scripts of the archive contain the game's choice text, and if I translate them, the choices don't work and can't be clicked in game; when I use the original Japanese file, they work again.
I don't know if I edited the file wrongly, but I did it the same way as the rest.
00_skip=^$
10_search=^(?P<name>【.+?】)[\xFE]{0,1}$
15_search=^([\S\s]+?)[\xFE]{0,1}$
extraData=readJIS,noTextLen
structure=paragraph
I checked, and it seems that not all 【】 brackets contain speaker names; names also appear inside narration. (So delete 11_search.)
Additionally, there are control bytes inside the text, so . needs to be changed to [\S\s] to ensure it can match. (The byte is 0x0A, decimal 10, which happens to be \n, and . cannot match that.)
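To illustrate the point about . versus [\S\s], here is a minimal sketch using Python's re module on a decoded string. The sample line is taken from the message quoted in this thread; whether SExtractor applies the patterns in exactly this way internally is an assumption.

```python
import re

# Bracket content from the example in this thread: the character right after
# the opening bracket is a control character that happens to be "\n".
line = "【\nさとなか しゅうへい\u0005里中 秀平】"

# "." does not match "\n" (no DOTALL flag), so the old style pattern fails.
with_dot = re.search(r"^(.+?)$", line)

# "[\S\s]" matches any character, including "\n", so the whole line matches.
with_class = re.search(r"^([\S\s]+?)$", line)

print(with_dot)                      # None
print(with_class.group(1) == line)   # True
```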
"message": "舌を突き出して汗をダラダラかいてるこいつは、【\nさとなか しゅうへい\u0005里中 秀平】。"
The control byte represents the phonetic transcription of the name.
You can delete the phonetic transcription. If you want to keep it, you need to modify it so that it corresponds to the length of the text. ( is the start bytes; \n (\u000A, decimal 10) and \u0005 are the text lengths.)
Deleting it and keeping just the name is recommended, because your translation and the phonetic symbols will not match.
"message": "舌を突き出して汗をダラダラかいてるこいつは、【里中 秀平】。"
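For anyone who wants to automate this, here is a rough sketch of stripping the phonetic transcription from an extracted message. The layout it assumes (opening bracket, one control character holding the reading length, the reading, one control character holding the name length, the name) is inferred only from the single example above, and the lengths are assumed to be character counts. The function names strip_reading and wrap_with_reading are made up for illustration.

```python
def strip_reading(message: str) -> str:
    """Drop the length-prefixed reading inside 【…】, keeping only the name."""
    out = []
    i = 0
    while i < len(message):
        ch = message[i]
        # Assumed marker: "【" followed by a control char (< 0x20) = reading length.
        if ch == "【" and i + 1 < len(message) and ord(message[i + 1]) < 0x20:
            reading_len = ord(message[i + 1])
            j = i + 2 + reading_len          # position of the name-length char
            if j < len(message) and ord(message[j]) < 0x20:
                name_len = ord(message[j])
                out.append("【" + message[j + 1 : j + 1 + name_len])
                i = j + 1 + name_len         # continue right after the name
                continue
        out.append(ch)
        i += 1
    return "".join(out)


def wrap_with_reading(reading: str, name: str) -> str:
    """Rebuild the bracket with lengths that match the (possibly edited) text."""
    return "【" + chr(len(reading)) + reading + chr(len(name)) + name + "】"


original = "舌を突き出して汗をダラダラかいてるこいつは、【\nさとなか しゅうへい\u0005里中 秀平】。"
print(strip_reading(original))
```

If the reading is kept instead, wrap_with_reading shows the idea of recomputing both length characters after editing; it only works for lengths below 32 under the assumptions above.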
Hi bro, I was also trying to extract a Cyberworks game, https://vndb.org/v1078. Everything works fine, but the game is showing the original Japanese text rather than English. I think there is some problem with exporting the .a0 file. Also, can you merge these new regex patterns into your tool, since they work for old Cyberworks games? I have tried VNTextPatch too, but the files exported from VNTextPatch crash the game after showing a single translated line.
everything works fine but the game is showing the original japanese text rather than English--
Extract Dir and upload it.
Thank you so much for the pointers. It's going great with the script. What remains is how to translate the choice lines, since translating them normally leaves the choices unclickable in game. For now I use the original script as filler.
input.zip Here are the files, and the import was successful. 000004.a0 is the file that displays the text at the opening of the game.
The extracted JSON from the folder you sent seems to be incorrect.
I'm not sure why, as our regular expressions should be the same, right?
(Choose __Custom0 and paste the regex there. It saves the last regex after extraction.)
After checking Encoding applies to BIN, it gives this error. When I downloaded your new repository and ran run.bat, I got something like a syntax error; after that SExtractor started fine, but the JSON files are still the same. Should I try a fresh copy of the SExtractor repository?
For TXT Encoding choose cp932, it's the alias for shift-jis.
The error on the first launch of SExtractor:
E:\SE EXTRACTOR\SExtractor-main\src\var_extract.py:53: SyntaxWarning: invalid escape sequence '\.'
symbolPattern = '[\.~ \\u3000-\\u303F\\uFF00-\\uFF65\\u2000-\\u206F\\u2600-\\u27FF]' #重新分割匹配字符
New Config mainDirPath .
New Config engineCode 0
New Config outputFormat 0
New Config outputPartMode 0
New Config mergeDirPath .
New Config mergeSkipReg ^[a-zA-Z0-9{]
New Config collectSep +
New Config regIndex 0
New Config encodeIndex 0
New Config maxCountPerLine 512
New Config splitParaSep \r\n
New Config cutoff False
New Config cutoffCopy True
New Config splitAuto False
New Config ignoreSameLineCount False
New Config ignoreNotMaxCount False
New Config fixedMaxPerLine False
New Config pureText False
New Config transReplace True
New Config preReplace False
New Config skipIgnoreCtrl False
New Config skipIgnoreUnfinish False
New Config ignoreEmptyFile True
Everything WORKED out. THANKS.
What's your python version? Requires 3.9 and above, recommended is 3.11.
Python 3.12
That's odd. Don't know why.
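A possible explanation for the SyntaxWarning in the log above: Python 3.12 reports invalid escape sequences such as '\.' in ordinary string literals as a SyntaxWarning (earlier versions only raised a quieter DeprecationWarning), so it shows up on 3.12 even though the pattern still works. A minimal sketch of writing such a pattern as a raw string to avoid the warning; symbol_pattern is an illustrative name here, not a claim about how the repository is or should be patched:

```python
import re

# Raw string: backslashes reach the regex engine untouched, so Python itself
# never has to interpret "\." as a string escape.
symbol_pattern = r'[\.~ \u3000-\u303F\uFF00-\uFF65\u2000-\u206F\u2600-\u27FF]'

# The regex engine resolves \uXXXX escapes itself, so CJK punctuation matches.
print(bool(re.search(symbol_pattern, "a。b")))   # True, "。" is U+3002
print(bool(re.search(symbol_pattern, "abc")))    # False
```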
Everything is working out fine now, Thanks.
I was just asking whether you can fix CSystemArc.exe, as it does not extract ARC00.dat and gives an error like this:
E:\SE EXTRACTOR\Cyberworks>.\CSystemArc.exe readconfig .\Arc00.dat config.xml
UnpackItems: item not start with 'S'
UnpackItems: item not start with 'S'
Found invalid data while decoding.
There's also some symbol turned into a random character, for example the dialogue in the picture 「Huh? Isn't that the same thing?」, the symbol 」 changed into dot but I don't think this is quite problematic, at least for me.
Perhaps because the game reads text as full-width characters, English text placed between Japanese full-width symbols needs an even (double-byte) length.
For example, in 「Huh? Isn't that the same thing?」 the English text length is 31, so just try adding a space:
「Huh? Isn't that the same thing ?」
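If the even-length guess above turns out to be right, the padding could be automated along these lines. This is only a sketch: pad_even_bytes is a hypothetical helper, it measures the cp932 byte length of the whole line, and it puts the extra space just before a closing 」 (the example above puts it before the question mark instead, which gives the same length).

```python
def pad_even_bytes(line: str, encoding: str = "cp932") -> str:
    """Add one space when the encoded line has an odd number of bytes."""
    if len(line.encode(encoding)) % 2 == 0:
        return line                      # already even, nothing to do
    if line.endswith("」"):
        # Keep the closing bracket last and pad just before it.
        return line[:-1] + " " + line[-1]
    return line + " "


padded = pad_even_bytes("「Huh? Isn't that the same thing?」")
print(len(padded.encode("cp932")))   # 36
```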
I was just asking that can you fix csystemarc.exe as it do not extracts ARC00.dat and gives error like-
E:\SE EXTRACTOR\Cyberworks>.\CSystemArc.exe readconfig .\Arc00.dat config.xml UnpackItems: item not start with 'S' UnpackItems: item not start with 'S' Found invalid data while decoding.
This tool is a modified version in SE and is used to read UTF-16. You can just use the original version with shift-jis. https://github.com/satan53x/SExtractor/blob/main/tools/Cyberworks/README.md
That version also gives the error about invalid data found while decoding.
You sure? I use original version normally. https://github.com/arcusmaximus/CSystemTools/releases/tag/1.1
Try this--file Arc00.zip
https://github.com/satan53x/SExtractor/blob/main/tools/Cyberworks
The original version has been modified; use CSystemArc_JIS.exe.
Thank you very much, I'll check it and tell you how it works.
SExtractor is unable to extract the following Cyberworks script: gaiden.zip
The structure is slightly different, with 4 bytes of 00 at the beginning of each line.
Maybe this regex, with \x00{4} added, can extract it.
00_skip=^$
10_search=^\x00{4}(?P<name>【.+?】)
15_search=^\x00{4}(?P<unfinish>[\S\s]+?)[\xFE]{0,1}$
extraData=readJIS,noTextLen
structure=paragraph
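To show what the \x00{4} prefix changes, here is a small sketch that tries the two search patterns above on made-up lines with Python's re module. Running them on decoded strings, the order in which they are tried, and the classify helper are all assumptions for illustration; the real tool works on the script bytes and may behave differently.

```python
import re

# Patterns copied from the config above.
NAME = re.compile(r"^\x00{4}(?P<name>【.+?】)")
TEXT = re.compile(r"^\x00{4}(?P<unfinish>[\S\s]+?)[\xFE]{0,1}$")


def classify(line: str):
    """Return ('name', text), ('unfinish', text), or None if nothing matches."""
    if line == "":                       # corresponds to 00_skip=^$
        return None
    m = NAME.search(line)
    if m:
        return ("name", m.group("name"))
    m = TEXT.search(line)
    if m:
        return ("unfinish", m.group("unfinish"))
    return None


prefix = "\x00" * 4
print(classify(prefix + "【里中 秀平】"))      # ('name', '【里中 秀平】')
print(classify(prefix + "こんにちは\xfe"))     # ('unfinish', 'こんにちは')
print(classify("こんにちは"))                  # None, the 4 zero bytes are missing
```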
But your text has already been translated using the shift-jis tunnel method.
shift-jis tunnel uses illegal shift-jis byte pairs such as 81 01 to display extended characters, and they can't be decoded to text.
So you should extract the original Japanese script, not the translated one.
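A quick way to check whether a script has already gone through this kind of patching is to try decoding it as cp932. The sketch below uses a hypothetical helper named is_decodable; the byte pair 81 01 is the example given above.

```python
def is_decodable(data: bytes, encoding: str = "cp932") -> bool:
    """Return True if the bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# 0x81 is a lead byte, but 0x01 is not a valid second byte in shift-jis.
print(is_decodable(bytes([0x81, 0x01])))             # False
print(is_decodable("こんにちは".encode("cp932")))     # True
```

A file that fails this check raises the same kind of UnicodeDecodeError during extraction, which is a hint that it is not the original Japanese script.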
Okay
But it's still not extracting any text.
I tried to extract the game scenario using the default settings and the default regex for Cyberworks JIS:
10_search=^(?P<name>【.+?】)
15_search=^(?P<unfinish>.+?)[\xFE]{0,1}$
it will throw error "UnicodeDecodeError: 'cp932' codec can't decode byte 0x84 in position 1935: illegal multibyte sequence decoding with 'cp932' codec failed"
Then, I tried to change "?" with "?"
It ran to the end, but the output is only "{}" in every output file.
Did I use a wrong setting somewhere?