EasyRPG / Tools

Assorted tools to handle RPG Maker 2000/2003 files
https://easyrpg.org/tools/

Add tool: LcfTrans #21

Closed Ghabry closed 3 years ago

Ghabry commented 8 years ago

This is a tool I've been working on for months (with big breaks in between).

It goes through the LDB file and all LMU files, extracts the strings from them, and writes them to a po file. Strings with the same content and context are merged into one entry (required by the format).
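
The merge rule can be pictured with a small Python sketch. This is only an illustration under assumptions, not the tool's C++ code: `merge_entries` and the tuple layout are hypothetical names chosen for the example.

```python
def merge_entries(entries):
    """Group extracted strings by (context, original text).

    `entries` is an iterable of (context, original, location) tuples.
    A po file may contain each (msgctxt, msgid) pair only once, so
    duplicates collapse into one entry that remembers every location.
    """
    merged = {}
    for context, original, location in entries:
        merged.setdefault((context, original), []).append(location)
    return merged


if __name__ == "__main__":
    extracted = [
        ("skill.name", "Fire", "Skill 1"),
        ("skill.name", "Fire", "Skill 2"),  # same context + text: merged
        ("item.name", "Fire", "Item 1"),    # same text, other context: kept apart
    ]
    for key, locations in merge_entries(extracted).items():
        print(key, locations)
```

Keying on the pair rather than on the text alone keeps identical strings in different contexts separately translatable.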

Writing back is NOT supported, which currently renders this tool semi-useless because you can't do anything with the translations :( A super simple po reader exists (fromPO) which can be used as a base. Not sure if writing back makes sense at all. The ultimate goal would be adding "loadTranslation" to liblcf and supporting on-the-fly language switching in-game this way.

The tool lacks some polishing:

BlisterB commented 8 years ago

It's a really good idea ^^. That will surely be useful for the editor!

A French guy wrote a similar program called DreaMaker. I used it to translate a game; it was useful but had some problems. http://rpgmaker.net/forums/topics/2988/

Ghabry commented 8 years ago

Yeah, I know that tool. Well, "Not invented here". The tool has some problems: no source code, not multi-platform, and the translation file format is non-standard. But it also supports RPG Maker XP, so it has some advantage over my tool.

Usually I prefer reusing existing stuff but here I made an exception to gain extra flexibility.

And my motivation was not to support writing the strings back into the binary files but to support multi-language in Player, and this is the first step to it.

BlisterB commented 8 years ago

Oh yeah, of course. I was not saying "you should use it", just "extracting strings is a really good idea". I have no doubt that your implementation will be more suitable to your goals ^^.

Ghabry commented 8 years ago

Looking forward to the first multilanguage versions of Yume Nikki and Ib :P

BlisterB commented 8 years ago

Will the Player be able to switch from one translation file to another :p ?

Ghabry commented 8 years ago

before 2017 :P. So you can start translating your favourite game to French already :P

BlisterB commented 8 years ago

Omg great Ghabry! I was thinking of translating The Way ^^.

Zegeri commented 8 years ago

It produces erroneous output when using two consecutive line breaks.

msgctxt "event"
msgid ""
"foo\n"
\n"
"bar"
msgstr ""

The ultimate goal would be adding "loadTranslation" to liblcf and support on-the-fly language switching ingame this way.

+1 to that.

Besides DreaMaker, there's also RPGMaker Trans, which is GPL-licensed and supports RPG Maker 2000 through VX Ace.

Ghabry commented 8 years ago

Thx for your feedback. The issues reported by zegeri and carstene1ns have been fixed.

Ghabry commented 3 years ago

@sorlok This is mostly finished now (there can be bugs ;))

It now generates the following files:

How message strings are constructed:

A new message starts at a ShowMessage event command. For a multiline message, the continuation lines (event command ShowMessage_2) are merged into it. There is no length limit (but anything longer than 4 lines needs a hacked editor).

A lonely ShowMessage_2 is reported as an error.

A Choice is merged with the previous message when it fits in the window (message lines + choices <= 4). When it doesn't fit, it becomes a standalone message (this is the same logic RPG_RT uses).
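
As a rough illustration of these rules as stated in this comment (later comments in the thread move choices into their own entries), here is a minimal Python sketch. The command names `ShowMessage` and `ShowMessage_2` come from the text above; `ShowChoice`, `build_messages` and the tuple format are assumptions made up for the example and are not taken from LcfTrans.

```python
def build_messages(commands, window_lines=4):
    """Group event commands into translatable messages.

    `commands` is a list of (kind, payload) tuples. For "ShowMessage" and
    "ShowMessage_2" the payload is one line of text; for the hypothetical
    "ShowChoice" it is a list of option strings.
    Returns (messages, errors): messages is a list of line lists, errors
    holds the indices of ShowMessage_2 commands without a preceding message.
    """
    messages, errors = [], []
    current = None  # lines of the message being assembled, if any

    for index, (kind, payload) in enumerate(commands):
        if kind == "ShowMessage":
            if current is not None:
                messages.append(current)
            current = [payload]
        elif kind == "ShowMessage_2":
            if current is None:
                errors.append(index)  # continuation line without a start
            else:
                current.append(payload)
        elif kind == "ShowChoice":
            if current is not None and len(current) + len(payload) <= window_lines:
                messages.append(current + list(payload))  # fits in one window
            else:
                if current is not None:
                    messages.append(current)
                messages.append(list(payload))  # standalone choice
            current = None
        else:
            # Any other command ends the message being assembled.
            if current is not None:
                messages.append(current)
                current = None

    if current is not None:
        messages.append(current)
    return messages, errors
```

With a one-line message followed by two options the result is a single three-line entry, while a three-line message followed by two options yields two separate entries because 3 + 2 exceeds the window.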


I have now dropped all commits except the first two. For historical reasons they can be found here: https://github.com/Ghabry/easyrpg-tools/tree/lcftrans-legacy

The code is now much cleaner; I can see that my C++ expertise has increased a lot in the last 4 years. The old code was terrible.


The only open research point is providing Translation comments and Line information (both useful for translators to find the string origin).

Ghabry commented 3 years ago

I'm now emitting the following human-readable info lines (also powered by ForEach, very nice API :)).

When there is a Choice the following is emitted:


I'm considering adding a "Has Face" info line, as a face alters the message width, but this will only work for very basic cases and has a high false positive rate.

Ghabry commented 3 years ago

Aaand here is the Update feature. It is very naive:

It just loads an existing Po file and then checks whether:

  1. The Po entry is translated
  2. The combination of "context+original" exists.

When 2 is true: merge. When 2 is false: delete the entry and write it to a FILENAME.stale.po file.
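
A minimal Python sketch of this update logic follows. It rests on assumptions: `update_translations` is a hypothetical name, entries are modelled as plain dictionaries, and old entries that fail check 1 (not translated) are assumed to be simply skipped, which the description above does not spell out.

```python
def update_translations(fresh_keys, old_entries):
    """Carry existing translations over to a freshly extracted entry list.

    fresh_keys:  list of (context, original) pairs from the current game.
    old_entries: dict mapping (context, original) -> translation text.
    Returns (merged, stale); `stale` holds what would be written to the
    FILENAME.stale.po file.
    """
    merged = {key: "" for key in fresh_keys}
    stale = {}
    for key, translation in old_entries.items():
        if not translation:
            continue  # assumption: untranslated old entries are ignored
        if key in merged:
            merged[key] = translation  # check 2 true: merge
        else:
            stale[key] = translation   # check 2 false: move to stale
    return merged, stale


if __name__ == "__main__":
    fresh = [("event", "Hallo"), ("event", "Neu")]
    old = {("event", "Hallo"): "Hello", ("event", "Alt"): "Old"}
    print(update_translations(fresh, old))
```

Keeping the removed entries in a separate file means a translation is never silently lost when the original text changes.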

sorlok commented 3 years ago

Found a small bug. Previously, running on The Blue Contestant, for skills, I get:

#. Skill 13: Description
msgctxt "skill.description"
msgid "A spell of dust that causes poison"
msgstr ""

On the latest commits, I get:

#. ID 13
msgctxt ""<0x00><0x00><0x00><0x00><0x00><0x00>.description"
msgid "A spell of dust that causes poison"
msgstr ""

The context is garbled (should be "skill.description").

Ghabry commented 3 years ago

@sorlok I can't reproduce this, also the one on the PR builder is fine.

Try a clean rebuild of liblcf again, then compile lcftrans again (or fetch the binary from the PR builder - just expand the "All checks have passed" message)

sorlok commented 3 years ago

Clean build (of everything) didn't work, but the build artifact did. No idea what's wrong with my toolchain, but this is clearly a "me" problem. Please disregard.

sorlok commented 3 years ago

Ok, hopefully not a false positive: using the latest build artifact on Yume2kki, I get stuff like this:

msgctxt "terms.load_game_message"
msgid "<81><9f><82>Ç<82>Ì<83>t<83>@<83>C<83><8b><82>ð<93>Ç<82>Ý<82>Ü<82>·<82>©<81>H"
msgstr ""

I would expect this to show the actual Japanese text, but it looks like it's doing some mixed encoding (bytes below 0x7F are shown directly, and bytes above 0x80 are shown as hex codes such as <81>). Is this how a .po file is supposed to look?

Ghabry commented 3 years ago

You must specify the encoding (932 for Japanese) as second parameter. (See the help)

sorlok commented 3 years ago

With the encoding specified, it works.

Out of curiosity, is there any reason why the normal EasyRPG encoding detection isn't used here?

Ghabry commented 3 years ago

No particular reason. I could use it here because the database is parsed. Will add it. Note that the detection will be worse than in Player: the Player does more tricks that liblcf alone can't do ^^

Ghabry commented 3 years ago

@sorlok Added basic encoding detection (usual logic: Command line wins over RPG_RT.ini wins over autodetect).

As a side effect the file list is now sorted because I have to read the entire dir once anyway to find the DB and INI file ;).
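
The precedence can be summarised in a few lines of Python. This is a sketch under assumptions, not the tool's code: `pick_encoding` and its parameters are hypothetical, and `detect` stands in for whatever autodetection is used.

```python
def pick_encoding(cli_encoding=None, ini_encoding=None, detect=lambda: None):
    """Return the encoding to use: command line, then RPG_RT.ini, then autodetect."""
    if cli_encoding:
        return cli_encoding   # explicit user choice always wins
    if ini_encoding:
        return ini_encoding   # value found in RPG_RT.ini, if any
    return detect()           # last resort: guess from the data


if __name__ == "__main__":
    print(pick_encoding(cli_encoding="932", ini_encoding="1252"))
    print(pick_encoding(ini_encoding="1252"))
    print(pick_encoding(detect=lambda: "1250"))
```

Passing the detector as a callable means the comparatively expensive guess only runs when neither explicit source is present.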

sorlok commented 3 years ago

@Ghabry, ok, I think I have a legitimate issue this time. I'm working on "Aurora's Tear", a well-known German RPG Maker 2000 game. Using the latest LcfTrans binary and the "ibm-5348_P100-1997" encoding (which is what EasyRPG detects), most things work, but I get the following for skill ID 71:

#. ID 71
msgctxt "skills.using_message1"
msgid "‚̓eƒŒƒ|<0x81>[ƒg‚ð<0x8f>¥‚¦‚½<0x81>I"
msgstr ""

The <0xID> is just how my editor shows it (it's the extended ASCII character code). My editor is capable of showing German text just fine (and note the garbled mess before the code points).

This is on Demo 0.2 of Aurora's Tear, which you can find here: https://rmarchiv.de/games/112

fdelapena commented 3 years ago

It actually looks like untranslated Japanese mojibake. It looks very familiar from translating and from browsing shift_jis sites that were rendered as iso-8859-1 years ago. If you search for ‚̓eƒŒƒ| you will mainly get Japanese sites in the search results.

The best bet is trying to check this ID by loading the original game with the original RPG Maker editor. If you open the game with the original Japanese editor (or the trial version) with AppLocale or similar in shift_jis mode, it likely will render Japanese for that ID.

This was pretty common in some unofficial RPG Maker database translations (RPG_RT.ldb.dat). I've found some strings in Spanish games which confused the encoding detector we used in the early days, so we needed to skip some texts because the detection heuristics were unreliable with those.
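
The effect described here can be reproduced with Python's standard codecs. This is a minimal sketch using an arbitrary sample word, not the string from the game; the helper names are hypothetical. It uses `latin-1` (iso-8859-1) as the wrong encoding because that codec maps every byte, so the round trip is lossless.

```python
def to_mojibake(text, real="shift_jis", wrong="latin-1"):
    """Encode text in its real encoding, then misread the bytes as another one."""
    return text.encode(real).decode(wrong)


def repair_mojibake(garbled, real="shift_jis", wrong="latin-1"):
    """Undo the misreading: recover the raw bytes and decode them correctly."""
    return garbled.encode(wrong).decode(real)


if __name__ == "__main__":
    original = "テスト"  # arbitrary Japanese sample text
    garbled = to_mojibake(original)
    print(repr(garbled))             # unreadable bytes mixed with Latin letters
    print(repair_mojibake(garbled))  # the original text again
```

If a string only becomes readable after such a round trip, it was most likely left over in Japanese rather than written in the game's main language.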

sorlok commented 3 years ago

Hmm... interesting. In that case, does it make sense for LcfTrans to just skip those strings as well? Or is this something the translator (person) will simply remove when they see it?

gadesx commented 3 years ago

There's another tool that can translate even the database.

RpgRewriter https://m.vk.com/topic-42514073_30426599

Ghabry commented 3 years ago

If you find any common mojibake, I could add a skip for it to the tool, but as fdela said these are just untranslated DB strings, so this is normal. For 2k3 this is usually even worse; iirc even the official English database has these (but I haven't added skipping of terms that are not used in 2k3 yet).

sorlok commented 3 years ago

Ok, I think my general feeling here is that this is just something the translator will have to deal with (mostly by ignoring it or stripping the text, since presumably this text is never used in-game).

@gadesx, thanks for the link, but we're working on doing message translation on the fly.

gadesx commented 3 years ago

RPG Rewriter seems to be used by the Russian community, like DreaMaker. Some friends found this tool for translating a game (FF-Before crisis remake); they're not using RPG Maker. All the scripts are extracted as txt, and reinserting them later can produce broken content. Obviously there's interest in making translations. Maybe the way it edits each file can help.

Ghabry commented 3 years ago

Ah, RPG Rewriter is the tool vgperson uses for translating :). Checking other tools is not a bad idea. I haven't fully looked at this tool yet, but it also has additional features like file renaming. This makes sense for their tool because RPG_RT is not encoding-aware, so the files must be renamed from e.g. Japanese to ASCII. Player does not care, though, so this is not a feature we need here. And I don't want to provide a feature that writes back. It's a use case I personally have no use for (patches welcome). Don't forget that we want to move the users to our "Product" :)

sorlok commented 3 years ago

Writing the patched Translation back to disk is actually kind of interesting from a technical standpoint, since it would require us to fix Messages at patch time (not in Game_Interpreter), which would involve modifying the event command list. Thus, the Player would be better placed to produce a patched game than LcfTrans. Anyway, I prefer the flexibility of translating on the fly with Player, especially for games getting new content, since a 0.2 patch can (generally) be used with 0.3. (And it seems RPG Rewriter already fills the needs of hard patching games.) So I'm calling it out of scope for now.

Ghabry commented 3 years ago

@sorlok Did you find any case by now where this translation generator gives bad output that is hard to translate in Player? (or anything else that is problematic)

sorlok commented 3 years ago

Tool has been great to use so far; very accurate and the breakdown of files makes sense.

One minor discussion point: does it make sense to put the "Choice" options into their own stanzas? Consider the following introduction text:

#. ID 1, Page 1, Line 4, Pos (9,7)
#. Choice starting at line 2 (2 options)
msgid ""
"              Intro überspringen?\n"
"                  Nein\n"
"                  Ja"
msgstr ""
"              Skip the introduction?\n"
"                 No\n"
"                 Yes"

Right now, I have to be careful that my translation doesn't go over 2 lines. What if I want to do this:

msgid ""
"              Intro überspringen?\n"
"                  Nein\n"
"                  Ja"
msgstr ""
"Original translation by RPG Advocate\n"
"Updated for EasyRPG by Someone\n"
"              Skip the introduction?\n"
"                 No\n"
"                 Yes"

Oops, now I have to do complicated logic to figure out how to split these lines. Since I'm rewriting EventCommands, it's actually way easier to just throw in some ShowMessage_2 commands, and deal with the ChoiceOptions on their own. I'd recommend:

#. ID 1, Page 1, Line 4, Pos (9,7)
msgid ""
"              Intro überspringen?"
msgstr ""
"              Skip the introduction?"

#. ID 1, Page 1, Line 5
#. Choice starting at line 2 (2 options)
msgid ""
"                  Nein\n"
"                  Ja"
msgstr ""
"                 No\n"
"                 Yes"

I think this is easier for the translator, too, since they only need to know the RPG Maker logic for breaking up MsgBox and Choices, not that AND my crazy logic for it.

=========

So far only one weird thing; in the following Map, from Aurora's Tear: https://drive.google.com/file/d/1cArjwCIASNPjymWyT3JmFIE2BNi7fYCj/view?usp=sharing

...the following line appears when I run LcfTrans:

#. ID 15, Page 1, Line 5, Pos (17,5)
msgid ""
"\\C[19]\"Kläff, Kläff!!\""

This initially looked weird because it used multi-line syntax but was a single line, so I rewrote it as:

msgid "\\C[19]\"Kläff, Kläff!!\""

However, it wasn't matching in-game. So I did a debug trace, and figured out it needed to be written like so:

msgid "\\C[19]\"Kläff, Kläff!!\"\n"

...before it would translate. I.e., there needed to be a spurious newline at the end.
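
To make the difference concrete, here is a small Python sketch of how a po reader typically joins the quoted pieces of a msgid. It is an illustration only: `unescape` and `parse_msgid` are hypothetical helpers, not my parsing code or LcfTrans, and they handle just the escapes seen in this thread.

```python
def unescape(text):
    """Resolve backslash escapes (\\n, \\t, \\", \\\\) inside one quoted piece."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append({"n": "\n", "t": "\t"}.get(nxt, nxt))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_msgid(block):
    """Concatenate every quoted piece of a msgid block into one string."""
    pieces = []
    for line in block.splitlines():
        start, end = line.find('"'), line.rfind('"')
        if start != -1 and end > start:
            pieces.append(unescape(line[start + 1:end]))
    return "".join(pieces)


if __name__ == "__main__":
    single = r'msgid "\\C[19]\"Kläff, Kläff!!\""'
    with_newline = 'msgid ""\n' + r'"\\C[19]\"Kläff, Kläff!!\"\n"'
    print(repr(parse_msgid(single)))
    print(repr(parse_msgid(with_newline)))
```

The leading `msgid ""` contributes nothing to the joined string, so the multi-line and single-line spellings are equivalent; only the trailing `\n` changes the key that has to match in-game.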

I'm not sure if this is in my parsing code or in LcfTrans, but it might be that LcfTrans discards single lines that are empty. (I didn't think RPG Maker allowed that to be a valid input, but that's a different discussion.)

Can you have a quick look at Map9 and see if it's supposed to be generated with a newline at the end?

Thanks.

Ghabry commented 3 years ago

One minor discussion point: does it make sense to put the "Choice" options into their own stanzas

I already added a hint about choices and where they start in the message box but I see that this is inconvenient from Player PoV. Will change this.


Can you have a quick look at Map9 and see if it's supposed to be generated with a newline at the end

This event has an empty "ShowMessage_2" at the end. I accept this as a bug in lcftrans: the last line is empty, so the output must be

msgid ""
"\\C[19]\"Kläff, Kläff!!\"\n"
""
msgstr ""

or better

msgid ""
"\\C[19]\"Kläff, Kläff!!\"\n"
msgstr ""

Otherwise you need extra logic to match this pattern.
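
The first of the two layouts above (one quoted string per message line, with the last line written out even when it is empty) can be sketched in Python as follows. The helper names are hypothetical and this is not the actual lcftrans C++ code.

```python
def escape_po(text):
    """Escape backslashes and double quotes for use inside a po string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def format_msgid(lines):
    """Render the lines of one message as a po msgid entry.

    A single line is written inline. For several lines, every line but the
    last ends in an escaped newline, and the last line is always emitted,
    so a trailing empty line shows up as "".
    """
    if len(lines) == 1:
        return 'msgid "%s"' % escape_po(lines[0])
    out = ['msgid ""']
    for line in lines[:-1]:
        out.append('"%s\\n"' % escape_po(line))
    out.append('"%s"' % escape_po(lines[-1]))
    return "\n".join(out)


if __name__ == "__main__":
    # A ShowMessage followed by an empty ShowMessage_2, as in the event above.
    print(format_msgid(['\\C[19]"Kläff, Kläff!!"', ""]))
```

Because the empty last line is never dropped, the joined msgid keeps its trailing newline and matches what the Player sees in the event commands.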

sorlok commented 3 years ago

Yeah, I think either option you listed for LcfTrans output of the aberrant message makes sense. Not sure how an empty ShowMessage_2 got in there though (maybe something specific to typing German text in RPG Maker?).

Ghabry commented 3 years ago

You can just press Enter while typing the message to insert new lines. This feature makes sense, otherwise you couldn't do stuff like


Message "Select something"
Messa_2 ""
Choice  "Choice A"
Choice  "Choice B"

sorlok commented 3 years ago

Ah, got it, thanks!

sorlok commented 3 years ago

Just wanted to check in and see what the status is on this.

Ghabry commented 3 years ago

Interestingly the editor doesn't allow an empty line on line 4. Which confirms that this is for aligning num input or choices :)

@sorlok should work now. For some unrelated reason Windows PR build fails though.

Choices are split.

Editor input:

@> Text: A
 :         : B
 :         : 
@> Text: C
 :         : D
@> Text: E
 :         : 
 :         : 
@> Text: 
 :         : 
 :         : F
@> Text: 
 :         : 
 :         : 
@> Text: XXX
@> Text: X
 :         : Y
 :         : Z

Tool output:

#. ID 2, Page 1, Line 1, Pos (2,5)
msgid ""
"A\n"
"B\n"
""
msgstr ""

#. ID 2, Page 1, Line 4, Pos (2,5)
msgid ""
"C\n"
"D"
msgstr ""

#. ID 2, Page 1, Line 6, Pos (2,5)
msgid ""
"E\n"
"\n"
""
msgstr ""

#. ID 2, Page 1, Line 9, Pos (2,5)
msgid ""
"\n"
"\n"
"F"
msgstr ""

#. ID 2, Page 1, Line 15, Pos (2,5)
msgid "XXX"
msgstr ""

#. ID 2, Page 1, Line 16, Pos (2,5)
msgid ""
"X\n"
"Y\n"
"Z"
msgstr ""

Ghabry commented 3 years ago

Imo @sorlok should have the final word here. Once you agree that the format is fine, this can be merged :)

sorlok commented 3 years ago

Thanks @Ghabry. The sample text you provided looks good.

I'm running through and updating the Aurora's Tear translation and checking, just to make sure nothing else stands out. Once that's done I'll let you know!

sorlok commented 3 years ago

It looks good. I actually used the "-u" flag on my current Aurora's Tear translation, and it showed me exactly what changed. This can be merged, as far as I'm concerned.

Ghabry commented 3 years ago

Thanks for also independently testing the "-u" flag; I had only ever tested it with artificial translations.