Tuxemon / Tuxemon

Open source monster-fighting RPG.
GNU General Public License v3.0

Split translation PO files into multiple files within the same translation #600

Open MirceaKitsune opened 4 years ago

MirceaKitsune commented 4 years ago

I believe the way messages and translations are handled is rather limiting, because everything lives in a single file and translations are implemented in multiple formats. At the moment all strings are stored in one file, tuxemon/resources/l18n/*/LC_MESSAGES/base.po, including descriptions of builtin commands, monsters, items and NPCs, as well as conversations specific to the story.

This will particularly be a problem for custom games, which need to copy every translation and append their own custom monsters and conversations at the end. If they don't use those files, it won't be possible to use translated_dialog to implement conversations efficiently. They can't remove the builtins either: they're overriding the entire file, so this would probably cause crashes.

Another issue I'm seeing with translations is that they appear to be implemented twice in different ways: there's also tuxemon/resources/db/locale/*.json, which seems to do the same thing, only in json rather than the format used by the po files. It's unclear which of the two is actually required for custom dialogue to work, or whether everyone needs to create both a json and a po.

I'd suggest having the locale function scan for and read all *.po files it finds inside LC_MESSAGES instead of a single base.po containing everything, then splitting all groups into a few different files. Perhaps we can also get rid of the seemingly needless LC_MESSAGES subdirectory in the process? We should also probably choose whether to use tuxemon/resources/db/locale or tuxemon/resources/l18n; having both feels like duplicated functionality unless I'm missing something.
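As a rough illustration of the "scan for all *.po files" idea, here is a minimal sketch. The function name find_po_files and its parameters are hypothetical and not part of Tuxemon; it only assumes the <root>/<language>/LC_MESSAGES layout described above.

```python
from pathlib import Path


def find_po_files(l18n_root, lang):
    """Return every .po file for one language.

    Hypothetical helper: assumes the layout <l18n_root>/<lang>/LC_MESSAGES/*.po.
    The result is sorted by path so the load order is stable between runs.
    """
    messages_dir = Path(l18n_root) / lang / "LC_MESSAGES"
    return sorted(messages_dir.glob("*.po"))
```

A caller would then parse or compile each returned file in turn instead of opening base.po directly.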

bitcraft commented 4 years ago

the new translation system was implemented by AndyMender, and i'm personally not very familiar with it. there are a few things to consider though. i agree that the "all in a single file" approach is not ideal. the "po" files are more like source code. gettext will read and compile them into "mo" files when the game is started. the "mo" files are what gettext uses for translation. I think LC_MESSAGES is just a part of the gettext system and i don't know if it can be removed or changed. i'd probably feel best if it was left alone for now.
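For reference on the LC_MESSAGES question: Python's standard gettext module looks for compiled catalogs at <localedir>/<language>/LC_MESSAGES/<domain>.mo, so the subdirectory is part of the lookup path. A small sketch using gettext.find shows this; the wrapper name locate_catalog and the "base" default domain are assumptions for illustration, and this says nothing about how Tuxemon itself loads catalogs.

```python
import gettext


def locate_catalog(localedir, lang, domain="base"):
    """Return the path of the .mo file gettext would load, or None.

    gettext.find only checks whether
    <localedir>/<lang>/LC_MESSAGES/<domain>.mo exists; it does not parse it.
    """
    return gettext.find(domain, localedir=str(localedir), languages=[lang])
```

A .mo file placed directly under <localedir>/<language>/, without the LC_MESSAGES level, is not found by this lookup.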

the json stuff is obsolete. it's just kept around for now in case somebody wants to reference old translations. not all the existing translations were migrated to gettext/po files, so that folder will stick around until all translations are updated.

right now, i don't know if there is a huge benefit to splitting the translations. consider that for each language we support, we would need to maintain every one of those files. i feel like the burden of managing all the files, the code to aggregate them, to check if all the messages are present, etc. outweighs any advantage of splitting them [right now].

MirceaKitsune commented 4 years ago

I see, thanks for clarifying. I wouldn't split it into too many files, but would do at least 3 categories at this point: one for internal translations of base features (eg: main menu stuff), one for the names and descriptions of things defined by databases (NPCs, monsters, techniques, items), and one to store only dialog content (sign text and NPC conversations). I feel this would be a little easier to handle, and not too hard to maintain since it's just a few categories and you easily know which one to go to.

The json ones I'd perhaps move to a deprecated subdirectory, to make that more clear for others who could be confused by what they mean.

bitcraft commented 4 years ago

> The json ones I'd perhaps move to a deprecated subdirectory, to make that more clear for others who could be confused by what they mean.

sure. we could make a new top-level folder called "deprecated" and put the json translations there. new, separate pr for that move please.

i'm still not convinced that we need to split the locale files. the idea that i as a translator would have to break down translations into arbitrary groups seems like a headache. consider the word "OK". does that go with the menus? should it go in dialogs? having several files means it's more likely to have a name collision with the labels, ie defining the same string in multiple places. too much cognitive load for little benefit. with a simple flat file it is easy to find strings, copy/paste, search/replace, and edit. arbitrary labeling of data files is an anti-pattern, and one i want to avoid.

there is an issue with locale right now anyway that blocks distribution of tuxemon on some platforms, and i really don't want to have two issues working on the same system at once. let's get the "building translations on game startup breaks the game when it is installed by a package manager" issue fixed, then revisit this.

i think you are missing a better alternative, which is simply being able to load many translations at once, with each translation searched in a specific order. this would solve the issue of mods defining their own strings. i support this idea, but not the idea that we need to split translation files into "menu items", "monsters", etc.
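A minimal sketch of "search each translation in a specific order", using the fallback mechanism of Python's standard gettext module (NullTranslations.add_fallback). The DictTranslations class, the chain helper and the message ids below are hypothetical stand-ins for real compiled catalogs, and this is not how Tuxemon currently loads translations.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a compiled .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = dict(messages)

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # NullTranslations.gettext asks the fallback catalog if one is set,
        # otherwise it returns the message id unchanged.
        return super().gettext(message)


def chain(first, *rest):
    """Link catalogs so lookups try `first`, then each of `rest` in order."""
    for catalog in rest:
        first.add_fallback(catalog)
    return first


# Illustrative message ids only: a mod catalog searched before the base one.
base = DictTranslations({"menu_ok": "OK", "monster_name": "Base monster"})
mod = DictTranslations({"monster_name": "Mod monster", "mod_dialog": "Hello"})
lookup = chain(mod, base)
```

With this ordering a mod only has to ship its own strings: anything it does not define falls through to the base catalog, and it can still override a base string by redefining the same id.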

let's not get caught up on this detail and focus on features that benefit the game.

MirceaKitsune commented 4 years ago

I see your point. My main issue was the thought that whenever someone makes a custom game with its own dialogue, they'll have to copy the entire translation file including builtin engine translations... if a new one is added by the code, they'll need to manually copy-paste those changes into their version. Such a large file is also a little harder to go through and find things, although having comments indicating where a category stops and another starts helps with that a bit.

Perhaps this is indeed something that shouldn't be messed with for now. Maybe later we can code just the ability to load all translation files in the directory, but keep the default ones unchanged. That way, if a game developer wishes to use this model they can, but the normal definitions stay the same.

bitcraft commented 4 years ago

I'll open a new issue related to mods which allows multiple translations to be loaded. This will be different from "splitting a translation into multiple files organized by category".

bitcraft commented 4 years ago

Discussion for "translation mod support": #602

xirsoi commented 4 years ago

I think that, if the engine is capable of (and willing to) load arbitrary locale files, it will solve Mircea's issues. Other game projects could then organize their localizations in whatever arbitrary way suits their needs.

I don't know if that's already the case, or if that will be addressed by #602, but I think it'll do what they want.