L-Sherry / Localize-me

CCLoader mod to add locales
MIT License

[Idea] Somehow fallback to English strings on missing keys in lang/sc/*.en_US.json in default non-English locales. #7

Closed dmitmel closed 4 years ago

dmitmel commented 4 years ago

(I hope GitHub will let me fit all of this into a single message.)

The discussion starts here on the Discord CrossCode modding server; I'm copying my messages here because you don't have reliable access to Discord.

I should probably explain what I meant by introducing better localization facilities \ firstly, the context \ CC in its current state is a bad example of how you localize software \ CC has two localization systems in place \ one correct, and the other one incorrect and unnecessary, Stefan/R.D. even said that they won't be using it in the next game IIRC \ oh, and before I get into the details \ it should be noted that if not for bitmap fonts + Asian languages, it is easy to make language switching work on the fly \ instead of strings for text labels in this case you store localized strings in memory \ or references/unique identifiers to localized strings \ which are handled by text label GUI widgets in such a way that either the string is picked for the current language (first approach), or they are loaded from the disk and accessed as keys in a hashtable (second approach) \ oh, and "localized string" in this context is a table of locale names to translations \ e.g.

{
  "en_US": "Sandwich",
  "de_DE": "Sandwich",
  "fr_FR": "undefined",
  "zh_CN": "三明治",
  "ja_JP": "サンドイッチ",
  "ko_KR": "샌드위치",
  "langUid": 3
}
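A minimal sketch of how a text widget could pick a string out of such a table, under the assumption that English is the fallback. `resolveLangLabel` and `DEFAULT_LOCALE` are hypothetical names made up for illustration, not something from CC's codebase.

```javascript
// Hypothetical helper, not part of CrossCode: pick the string for a locale
// from a localized-string table, falling back to English.
const DEFAULT_LOCALE = "en_US"; // assumption: English is the fallback

function resolveLangLabel(label, locale) {
  // Only non-empty strings count as translations (this skips langUid).
  const hasText = (key) => typeof label[key] === "string" && label[key] !== "";
  if (hasText(locale)) return label[locale];
  if (hasText(DEFAULT_LOCALE)) return label[DEFAULT_LOCALE];
  return null; // nothing usable in the table
}

const sandwich = {
  en_US: "Sandwich",
  de_DE: "Sandwich",
  zh_CN: "三明治",
  ja_JP: "サンドイッチ",
  ko_KR: "샌드위치",
  langUid: 3,
};

console.log(resolveLangLabel(sandwich, "ja_JP")); // translation present
console.log(resolveLangLabel(sandwich, "ru_RU")); // no ru_RU key, so en_US is used
```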

forget about langUid, it is unused in practice in CC's codebase \ so, returning to localization systems in CC \ the correct one stores translations in assets/data/lang/sc/<name>.<locale>.json \ each subsystem has its own JSON per every language

gimmick.de_DE.json
gimmick.en_US.json
gimmick.ja_JP.json
gimmick.ko_KR.json
gimmick.zh_CN.json
gui.de_DE.json
gui.en_US.json
gui.ja_JP.json
gui.ko_KR.json
gui.zh_CN.json
map-content.de_DE.json
map-content.en_US.json
map-content.ja_JP.json
map-content.ko_KR.json
map-content.zh_CN.json

translation strings are literally the only things these files contain \ in the game they can be accessed with ig.lang.get("sc.gui.path.to.a.variable") \ or ig.lang.labels.sc.gui.path.to.a.variable \ sadly, this is used only for GUI labels \ and not for everything by the way \ the "round" label when PvP starts e.g. is just a string literal in the source code \ so while it is correct in the cases of English and German, in Asian languages it still stays as "round" \ oh, and Satcher and I refer to these files as "lang files" \ the second localization system CC uses is embedding localized strings into event steps in map files \ we call this "lang labels" \ this is one of the major tasks of localize me, finding and adding locale keys to lang labels with translations received from translation packs \ pain in the ass to deal with and distribute translations for \ and this wasn't necessary because they have their map and event steps editors \ which can handle extracting localized strings to a separate file and keeping references to them internally \ so, how does all of this apply in the context of modding? \ firstly, localize me is ready to deal with the embedding approach with minor tweaks in the URL resolution code \ so mods which add custom maps and dialogue (looking mostly at Eisus) can have it translated into a few locales internally, \ and localize me-based mods (cc-ru and french-cc) can provide translation packs for missing locales \ although in our context we can just open a PR against a target mod \ the problems come from the first approach \ i.e. lang files \ so you see, let's say I'm writing the jetpack mod \ and I create an object-merge patch file named assets/lang/sc/gui.en_US.json in the mod's directory

{
  "labels": {
    "options": {
      "controls": {
        "keys": {
          "jump": "Jump"
        }
      },
      "headers": {
        "jetpack": "Jetpack"
      }
    }
  }
}
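Roughly what "object-merge" means here, as a sketch: the patch tree is merged recursively into the stock tree, so new keys are added and existing ones stay. `mergeInto` is a hypothetical stand-in (the modloader's real merge logic may differ), and the contents of `stockGui` are invented for the demo.

```javascript
// Hypothetical deep merge; the modloader's real object-merge logic may differ.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively copies `patch` into `target`, mutating and returning `target`.
// Non-object values (strings, arrays) from the patch simply replace the old value.
function mergeInto(target, patch) {
  for (const [key, value] of Object.entries(patch)) {
    if (isPlainObject(value) && isPlainObject(target[key])) {
      mergeInto(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// Tiny stand-in for the stock gui.en_US.json (contents invented for the demo).
const stockGui = {
  labels: {
    options: {
      controls: { keys: { confirm: "Confirm" } },
      headers: { general: "General" },
    },
  },
};

// The jetpack patch from above.
const jetpackPatch = {
  labels: {
    options: {
      controls: { keys: { jump: "Jump" } },
      headers: { jetpack: "Jetpack" },
    },
  },
};

mergeInto(stockGui, jetpackPatch);
console.log(stockGui.labels.options.controls.keys);
console.log(stockGui.labels.options.headers);
```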

I launch the game and this will just work because this merges my modded labels with the default ones \ in the locations options-related code refers to \ let's say I was playing with English to test this \ then I switch to Russian \ localize me coincidentally runs after all patches are applied, so it will read the entire language file plus my object patches \ then it will take a look into translation packs while patching this whole tree of gui.en_US.json, see that there are no translations for these exact keys and move on \ and then suppose Eisus installs the jetpack mod \ it is known that Eisus plays with Korean \ however, here's where the problem comes from \ localize me will always load gui.en_US.json \ because it works by loading English lang files and patching them by design \ however, the stock locales don't work in this way \ all lang files have the equivalent structure \ with an equivalent number of object keys and array elements \ so when using the jetpack mod on a non-English and non-modded locale it will just display "UNKOWN LABEL" strings in the settings \ as such, I have two potential solutions for this problem

1. patch the game in a way that when ig.currentLang is not English, English lang files should be loaded and non-English ones should be merged over them \ that way untranslated labels remain in English \ also: English is considered the default language by CC's code \ and localize me will deprecate support for using non-English languages as the source for translations in 1.x \ it should also be noted that some analysis of ig.Lang and localize me is needed to make the merging condition work reliably, but that's not a problem

the second solution:

2. load modded GUI localizations separately, then merge them into ig.lang.labels on load \ well, I don't really like this \ because under the hood it is the same as the first approach, but needlessly complicated

oh, and let me present the current crappy solutions

3. modify ig.lang.labels in main/poststart \ can't be handled by localize-me, required hacks in cc-ru to support

4. duplicate the gui.en_US.json file for other locales \ the problem with this is that the author can forget to update all of the locales

5. make symlinks to English lang files for each locale \ this doesn't support partial translations and will work reasonably well only in small mods (such as jetpack)

To solve the issue at hand I propose loading lang/sc/*.en_US.json for each corresponding lang/sc/*.<BROKEN_LOCALE>.json, first copying contents of the English file into ig.Lang's labels field, then merging the labels of the BROKEN_LOCALE over that. English, of course, is assumed to be the default language here as ig.LangLabel does so as well. AFAIK BROKEN_LOCALE can refer only to de_DE, ko_KR, ja_JP and zh_CN because locales added with Localize Me implicitly support the needed behavior due to the way Localize Me works (especially after the 1.0 release when you'll be able to translate only from English). What do you think about this in general and in terms of how to implement this? Also: is it possible to add the builtin locales through Localize Me-based mods? Though I feel this shouldn't be a problem in this case.
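To illustrate the proposal, here is a sketch under stated assumptions: the English tree serves as the base, the labels of the broken locale are merged over it, and a dotted-path lookup then finds either the translation or the English string. `mergeOver` and `getLabel` are hypothetical helpers, and the label contents are invented; this is not how ig.Lang is actually implemented.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Builds a merged tree: every key of `fallback`, overridden by `primary` where
// `primary` has a value. Subtrees present on only one side are reused, not cloned.
function mergeOver(fallback, primary) {
  if (!isPlainObject(fallback) || !isPlainObject(primary)) {
    return primary === undefined ? fallback : primary;
  }
  const result = {};
  for (const key of new Set([...Object.keys(fallback), ...Object.keys(primary)])) {
    result[key] = mergeOver(fallback[key], primary[key]);
  }
  return result;
}

// Walks a dotted path such as "labels.options.headers.jetpack".
function getLabel(tree, path) {
  let node = tree;
  for (const part of path.split(".")) {
    if (!isPlainObject(node) || !Object.prototype.hasOwnProperty.call(node, part)) {
      return undefined;
    }
    node = node[part];
  }
  return node;
}

// English file after a mod patch was applied; Korean file left untouched by the mod.
const english = { labels: { options: { headers: { general: "General", jetpack: "Jetpack" } } } };
const korean = { labels: { options: { headers: { general: "일반" } } } };

const merged = mergeOver(english, korean);
console.log(getLabel(merged, "labels.options.headers.general")); // Korean translation
console.log(getLabel(merged, "labels.options.headers.jetpack")); // English fallback
```

Building a new tree instead of mutating the English one keeps the English data reusable if the same file is needed as the base for another locale.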

L-Sherry commented 4 years ago

you don't have reliable access to Discord.

These days, you can say i don't have access at all.

and localize me will deprecate support for using non-English languages as the source for translations in 1.x

i didn't plan to do that. I only stopped relying on from_locale to find langlabels, so all langlabels in the game gotta have an en_US or a langUid, or localize-me won't find them. Forcing from_locale to en_US doesn't help much. Defaulting it could help, though.

And otherwise, i don't get it. Localize-Me basically attempts to disable itself as soon as it detects a non-modded locale, and runs away. And i don't see any reason why Eisus would have localize-me installed or have anything depend on it. The one thing that happened since the 0.5 refactor is that putting stuff inside sc.LANG_DETAILS.en_US.map_file will make localize-me use it.

To me, the thing that patches the langfile is in the best position to handle it. The modloader already knows about every patch that touches langfiles and could at least tell it to some mod that nobody will install anyway. So i'm a bit undecided...

dmitmel commented 4 years ago

You misunderstood me. This isn't meant to be a part of Localize Me, I'm asking you for ideas on how to implement such behavior in a general injection which will be included in the CCLoader directly.

Edit: actually, I forgot to specify this, but whatever.

dmitmel commented 4 years ago

Err, not really for ideas (I know how this can be theoretically implemented, but haven't tried out doing that yet), but for suggestions on how to implement this in a way that wouldn't conflict with Localize Me. Anyway, my current plan is to just use the new and shiny scripted JSON patching system (which was added two days ago) and when a request to any of the lang files of built-in non-English locales is intercepted, fire a request to a corresponding English lang file and merge the received data on top of the English data.
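A rough sketch of that plan, with loud assumptions: `loadJson` is a hypothetical callback (path in, promise of parsed JSON out) standing in for whatever the patching hook really provides, and `englishCounterpart`, `mergeOver` and `loadLangFileWithFallback` are made-up names. The only thing taken from the thread is the `<name>.<locale>.json` naming and the idea of merging the requested file over the English one.

```javascript
// Assumption: lang files are named <name>.<locale>.json, as in the listing above.
const LANG_FILE_RE = /^(.*\/)?([^/]+)\.([a-z]{2}_[A-Z]{2})\.json$/;

// Maps e.g. "lang/sc/gui.de_DE.json" to "lang/sc/gui.en_US.json".
// Returns null for English files and for paths that are not lang files.
function englishCounterpart(path) {
  const match = LANG_FILE_RE.exec(path);
  if (match === null || match[3] === "en_US") return null;
  return `${match[1] || ""}${match[2]}.en_US.json`;
}

function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Every key of `fallback`, overridden by `primary` where `primary` has a value.
function mergeOver(fallback, primary) {
  if (!isPlainObject(fallback) || !isPlainObject(primary)) {
    return primary === undefined ? fallback : primary;
  }
  const result = {};
  for (const key of new Set([...Object.keys(fallback), ...Object.keys(primary)])) {
    result[key] = mergeOver(fallback[key], primary[key]);
  }
  return result;
}

// `loadJson` is hypothetical: (path) => Promise resolving to parsed JSON.
async function loadLangFileWithFallback(path, loadJson) {
  const englishPath = englishCounterpart(path);
  if (englishPath === null) return loadJson(path);
  const [english, localized] = await Promise.all([loadJson(englishPath), loadJson(path)]);
  return mergeOver(english, localized);
}

// Demo with an in-memory "file system" (contents invented for illustration).
const files = {
  "gui.en_US.json": { labels: { headers: { general: "General", jetpack: "Jetpack" } } },
  "gui.de_DE.json": { labels: { headers: { general: "Allgemein" } } },
};
const loadFromMemory = async (path) => files[path];

loadLangFileWithFallback("gui.de_DE.json", loadFromMemory).then((merged) => {
  console.log(merged.labels.headers);
});
```

This version treats every non-English locale code in the file name as needing the fallback; restricting it to the built-in locales would need an extra check.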

L-Sherry commented 4 years ago

If this is not a localize-me problem then why is this a localize-me issue :p

oh, and Satcher and I refer to these files as "lang files"

well, their doctype is literally STATIC-LANG-FILE, no getting around that.

so while it is correct in the cases of English and German, in Asian languages it still stays as "round"

I'm not a native German speaker, but "round" does not seem correct, and Wikipedia agrees: the translation might be something like Runde. It's correct in French, though. That's something i should implement in localize-me, when i get the time. But hey, considering that the German translation does not even translate combat arts...

And for the issue at hand, well, since the modloader knows about all patches, i suggest including patches for the wrong langfile if there is no patch for the right one. That's maybe your option 2, if i understand correctly.

Like if i'm running in de_DE, and there is this mod with a gui.en_US.json.patch but no gui.de_DE.json.patch, then apply the one for en_US instead. And let's not limit it to en_US; If there is a great mod with a gui.ko_KR.json.patch and a gui.zh_CN.json.patch, then pick one of them.
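Something like the following sketch, maybe. `pickPatchLocale` and `patchFileName` are hypothetical names, and trying en_US before the other locales is my assumption about the order; the modloader does not necessarily expose its patch list in this shape.

```javascript
// Hypothetical: `availableLocales` lists the locales for which a mod ships a
// patch of one particular lang file (e.g. gui).
function pickPatchLocale(availableLocales, currentLocale, preferred = ["en_US"]) {
  // A patch for the running locale always wins.
  if (availableLocales.includes(currentLocale)) return currentLocale;
  // Otherwise try the preferred fallbacks in order (assumption: en_US first).
  for (const locale of preferred) {
    if (availableLocales.includes(locale)) return locale;
  }
  // Not limited to en_US: take whatever the mod ships, if anything.
  return availableLocales.length > 0 ? availableLocales[0] : null;
}

// Patch files follow the <name>.<locale>.json.patch pattern used above.
function patchFileName(name, locale) {
  return `${name}.${locale}.json.patch`;
}

const chosen = pickPatchLocale(["ko_KR", "zh_CN"], "de_DE");
console.log(chosen === null ? "no patch to apply" : patchFileName("gui", chosen));
```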

It won't allow mods to have their own ru_RU translation, though, because localize-me won't even attempt to load it by design.

AFAIK BROKEN_LOCALE can refer only to de_DE, ko_KR, ja_JP and zh_CN

Hardcoding locales is bad. What if RFG adds a new language ? Because i suspect that a new language is coming very soon. I mean, just look at this and peek at that directory list view...

dmitmel commented 4 years ago

If this is not a localize-me problem then why is this a localize-me issue :p

Because I prefer GitHub issues over email ¯\_(ツ)_/¯. After all, the ticket is labeled "idea".

Hardcoding locales is bad. What if RFG adds a new language ?

The alternative is to iterate over the keys of ig.LANG_DETAILS before Localize Me adds its own locale; this is easily doable. However, the point of that question is more about whether a broken locale can be added through Localize Me, which seems to me not to be the case, so I asked you directly in Localize Me's bugtracker.
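A sketch of that alternative, assuming the snapshot can be taken early enough. `LANG_DETAILS` below is a plain stand-in object rather than the real ig.LANG_DETAILS (whose entries carry more fields), and `needsEnglishFallback` is a hypothetical name.

```javascript
// Stand-in for the game's locale table; real entries carry more fields.
const LANG_DETAILS = { en_US: {}, de_DE: {}, zh_CN: {}, ja_JP: {}, ko_KR: {} };

// Snapshot taken before any mod registers extra locales, so it lists only the
// stock ones. Object.keys returns a new array, so later additions don't leak in.
const stockLocales = Object.keys(LANG_DETAILS);

// Only stock non-English locales need the English lang files merged underneath.
function needsEnglishFallback(locale) {
  return locale !== "en_US" && stockLocales.includes(locale);
}

// A mod adds its own locale later; the snapshot is unaffected.
LANG_DETAILS.ru_RU = {};

console.log(needsEnglishFallback("de_DE"));
console.log(needsEnglishFallback("ru_RU"));
```

No locale name is hardcoded this way, so a language added to the stock game would be picked up automatically.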

dmitmel commented 4 years ago

Finished implementing this in CCLoader; as such, I'm closing this ticket. Thanks for the help!