Closed genjonakasone closed 8 months ago
I should correct myself - Windows 1250 supports Polish symbols but not in True Stalker/OpenXRay.
The "encoding=" entry at the beginning of the xml file does not matter; symbols are displayed the same way whether it says windows-1250, 1251, ISO 8859-2 etc.
Saving the file with Windows 1250 encoding in Visual Studio Code keeps the Polish-specific letters, for example "ROZDZIELCZOŚĆ", as it should, but in game it turns into "ROZDZIELCZOĆ"; the same happens to the letters "Ł" and "Ż". I've seen this thread in the OpenXRay repo about UTF-8 and other encodings, but it does not reach any consensus.
Now it makes me think, is it possible that True Stalker is simply missing a font that has all Polish letters in it? Any recommendations on how to approach finding the original font, checking whether it's okay or not and eventually how to replace it? It's my first time working with OpenXRay.
I'm open for any advice.
Ok, I figured it out: the default font does not support some of the Polish diacritics. As a band-aid solution I dropped the Stalker Gamma fonts into True Stalker's gamedata and edited fonts.ltx, changing the Roboto fonts to ones from Gamma (in my case it's the Letter font, ui_font_letter_XX_XXXX.cent.dds). I will keep translating the files and make a PR with just the xmls; regarding fonts, I will leave it up to you.
Wow, that is great news! If there's something missing in the default fonts, let's add the font from Gamma for now (if it helps).
In the meantime I've been exploring the options regarding the organisation of the repository and came to a conclusion that storing source files in unicode is the cleanest possible way (which additionally benefits the contributors who prefer Web IDE, since it very much likes to screw with the non-UTF-8 encodings...)
Since, however, OpenXRay does not yet support it, I've designed a super basic converter that would prepare files respecting the needed encoding and place them into "releases".
Feel free to give it a try: just put your sources into gamedata_UTF-8
and in a few moments you can collect the 7z archive language pack in Polish with correct encoding and all the prefixes preconfigured.
Also pardon me for organising all that a little chaotically. Lots of ideas on how to improve stuff, not everything was ready from the start. Some essential bits, such as validating XML files before packing them, still need to be implemented.
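The validation step mentioned above could start as a plain well-formedness check. This is only a sketch: `xml_problem` is a hypothetical helper, not something from the repository, and it assumes the source files are ordinary XML. Any engine-specific constructs in the real files would need separate handling.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_problem(data: bytes) -> Optional[str]:
    """Return a parser error message, or None if the bytes are well-formed XML.

    Hypothetical helper: it only checks well-formedness, not whether
    the game engine will accept the file.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    good = '<?xml version="1.0" encoding="UTF-8"?><string_table/>'.encode("utf-8")
    bad = b"<string_table><string></string_table>"
    print(xml_problem(good))  # None
    print(xml_problem(bad))   # parser message with line and column
```

Running such a check before packing would at least catch unclosed tags early, before they end up in a release archive.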
After a few hours more of playing with Gamma fonts it still looks like there is something wrong, not sure if it's about wrong size of the DDS file or True Stalker reads the symbols in a different manner. I've heard OGSR community made a font generator to support other languages, here's a link for it, unfortunately I do not grasp the idea of how to properly use it - https://github.com/OGSR/Fonts_generator
I also found exported True Stalker's XML files on the C-Consciousness Discord server. They seem to be encoded in UTF-8 by default, with the encoding="UTF-8" parameter in the XML, and they work, so now I am not sure whether they really have to be transcoded into windows-125X. When I did it manually the game would not start; with these it just works. The archive includes the game's fonts too, in case it would be possible to convert them to support more symbols than they do now.
Thanks. Will take a look at those later.
I received fixed font files from the True Stalker's dev, attached them below. These will be added to the mod in the next patch, but its release date is unknown. I had to change the files' encoding and the encoding in the xml headers to ISO 8859-2; now all Polish diacritics are shown as they should be. Windows-125X does not display them at all, and neither does UTF-8, which makes me wonder, since these encodings are supposed to support the Polish language. TS_font_update.zip
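A small check may help with the puzzle above. Windows-1250 and ISO 8859-2 both contain every Polish letter, but they place six of them at different byte values, so a font that looks glyphs up by byte position can be correct under one encoding and wrong under the other. Whether the fixed fonts really work that way is an assumption, not something confirmed in this thread; the sketch below only lists which letters differ.

```python
# Polish letters outside ASCII, upper and lower case.
POLISH = "ĄĆĘŁŃÓŚŹŻąćęłńóśźż"


def differing_letters(a: str = "cp1250", b: str = "iso8859_2") -> dict:
    """Map each Polish letter to (byte in a, byte in b) where the two differ."""
    out = {}
    for ch in POLISH:
        x, y = ch.encode(a)[0], ch.encode(b)[0]
        if x != y:
            out[ch] = (x, y)
    return out


if __name__ == "__main__":
    for ch, (x, y) in differing_letters().items():
        print(f"{ch}: windows-1250=0x{x:02X}  iso-8859-2=0x{y:02X}")
```

The letters that differ are Ą, Ś, Ź and their lowercase forms; the rest (Ć, Ę, Ł, Ń, Ó, Ż) share the same byte in both code pages.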
Interesting... So, it appears, whatever was in xml files' declaration string was wrong initially. Prefixes and all this mess. I hope, OpenXRay devs are able to figure out the unicode way of doing things soon
@genjonakasone, used some default strings from Call of Pripyat and compiled a Polish language pack demo with the fonts you provided:
Sources in testing branch: https://github.com/true-community/true-localisation/tree/testing/gamedata_UTF-8/configs/text/pol (you will see a lot of placeholders in game, since only default CoP strings are translated)
Final archive of gamedata: https://github.com/true-community/true-artifacts/tree/releases/localisation/testing
Does it look ok?
The second option from the top, "Ustawienia jakoci", should be "Ustawienia jakości"; it's missing the letter "ś". Not sure if that's the font's fault or the encoding. Mine with ISO-8859-2 does not seem to escape or lose any Polish diacritics, albeit I haven't tried it with the CoP font.
The first option in the selector should say "Pełne ośw. dynamiczne", so it's not displaying the characters either. When I open the ui_st_mm.xml file with Notepad++, it defaults to the header encoding (so windows-1251) and it looks like this: So I'd say that, at least for Polish, they should be converted to ISO-8859-2 (because the diacritics don't turn into "shrubs") if there's no UTF-8 support.
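The "shrubs" effect can be reproduced without the game: the same bytes are simply read under a different code page. The sketch below assumes the file was written as windows-1250 and opened as windows-1251; both are parameters of the hypothetical `view_as` helper, so other combinations can be tried too.

```python
def view_as(text: str, saved_as: str, opened_as: str) -> str:
    """Show how text saved in one 8-bit encoding looks when opened in another."""
    return text.encode(saved_as).decode(opened_as, errors="replace")


if __name__ == "__main__":
    original = "Pełne ośw. dynamiczne"
    # Assumption: bytes written as windows-1250, viewer trusting a windows-1251 header.
    garbled = view_as(original, "cp1250", "cp1251")
    print(garbled)
    # The bytes themselves are intact, so reading them back correctly recovers the text.
    print(garbled.encode("cp1251").decode("cp1250"))
```

ASCII letters survive unchanged because both code pages agree below 0x80; only the diacritics are replaced by whatever the other code page has at the same byte.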
Aaah, okay, I did not change the encoding in https://github.com/true-community/true-localisation/blob/staging/language_selector.json#L26 and the output was in 1250 again. Let's try changing it this time and see if result is any different
Nope, with ISO-8859-2 the game does not even launch
What if you change the file format from Windows CRLF to Unix LF? Does your game run if you replace the ui_st_mm.xml file with the one I have? Maybe not all files can be displayed in ISO-8859-2 because of some engine's quirk. Some files open as UTF-8, some as UTF-8-BOM by default for me. ui_st_mm.zip
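To compare a file that works with one that does not, it can help to look at the raw bytes rather than at what an editor reports. The following is a rough sketch with a hypothetical `inspect_bytes` helper; it reports a UTF-8 BOM, the encoding named in the XML declaration, and the line-ending style, and makes no claim about what the engine actually requires.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"


def inspect_bytes(data: bytes) -> dict:
    """Report BOM presence, declared XML encoding, and line-ending counts."""
    has_bom = data.startswith(UTF8_BOM)
    body = data[len(UTF8_BOM):] if has_bom else data
    # Look for encoding="..." inside a leading <?xml ... ?> declaration.
    m = re.match(rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    declared = m.group(1).decode("ascii") if m else None
    crlf = body.count(b"\r\n")
    lf_only = body.count(b"\n") - crlf
    return {"bom": has_bom, "declared": declared, "crlf": crlf, "lf": lf_only}


if __name__ == "__main__":
    sample = UTF8_BOM + b'<?xml version="1.0" encoding="UTF-8"?>\r\n<string_table/>\n'
    print(inspect_bytes(sample))
```

To inspect a real file, read it in binary mode and pass the bytes in, for example `inspect_bytes(open(path, "rb").read())`.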
What if you change the file format from Windows CRLF to Unix LF
Highly doubt that would change much, but we could try that of course.
... or better yet, you can try it yourself :) I've given you access to repository so go ahead and make necessary changes in the testing branch.
change the encoding in https://github.com/true-community/true-localisation/blob/staging/language_selector.json#L26
This file configures the encoding to convert sources into, and has prefixes, as you probably noticed.
And I suggest you use VS Code for heavier editing, as Notepad++ may be somewhat basic as an IDE (not sure about now, but last time I checked it was).
Or, if you do not want to wait for pipeline to compile stuff, you could just edit the final gamedata and make a note on success here.
I need to call it a day. Need to have some rest before workday
Also the original files from CoP are already present in True Stalker's resources and, weirdly, those have west prefix O_o
So my only assumption here is that font is to blame. Didn't have a chance to try the font compiler you suggested, maybe that would solve all issues.
But, off the record, my brutally honest thought would be some unbroken things should remain unfixed. The game wouldn't turn out any worse with original fonts. I actually would never have noticed they were replaced at all, but here we are looking for a fix for the fix 😅
I tried multiple encodings on the test files from the polski.7z archive you uploaded. I have written the sentences how they should be displayed with red font for comparison with what True Stalker displays. Here are the results:
windows-1251
windows-1250
iso-8859-2
So the last one is the only one that gets the symbols correct. Not sure why the game wouldn't start with ISO on your PC, unless the headers were set wrong or maybe the compatibility is added with one of the four patches that were released so far. I've seen Czech translation posted on ModDB, it's also encoded in iso-8859-2 and it also worked for me, albeit they used the eng folder instead of custom "ces" or however it would be called. Chinese translation is encoded in UTF-8, but they used the compiler, and apparently the engine "will switch to unicode mode when texts are saved in UTF-8 and there are multi-byte fonts available".
And it is font's issue indeed. I do not know how the DDS letter-finding system works and despite me trying I couldn't get the compiler to work.
the engine "will switch to unicode mode when texts are saved in UTF-8 and there are multi-byte fonts available"
If that is so, getting the correct font is a solution to all the problems and prefixes would not even be needed anymore!
@genjonakasone, now it builds correctly UTF-8 > ISO-8859-2, I had to edit the encoding converter command a bit. You were right about the encoding's choice and also the fonts did help, I think.
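For readers who want to reproduce the UTF-8 > ISO-8859-2 step by hand, here is a minimal sketch. It is not the repository's converter command: `transcode_xml` is a hypothetical function, and it assumes the XML declaration should name the same encoding as the output bytes.

```python
import re

# Matches the encoding value inside a leading <?xml ... ?> declaration.
_DECL = re.compile(r'(<\?xml[^>]*?encoding=["\'])[^"\']+(["\'])')


def transcode_xml(utf8_bytes: bytes, target: str = "iso-8859-2") -> bytes:
    """Re-encode UTF-8 XML bytes into `target` and update the declaration."""
    text = utf8_bytes.decode("utf-8-sig")  # also drops a UTF-8 BOM if present
    text = _DECL.sub(lambda m: m.group(1) + target + m.group(2), text, count=1)
    # Strict encoding: raises UnicodeEncodeError for characters the target lacks,
    # instead of silently dropping them.
    return text.encode(target)


if __name__ == "__main__":
    src = '<?xml version="1.0" encoding="UTF-8"?>\n<text>Ustawienia jakości</text>'
    out = transcode_xml(src.encode("utf-8"))
    print(out.decode("iso-8859-2"))
```

Failing loudly on unencodable characters is a deliberate choice here: a dropped letter in the output is exactly the symptom this thread was chasing.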
Anyway, to get the language pack ready we should add a bunch of strings :) I am thinking of automating the filling of missing strings via the DeepL or alternative but need to investigate the subject. Closing the issue :partying_face:
So I am currently working on a Polish translation for the True Stalker mod, and it looks like, contrary to what the readme says, Windows 1250 with the "font_prefix = _cent" entry in the localization.ltx file does not make the Polish letters show up. For example, the word "Wyjdź" in the main menu is shown as "Wyjd ".