Closed imKota closed 3 years ago
Unfortunately, I can't help you when it comes to changing the default font. I know that it is possible to specify fonts in LiveMaker, since the original LM documentation has the tags to do it. But I have no samples of games that include their own fonts and that use the font tags, so I never bothered trying to reverse engineer/figure out the parts of the engine related to setting fonts in an LSB script section.
Also, the LiveMaker engine is hard coded to only use installed Windows system fonts which are flagged with support for Windows Shift-JIS (MS CP932), so as far as I know you cannot get a game to use any locale/encoding other than CP932.
It is at least possible to force LiveMaker 3 games to display half-width western characters instead of full-width by changing the value of PR_FONTCHANGEABLED for the in game text boxes, which should make your patched text look nicer in-game: https://pylivemaker.readthedocs.io/en/latest/usage.html#notes-for-translation-patches
This is strange, but it did not help
Hmm, maybe that parameter doesn't affect Cyrillic characters (I've never tested that case)? Can you try patching in some text with Latin characters just to make sure whether or not that is the issue?
It may also be that you need to edit that parameter for a different one (or all) of the MesNew instances in that LSB file. Depending on the game, it's possible that command 36 is not actually the message box corresponding to your in-game text.
in file メッセージボックス作成.lsb lines 8, 36, 61 have already been set to 0
Yeah, so the font width parameter is working since the english text shows up correctly. It may just be that the engine is hard coded to only do the half-width adjustment for ASCII characters then. I'll try taking a look into the actual engine code at some point, but I'm guessing there isn't much that can be done about the problem.
@pmrowla any news?
@imKota I haven't spent too much time on reversing more of the engine code, but I'm pretty sure the answer is still that there's not much you can do about it.
@pmrowla
It's strange.. If run through NTLEA, then the font is displayed normally.
I'm not familiar with how NTLEA works, but I'm assuming they are hooking things at the windows api level? So you could try patching whatever calls NTLEA is hooking in your game exe. But reversing mostly takes a lot of time that I don't have at the moment, as I am busy with other work. But my ghidra project is available on the wiki: https://github.com/pmrowla/pylivemaker/wiki so someone else is welcome to look into it as well
I examined the game engine with IDA. I noticed this here: This constant is used in several places in Delphi VCL lib, and passed once as a reference. The number 932 (3A4h) reappears once again: This constant is used in several font functions.
I would try to set both of these constants to another codepage, translate a game to this codepage and see what happens. My problem is: the only game I have is HUGE. I cannot do a full translation yet. Does one of you have a very small example project?
@Stefan311 I have a small game that I made myself following the LM tutorial w/around 5 total lines that I use for testing different things in pylm (it's where the lsb's used for the automated tests come from). And it already includes a mix of JP and ascii text lines.
Works! I have changed the value at file position 1777396 (0x1B1EF4) to 1252 (0x4E4). That's the Western European code page. I also changed novel.py to encode as CP1252 and bypass the CP932 checks.
import construct

# BadLnsError is pylivemaker's own exception, as already used in novel.py
class _TWdCharAdapter(construct.Adapter):
    # construct PaddedString only supports ascii and utf encodings
    def _decode(self, obj, ctx, path):
        raw = obj.to_bytes(2, byteorder="big")
        try:
            ch = raw.decode("cp932")
        except UnicodeDecodeError:  # decode() raises UnicodeDecodeError, not UnicodeEncodeError
            try:
                ch = raw.decode("cp1252")
            except UnicodeDecodeError:
                raise BadLnsError("{!r} is not a valid CP932 or CP1252 character".format(raw))
        if ch.startswith("\x00"):
            ch = ch[1]
        return ch

    def _encode(self, obj, ctx, path):
        return int.from_bytes(obj.encode("cp1252"), byteorder="big")
Awesome stuff @Stefan311. If we can get this working with utf-16 (I think this will be preferred over utf-8 due to how TWd char by char packing is done), we can make #44 a priority, and just distribute our patched version of the engine for every game patched w/pylm.
I fear we are bound to this old code page crap. but... https://docs.microsoft.com/en-us/windows/win32/intl/code-page-identifiers There is a codepage for utf-8... 65001 Just trying...
also for future reference, this constant is at offset 0x001c30f4 (gvar_001C30F4) in the shared ghidra project
utf-8 might require some hacks in our struct handling, since LM assumes everything will fit in a single 2-byte wchar. for utf-16, we can split 4-byte codepoints across two TWdChar's, but since utf-8 is variable width it gets a bit more complicated
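To illustrate the difference described above, here is a minimal sketch in plain Python (it does not touch pylivemaker; the helper names `utf16_units` and `utf8_lengths` are made up for this example). It shows that utf-16 always yields whole 16-bit units, with characters outside the BMP becoming exactly two of them, while utf-8 byte lengths vary from 1 to 4 per character.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit ints."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], byteorder="big") for i in range(0, len(raw), 2)]


def utf8_lengths(text):
    """Return the number of UTF-8 bytes needed for each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


if __name__ == "__main__":
    print(utf16_units("A"))            # one unit
    print(utf16_units("こ"))           # one unit
    print(utf16_units("\U0001F600"))   # two units (a surrogate pair)
    print(utf8_lengths("Aüこ\U0001F600"))
```

Each utf-16 unit would map onto one 2-byte slot, whereas utf-8 needs a decision about how 1, 3 and 4-byte sequences are spread over those slots.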
I see. Maybe code page 1200 / 1201? The M$ help says "available only to managed applications" but maybe...
pages 1200 / 1201 / 65001 do not work :( Error message "Install game again!" Seems we are stuck with code pages
Just to clarify, when you get the error message for utf-8 (65001), is that after you replaced text and set the TWdChar encoding to utf-8?
because with just hexediting the constant to 65001 (but not manipulating any text), that does not crash for me.
I think it may still be possible to make it work, but just setting TWdChar encoding to utf-8 won't be enough, it will have to be manually packed to make sure that LM unpacks the bytes correctly
I got the error message directly after the game starts. lmlsb does not work with "utf-8" as encoding, so I just used "utf-16be" for the 65001 test as well. Funny: if I use CP1252 in the EXE and utf-16be in the translation, I get the German chars displayed, but the Japanese "name" text is changed to unreadable Latin characters.
Other idea: The code page constant is used to call MultiByteToWideChar. This function converts code paged multibyte text to utf-16. So, if we already store our text as utf-16, and just NOP-out the MultiByteToWideChar call, could this work? Edit: of course not only NOP-out the call itself, the stack work also needs doing. Edit2: I am currently reading the MultiByteToWideChar M$ help page. Maybe the dwFlags parameter needs to be changed to work properly with utf-8.
I am able to set it to 65001 and run with packed utf-8 text, although it's obviously not being decoded properly in LM with what I'm currently trying:
basically TWdChar will have to be modified to only handle characters as 16-bit ints, and then at a higher level we have to convert them to/from text w/something like
raw = ch.encode("utf-8")
logger.info(f"packing {raw}")
if len(raw) <= 2:
    logger.info(f"packing {raw} into one ch")
    new_ch = int.from_bytes(raw, byteorder="big")
    new_block.append(TWdChar(ch=new_ch, **d))
else:
    logger.info(f"packing {raw} into double ch")
    new_ch = int.from_bytes(raw[0:2], byteorder="big")
    new_block.append(TWdChar(ch=new_ch, **d))
    new_ch = int.from_bytes(raw[2:4], byteorder="big")
    new_block.append(TWdChar(ch=new_ch, **d))
e: I think maybe we will have to pack things on full string/block level rather than per char?
After patching the dwFlags parameter to 0, the game starts and displays something... Since the translation is still utf-16, there is a space between every Latin char, and the Japanese chars are still wrong.
What windows version do you use? This seems the thing where I failed first:
Note For UTF-8 or code page 54936 (GB18030, starting with Windows Vista), dwFlags must be set to either 0 or MB_ERR_INVALID_CHARS. Otherwise, the function fails with ERROR_INVALID_FLAGS.
For the internal pylm utf-8-to-byte encoding I am out. Assembler... no problem... but python? I am not knowing much about python.
I'm testing in windows 10
ok so with it set to 65001, and packing utf-8 by line, I'm able to get the following
where the game is displaying the equivalent of
>>> 'こんにちは'.encode('utf-8').decode('cp932')
'縺薙ｓ縺ｫ縺｡縺ｯ'
I think that it may be possible to get utf-8 to work, but it's probably not something I will put too much time into (or at least not in the near future).
It is good to know that we can force other non-utf8 codepages to work, but that still doesn't help as far as making a general solution goes (just as an example, imKota would need a version of the engine and pylm that supports CP1251, whereas Stefan311 and LioMajor need 1252). Forcing CP932 isn't ideal, but it at least supports both latin + cyrillic alphabets, even if it lacks support for accented characters.
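As a rough way to check the coverage claim above, this sketch uses only Python's built-in codecs to see which of the code pages mentioned in the thread can encode a given string. The helper name `supported_codepages` is made up for this example, and Python's codec tables are assumed to be close enough to what Windows does for this purpose.

```python
def supported_codepages(text, codepages=("cp932", "cp1251", "cp1252")):
    """Return the subset of codepages that can encode every character of text."""
    result = []
    for codepage in codepages:
        try:
            text.encode(codepage)
        except UnicodeEncodeError:
            continue
        result.append(codepage)
    return result


if __name__ == "__main__":
    for sample in ("hello", "привет", "Grüße", "こんにちは"):
        print(sample, supported_codepages(sample))
```

Plain Latin text passes everywhere, Cyrillic passes in CP932 and CP1251, and text with umlauts only passes in CP1252, which is why a single patched engine cannot serve every translation.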
but obviously, help from anyone else w/time to RE and experiment would be great
So your exe still decodes cp932 while the translation is utf-8. Interesting. Are you sure you have set the 65001 correctly?
Could you share the utf-8 encoding patch on pylm?
I pushed my branch https://github.com/pmrowla/pylivemaker/compare/utf-8
Something does not work in this branch
ID,Label,Context,Original text,Translated text
pylm:text:00000001.lsb:8:0,00000003,,"1237112435123951238512399
228246252196214220223
1071321085107732107510861074108610881102321087108632108810911089108910821080",
pylm:text:00000001.lsb:8:1,00000003,,879710511632102111114329910810599107,
pylm:text:00000001.lsb:8:2,00000003,,651021161011143211997105116,
pylm:text:00000001.lsb:8:3,00000003,,84101120116321151121011011003210297115116,
pylm:text:00000001.lsb:8:4,00000003,,841011201163211511210110110032115108111119,
pylm:text:00000001.lsb:8:5,00000003,,84101120116321151121011011003211011111410997108,
I gave up looking for a solution with UTF16. All that's left is UTF8 or classic code pages.
I would vote for classic code pages. Let's write a simple exe-patcher and add an option to choose the LSB encoding.
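A minimal sketch of what such an exe-patcher could look like, under stated assumptions: the two offsets are the ones reported in this thread for engine version 3.17.12.26, and the constant is assumed (not confirmed here) to be stored as a 32-bit little-endian integer. The function name `patch_codepage` is hypothetical and not part of pylivemaker.

```python
import struct

# File offsets reported in this thread for engine version 3.17.12.26.
# Other engine versions will need different offsets.
CODEPAGE_OFFSETS = (1777396, 1776412)


def patch_codepage(exe, new_cp, offsets=CODEPAGE_OFFSETS, old_cp=932):
    """Return a copy of the exe bytes with each code page constant replaced.

    Assumption: each constant is a 32-bit little-endian integer. The current
    value is checked first so that a wrong engine version is not corrupted.
    """
    patched = bytearray(exe)
    for offset in offsets:
        (current,) = struct.unpack_from("<I", patched, offset)
        if current != old_cp:
            raise ValueError(f"expected {old_cp} at offset {offset}, found {current}")
        struct.pack_into("<I", patched, offset, new_cp)
    return bytes(patched)


def make_fake_exe(size, offsets, codepage=932):
    """Build a dummy byte buffer with the constant at the given offsets (demo only)."""
    data = bytearray(size)
    for offset in offsets:
        struct.pack_into("<I", data, offset, codepage)
    return bytes(data)


if __name__ == "__main__":
    fake = make_fake_exe(64, (8, 20))
    out = patch_codepage(fake, 1252, offsets=(8, 20))
    print(struct.unpack_from("<I", out, 8)[0], struct.unpack_from("<I", out, 20)[0])
```

For a real game the bytes would come from reading the exe file, and the result should be written to a new file so the original stays available as a backup.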
yeah the branch doesnt decode properly, I didn't add the stuff to extract/unpack from 16-bit ints into utf8. I was only testing inserting/packing utf8.
This is probably not something that can go into pylm proper unless we are sure it works for all the possible LM 2 + 3 interpreter versions (or until we have support for #44). But if you get stuff working in a fork for whichever specific engine versions you're looking at, I can link to it from the readme/docs/etc
edit: I added a list of things we will need for a general custom-codepage based solution in the other issue
This game uses an older version (3.12.2.28), this engine version does not contain any code page constant. Seems the code page handling is a later development.
Is the 3.17.12.26 the last engine version?
Yeah, that’s the final release before the company shut down
I machine translated "my" game to german, and found out this:
Could you try your utf8 test again on an English locale machine? (or send me the test file?) I have not managed to get the utf-8 branch to work. (btw. this can be deleted)
On patching localized menu stuff I have a problem on the pylm side:
Patching 000001A1.lsb ...
Translated 2 choices
Failed to translate 0 choices
Ignored 0 untranslated choices
Backing up original LSB.
Could not generate new LSB file: Error in path (building) -> commands -> items
no subconstruct matched: Calc AddArray(_tmp, "Es sollte aufhören")
I am currently stuck on this issue, I have no idea how to debug this. Could you please...? pylm_diff.txt 000001A1.lsb.zip menu.csv.zip
@Stefan311 if you are translating Calc operand fields to use non-CP932 characters (meaning menu text), you need to tell the struct for string literal operand data to use the proper codepage:
this may have unintended side effects - i.e. you will probably have to translate every string literal in your patched LSB files and not just the ones you care about, assuming that japanese characters cannot be encoded w/whatever codepage you are using
wrt to the utf-8 branch, I never got it to work either
- On systems with native japanese locale the patch does not work. Seems the livemaker engine uses two different display engines depending on system locale.
I'm pretty sure this is expected behavior. Because of how windows codepages work, if you are patching a game to use a different locale, it will only work properly if windows is set to use the patch locale (either via the actual windows default locale setting, or via a locale emulator).
so to use CP1252 in a japanese configured windows installation, you'd have to use locale emulator set to CP1252
Seems you haven't understood. I'll show my test results, maybe then you can understand. The game menu is still not translated.
japanese locale system, game engine cp932, original content: Window title correct, Text correct, Menu correct.
japanese locale system, game engine cp1252, original content: Window title incorrect, Text correct, Menu correct.
japanese locale system, game engine cp932, translated content: Window title correct, Text incorrect, Menu correct.
japanese locale system, game engine cp1252, translated content: Window title incorrect, Text incorrect, Menu correct.
english locale system, game engine cp932, original content: Window title incorrect ("??????"), Text correct, Menu correct.
english locale system, game engine cp1252, original content: Window title incorrect, Text incorrect, Menu incorrect.
english locale system, game engine cp932, translated content: Window title incorrect ("??????"), Text incorrect, Menu correct.
english locale system, game engine cp1252, translated content: Window title incorrect, Text correct, Menu incorrect.
So my assumptions:
hmm I see.
I'm not really surprised that there's issues with regard to changing locale, since the engine itself has hardcoded CP932 strings in it, and we are still mixing CP932 and
# Sketch of the idea: try cp932 first, fall back to cp1252
def _pascal_string_proxy(lengthfield):
    try:
        return construct.PascalString(lengthfield, "cp932")
    except:
        return construct.PascalString(lengthfield, "cp1252")

@classmethod
def _struct(cls):
    return construct.Struct(
        "type" / construct.Enum(construct.Byte, ParamType),
        "value"
        / construct.Switch(
            construct.this.type,
            {
                "Int": construct.Int32sl,
                "Float": construct.ExprAdapter(
                    construct.Bytes(10),
                    lambda obj, ctx: numpy.frombuffer(obj.rjust(16, b"\x00"), dtype=numpy.longdouble),
                    lambda obj, ctx: numpy.longdouble(obj).tobytes()[-10:],
                ),
                "Flag": construct.Byte,
                "Str": _pascal_string_proxy(construct.Int32ul),
            },
            # else 'Var' variable name type
            construct.Select(construct.PascalString(construct.Int32ul, "cp932"),),
        ),
    )
Is this kind of proxy method possible? To be honest, I still don't understand this whole "construct" thing. I start to really hate this esoteric python stuff.
construct is just a library for packing python types into binary structs. I think it should be possible to do it with an adapter
it could probably also just be a union? But I’m not actually sure how construct handles extracting/unpacking unions
Works this way:
def _struct(cls):
    macro = construct.PascalString(construct.Int32ul, "cp932")

    def _encode(obj, context, path):
        if obj == u"":
            return b""
        try:
            return obj.encode("cp932")
        except UnicodeEncodeError:
            return obj.encode("cp1252")

    macro._encode = _encode
    return construct.Struct(
        "type" / construct.Enum(construct.Byte, ParamType),
        "value"
        / construct.Switch(
            construct.this.type,
            {
                "Int": construct.Int32sl,
                "Float": construct.ExprAdapter(
                    construct.Bytes(10),
                    lambda obj, ctx: numpy.frombuffer(obj.rjust(16, b"\x00"), dtype=numpy.longdouble),
                    lambda obj, ctx: numpy.longdouble(obj).tobytes()[-10:],
                ),
                "Flag": construct.Byte,
                "Str": macro,
            },
            # else 'Var' variable name type
            construct.Select(construct.PascalString(construct.Int32ul, "cp932"),),
        ),
    )
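The fallback idea in the snippet above can be looked at in isolation. This sketch deliberately uses only the standard library `struct` module instead of construct, so it runs without pylivemaker installed. The name `pack_pascal_string` is made up, and the layout (a uint32 little-endian byte count followed by the encoded bytes) is an assumption based on the `PascalString(Int32ul, ...)` definition above.

```python
import struct


def pack_pascal_string(text, encodings=("cp932", "cp1252")):
    """Pack text as a length-prefixed byte string, trying each encoding in order.

    The first encoding that can represent the whole string wins, so Japanese
    text stays CP932 and only text that CP932 cannot encode falls back.
    """
    for encoding in encodings:
        try:
            raw = text.encode(encoding)
            break
        except UnicodeEncodeError:
            continue
    else:
        raise ValueError(f"{text!r} cannot be encoded with any of {encodings}")
    return struct.pack("<I", len(raw)) + raw


if __name__ == "__main__":
    print(pack_pascal_string("Es sollte aufhören"))
    print(pack_pascal_string("こんにちは"))
```

Note that the packed bytes carry no marker for which encoding was used, so whatever reads them back still has to know or guess the code page.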
So I looked more into the engine code, and I think it may be possible to get utf-8 to work for string literals (in TLiveParser expressions), but not for scenario scripts (in TpWord blocks).
For string literals in expressions, strings are packed and unpacked as full byte arrays (stored as delphi ANSI strings), so the fact that utf-8 is variable width (up to 4-bytes) is not an issue, and the windows MBCS->UTF-16 conversion functions should actually work as expected (if the codepage is edited to CP_UTF8).
However, for TpWord blocks, text is always parsed by individual character (glyphs), and never as full strings. And for TWdChar glyphs, they are always unpacked as 2-byte uints and stored in arrays of TWdChar class instances, and are never handled as string/byte arrays. When rendering scenario text, they call gdi32 functions to retrieve font glyphs per individual character (not as strings). The 2-byte uint is always what is fed into the MBCS->UTF-16 conversion functions, so we can't actually get away with packing 3 or 4-byte utf-8 codepoints across two TWdChars (the engine will always try to render them as two separate codepoints/glyphs). In theory we could potentially try only supporting the unicode range covered by 2-byte utf-8 codepoints. CJK text falls outside this range so it's not ideal, but for latin and cyrillic text we would be covered in this range.
So basically, it's theoretically possible to get partial utf-8 support w/the LM engine, but in practical terms we may just want to stick with DBCS codepages because of the TpWord scenario script limitation.
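The "2-byte utf-8 range" restriction described above is easy to check for a given translation. A minimal sketch, with a made-up helper name `fits_two_byte_utf8`, that reports whether every character of a string encodes to at most two utf-8 bytes and would therefore fit into one 2-byte slot:

```python
def fits_two_byte_utf8(text):
    """True if every character needs at most 2 bytes in UTF-8 (U+0000 to U+07FF)."""
    return all(len(ch.encode("utf-8")) <= 2 for ch in text)


if __name__ == "__main__":
    for sample in ("Grüße", "привет", "こんにちは"):
        print(sample, fits_two_byte_utf8(sample))
```

Latin with diacritics and Cyrillic pass this check, while kana and kanji need three bytes each and fail it.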
Also there's a second codepage constant you probably need to hexedit (used in calls to IsDBCSLeadByteEx), it should be at offset 1776412 in the latest engine version
I'll probably play around with trying to get partial utf-8 support working over this weekend.
I also messed around with utf-8 to menu items, but no luck at all. Neither Japanese characters nor German umlauts are displayed correctly. Maybe interesting for you:
- There is more than one locale=jap detection:
  - CODE:000D270C function returns 1=jap, 0=others; seems to be used for the /config window
  - CODE:001C06A6 seems to be a localized messagebox.
- The window title is always set as ansi string, delphi uses the SetWindowTextA API call. So no utf-8 possible for this. See CODE:0005CF8C @TApplication@SetTitle
- I have observed that the menus are sometimes missing German umlauts, but this issue seems gone after changing the constant at 1776412 that you also found.
- To make starting the game engine possible with the utf-8 code page, some more code patching of the MultiByteToWideChar API calls is required. If you say you do not require this, maybe this API was changed in Win10 again (I still use win7).
Finally:
Yeah, for the save/load/option menus you will have to translate the appropriate LSBs in ノベルシステム/システムメニュー
If this does turn out to be a win 10 vs win 7 issue (not sure yet), I would honestly say it's ok for pylm to not support win 7, given that it is completely end of lifed by microsoft and no longer even receiving security updates. But I'll probably also need to test it in win 8/8.1 as well since they are still microsoft supported versions.
Ok, so after some more investigation, when rendering text, they only use the ansi string version of windows gdi font calls (gdi32.GetGlyphOutlineA), so glyph lookups only work properly when the text codepage matches the system codepage. The MBCS->wide/utf-16 conversion functions are only used when passing text into windows messaging API calls, but not for font glyph/rendering related API calls.
The reason utf-8 partially worked for me is because I had the experimental/beta "set system locale to utf-8" Windows option enabled, but that's not something we can depend on.
so we are pretty much limited to DBCS codepages, and I don't think it's worth any more effort to try and hack in utf-8 support.
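The codepage mismatch described above can be modelled in plain Python. This is only a simplified model of what an ANSI API does with the bytes, not a reproduction of the actual gdi32 behaviour, and the helper name `render_as` is made up for this example.

```python
def render_as(text, text_codepage, system_codepage):
    """Encode text with one code page and reinterpret the bytes with another.

    Bytes that are invalid in the system code page become U+FFFD.
    """
    return text.encode(text_codepage).decode(system_codepage, errors="replace")


if __name__ == "__main__":
    print(render_as("ü", "cp1252", "cp1252"))       # matching code pages
    print(render_as("ü", "cp1252", "cp1251"))       # mismatch
    print(render_as("привет", "cp1251", "cp1252"))  # mismatch
```

When both code pages match, the text survives. Otherwise the same bytes select different glyphs, which is consistent with a patched game only looking right when Windows (or a locale emulator) is set to the patch locale.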
So we finally drop the utf-8 thing. One more thing I would investigate is: Why does the game crash when running in wine? Seems this is also a localisation thing, the game intro and start menu work, but the game crashes when the first message text should appear. Maybe a font is missing? Do you already know something about this topic?
Description
Hello, @pmrowla ! Tell me, is it possible to somehow change the font that is used by default to some other one, as well as change the locale to another?