Closed cher-nov closed 3 years ago
This is definitely something I'm interested in implementing (I was aware of it, I just haven't gotten to it yet). The text mangling is pretty ugly, sorry about that. I wish the runner would tell you what encoding the executable is in, which it doesn't as far as I know?
> I wish the runner would tell you what encoding the executable is in, which it doesn't as far as I know?
AFAIK, no, it doesn't. The GMKs and, therefore, the EXEs just store raw character data.
UTF-8 support was only introduced in GM 8.1. Before that, all GM versions used Windows single-byte encodings, e.g. Windows-1251 for Cyrillic languages such as Russian. https://en.wikipedia.org/wiki/Windows_code_page
The specific code page used depended on the settings of the local computer (most often on the localization of Windows). So a program compiled with a non-UTF version of GM and containing strings in, for example, Russian could display them incorrectly when run on a computer whose default code page is not 1251.
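To illustrate the mismatch, here is a minimal Python sketch. It is not taken from GM or from the decompiler; it only shows what happens when bytes written under Windows-1251 are read back under Windows-1252, using Python's built-in `cp1251` and `cp1252` codecs. The sample string is an arbitrary choice.

```python
# The same raw bytes, interpreted under two different Windows code pages.
original = "Привет"

# What a pre-8.1 GM on a Russian Windows install would effectively store:
raw = original.encode("cp1251")

# Decoding with the matching code page recovers the text.
decoded_ok = raw.decode("cp1251")

# Decoding with a Western European code page yields mojibake instead.
decoded_wrong = raw.decode("cp1252")

print(decoded_ok)     # Привет
print(decoded_wrong)  # Ïðèâåò
```

The bytes themselves never change; only the code page used to interpret them does, which is why the stored data alone cannot tell you which encoding was intended.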
Currently such text (string literals, comments, you name it) is not supported, so the GMKs recovered from such programs come out somewhat broken.
So I would propose adding a command-line parameter that specifies the Windows code page to use when decoding GML scripts and other strings during decompilation of such executables, with a default value along the lines of CP_ACP, as described here: https://docs.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar
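As a rough sketch of what such a parameter could look like (in Python, purely for illustration, not the decompiler's actual interface): the `--codepage` flag and the helper names `resolve_codec`, `decode_gml` and `build_parser` are all hypothetical. When no code page is given, the sketch falls back to the locale's preferred encoding, which is only an approximation of what CP_ACP means on Windows.

```python
import argparse
import codecs
import locale


def resolve_codec(codepage=None):
    """Map a Windows code page number (or None) to a Python codec name."""
    if codepage is None:
        # Assumption: the locale's preferred encoding stands in for CP_ACP.
        return locale.getpreferredencoding(False)
    name = f"cp{codepage}"
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


def decode_gml(raw, codepage=None):
    """Decode raw string bytes taken from a GMK/EXE with the chosen code page."""
    # Bytes that are undefined in the code page become U+FFFD instead of failing.
    return raw.decode(resolve_codec(codepage), errors="replace")


def build_parser():
    """Build a parser exposing the hypothetical --codepage option."""
    parser = argparse.ArgumentParser(description="code page option sketch")
    parser.add_argument(
        "--codepage",
        type=int,
        default=None,
        help="Windows code page number for decoding strings, e.g. 1251",
    )
    return parser
```

For example, `build_parser().parse_args(["--codepage", "1251"])` gives `codepage == 1251`, and `decode_gml(raw, 1251)` then decodes Russian text stored as Windows-1251 correctly regardless of the machine's own locale.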