Fix encoding and locale issues

pR0Ps commented 3 years ago

When running Chimera 1ef91c18 with a non-ASCII path (set via the -path arg, the path=... setting in chimera.ini, or just the default path with your Windows username in it), Chimera would crash, sometimes creating 2 new folders with mangled names (one with just the chimera/lua folder, other other with the rest of the files). An example of a problematic path is "ÁÉÚÝÍÓŠ".

Note that ANSI is used as a shorthand by programs for "the system's preferred 8-bit encoding". In my case that's CP-1252 (en-US locale).

When running with -path ÁÉÚÝÍÓŠ or using a ANSI-encoded chimera.ini (halo crashed): prev_ansi

When using a UTF-8-encoded chimera.ini (notepad.exe and notepad++ both saved in UTF-8 by default) (halo crashed): prev_utf8

Setting the locale of the program to match the system's configured locale fixes the issues with the default path and the -path argument. It also allows chimera.ini to be read properly as long as the file is encoded using ANSI.

Because ANSI isn't portable across systems and most text editors now save everything in UTF-8 by default, the ini parser was modified to parse values as UTF-8 strings and internally convert them to their ANSI equivalents. Error messages have been added for cases where the file isn't encoded properly, as well as when a properly-encoded UTF-8 character can't be converted to ANSI.

Many thanks to Adam Šliperski for discovering these issues and working with me to fix them.

Some screenshots with the fixes applied to give an idea of the different error conditions:

non-ASCII characters encoded in ANSI: invalid_encoding

Same as above but only in a comment: working_ansi

A valid UTF-8 character that can't be converted to ANSI: invalid_character

Same as above, but only in a comment: working_utf8

pR0Ps commented 3 years ago

@SnowyMouse I've updated the PR to include better error checking and messages. At this point I think this is done. Let me know if you have any questions/suggestions.

pR0Ps commented 3 years ago

Rebased on master to include the build check (it works! And you can download the artifact to test it without having to checkout+compile it yourself!)

pR0Ps commented 3 years ago

Note that #68 is needed to see the error messages since the show_error_box function currently segfaults if the ini file hasn't been parsed (I removed the fix from this branch to avoid merge conflicts).

SnowyMouse commented 3 years ago

Thanks!

SnowyMouse / chimera

Fix encoding and locale issues #66