WohlSoft / LunaLua

LunaLua - LunaDLL with Lua, is a free extension mod for SMBX 1.3 game engine, core of the X2 project.
https://codehaus.moe/
GNU General Public License v3.0

Broken compatibility with legacy LVL files in ANSI format #78

Open Flower35 opened 1 week ago

Flower35 commented 1 week ago

Hello! I would like to report an issue regarding the loading of legacy episodes made in the "SMBX 1.3" engine.

In short, it is expected that old file formats (".wld", ".lvl") retain the ANSI encoding, while the newer formats (".lvlx", ".lua") use the UTF-8 encoding. The two must not be mixed, yet the contents of ".lvl" files are currently misinterpreted as UTF-8.
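
To make the mismatch concrete, here is a minimal sketch (portable C++, not LunaLua code) showing why an ANSI-encoded name cannot simply be read as UTF-8. The helper name looksLikeUtf8 and the sample filename are made up for illustration. The byte 0xE4 is "ä" in Windows-1252, but in UTF-8 it announces a three-byte sequence, so the name fails validation:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when `s` is a structurally valid UTF-8 byte sequence.
// Simplified on purpose: it checks lead and continuation bytes only and
// does not reject surrogates or overlong 3/4-byte forms.
bool looksLikeUtf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size())
    {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;

        if (c < 0x80)                    extra = 0; // plain ASCII
        else if (c >= 0xC2 && c <= 0xDF) extra = 1; // lead of a 2-byte sequence
        else if (c >= 0xE0 && c <= 0xEF) extra = 2; // lead of a 3-byte sequence
        else if (c >= 0xF0 && c <= 0xF4) extra = 3; // lead of a 4-byte sequence
        else return false;                          // stray continuation or invalid lead

        if (i + extra >= s.size())
            return false;                           // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80)
                return false;                       // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

A check like this could only serve as a heuristic. The format reported by the parser is the more reliable signal for choosing the decoder.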


The handling of WLD files is done via the original Public Sub OpenWorld(FilePath As String) subroutine, thus the file contents are read in ANSI encoding (specific to the current Windows code page) and translated to UTF-16LE encoding. This potentially affects any referenced LVL and LVLX filenames with diacritics and other non-ASCII symbols.

https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/Misc/RuntimeHookComponents/RuntimeHookGeneral.cpp#L1712 https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/Misc/RuntimeHookComponents/RuntimeHookHooks.cpp#L2162

This case is partially checked with the GetNonANSICharsFromWStr() function (the SMBX installation path and the WLD file path, but NONE of the world map contents).

https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/GameConfig/GameAutostart.cpp#L63-L69


A problem appears with the custom Public Sub OpenLevel(FilePath As String) hook, which was written to handle both the old LVL and the new LVLX formats:

https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/Misc/RuntimeHookComponents/RuntimeHookGeneral.cpp#L1702 https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/Misc/RuntimeHookComponents/RuntimeHookHooks.cpp#L1634

https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/FileManager/SMBXFileManager.cpp#L119


The "PGE File Library" correctly detects the file format and uses the correct encoding to read level data: https://github.com/WohlSoft/PGE-File-Library-STL/blob/f2b83a89ce04ad5a40f1249d1c125b53de6f1750/src/file_rwopen.cpp#L85

However, the discrepancy between the ANSI and UTF-8 encoding was not taken into account in the LunaLua_loadLevelFile() function. https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/FileManager/LoadFile_Level.cpp#L55

Specifically, the encoding of the following strings could be affected: https://github.com/WohlSoft/PGE-File-Library-STL/blob/f2b83a89ce04ad5a40f1249d1c125b53de6f1750/src/smbx64/file_rw_lvl.cpp#L179

// "PGE-File-Library-STL/src/smbx64/file_rw_lvl.cpp"

// Internal level name

SMBX64::ReadStr(&FileData.LevelName, line);

// Custom music filepath

SMBX64::ReadStr(&section.music_file, line);

// Various layer names

SMBX64::ReadStr(&blocks.layer, line);
SMBX64::ReadStr(&blocks.event_destroy, line);
SMBX64::ReadStr(&blocks.event_hit, line);
SMBX64::ReadStr(&blocks.event_emptylayer, line);

SMBX64::ReadStr(&bgodata.layer, line);

// NPC message boxes (even though only the ASCII font is supported)

SMBX64::ReadStr(&npcdata.msg, line);

SMBX64::ReadStr(&npcdata.layer, line);
SMBX64::ReadStr(&npcdata.event_activate, line);
SMBX64::ReadStr(&npcdata.event_die, line);
SMBX64::ReadStr(&npcdata.event_talk, line);
SMBX64::ReadStr(&npcdata.event_emptylayer, line);
SMBX64::ReadStr(&npcdata.attach_layer, line);

// The level filename of a warp target (pipe or door)

SMBX64::ReadStr(&doors.lname, line);
SMBX64::ReadStr(&doors.layer, line);

SMBX64::ReadStr(&waters.layer, line);

SMBX64::ReadStr(&layers.name, line);

// Various event names

SMBX64::ReadStr(&events.name, line);
SMBX64::ReadStr(&events.msg, line);

SMBX64::ReadStr(&events_layers.hide, line);
SMBX64::ReadStr(&events_layers.show, line);
SMBX64::ReadStr(&events_layers.toggle, line);

SMBX64::ReadStr(&events.trigger, line);
SMBX64::ReadStr(&events.movelayer, line);

When entering a warp to another level (NOT from the world map), this error message appears on the screen: https://github.com/WohlSoft/LunaLua/blob/29450a6a517cb852026e301f9603b13fa1c8f6e2/LunaDll/FileManager/SMBXFileManager.cpp#L95-L97

I worked around this issue by manually converting the old files to remove any non-ASCII characters from the world map, custom music filenames, and LVL filenames.

However, the LunaLua_loadLevelFile() function could be updated to properly handle cases where StrA2WStr() should be called instead of Str2WStr(), for example:

const LevelDoor& nextDataLevelDoor = outData.doors[i];

if (outData.meta.RecentFormat == LevelData::SMBX64)
{
    // ANSI (legacy format)
    nextDoor->warpToLevelFileName = StrA2WStr(nextDataLevelDoor.lname);
}
else
{
    // UTF-8
    nextDoor->warpToLevelFileName = Str2WStr(nextDataLevelDoor.lname);
}

(similar change for all the layer names, event names, and custom music filepaths)
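
Since the same branch would be repeated for every string field, it could be folded into one helper. Below is a self-contained sketch of that idea, not the actual LunaLua code. LevelFormat, decodeLevelString, decodeLatin1 and decodeUtf8 are hypothetical names, and ISO-8859-1 stands in for the ANSI code page so the example runs without the Windows API. In LunaLua the two branches would call StrA2WStr() and Str2WStr() instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the format tag reported by the parser.
enum class LevelFormat { Legacy, Extended };

// Stand-in for an ANSI decode: ISO-8859-1 maps each byte to the same code point.
std::u16string decodeLatin1(const std::string& in)
{
    std::u16string out;
    for (unsigned char c : in)
        out.push_back(static_cast<char16_t>(c));
    return out;
}

// Minimal UTF-8 to UTF-16 decoder; malformed input becomes U+FFFD.
std::u16string decodeUtf8(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        char32_t cp = 0xFFFD; // stays 0xFFFD when the lead byte is invalid
        std::size_t len = 1;
        if (c < 0x80)                { cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }

        bool ok = (cp != 0xFFFD) && (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { cp = 0xFFFD; len = 1; }  // skip one byte and resynchronise
        if (cp > 0x10FFFF) cp = 0xFFFD;     // outside the Unicode range

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            const char32_t v = cp - 0x10000; // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
        i += len;
    }
    return out;
}

// Single place where the decoder is chosen from the source format.
std::u16string decodeLevelString(const std::string& raw, LevelFormat fmt)
{
    return fmt == LevelFormat::Legacy ? decodeLatin1(raw) : decodeUtf8(raw);
}
```

With a helper like this, every affected field (door targets, layer names, event names, music paths) would go through one call, and a field could not be left on the wrong decoder by accident.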


Wohlstand commented 1 week ago

The main problem with ANSI levels in general is that they will always fail on non-native computers. For example, a file made with German filenames that use diacritics will fail to open on a Greek machine. And a Chinese filename will fail to open on a Russian one, etc. That is why legacy levels are required to be named with ASCII-only characters.
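
To illustrate the point about non-native machines: the byte 0xE4 is "ä" in Windows-1252, "δ" in Windows-1253 and "д" in Windows-1251, so only bytes below 0x80 read the same on every machine. A minimal sketch of the resulting ASCII-only check follows. isAsciiOnly is a hypothetical name, not an existing LunaLua or PGE function.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// True when every byte is below 0x80, i.e. the name decodes identically
// under any ANSI code page and under UTF-8.
bool isAsciiOnly(const std::string& name)
{
    return std::all_of(name.begin(), name.end(),
                       [](unsigned char c) { return c < 0x80; });
}
```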

Flower35 commented 1 week ago

Thank you for a quick reply! Yes, I agree. However, when I copy over an old episode from SMBX 1.3 to 2.0, made on the same PC (or made by someone else using the same locale setting), I would still expect it to be playable with zero errors.

The issue is not only with filenames from different code pages (which can be translated to ASCII without much trouble using a simple script, outside of SMBX/PGE), but also with making sure that any "cross-level" references (on the world map and in warps between levels) are changed to ASCII in those legacy episodes as well, if possible.

Wohlstand commented 6 days ago

Yeah, anyway, I guess I could try to implement a proper charset converter in the Moondust Maintainer tool, which converts episodes between different formats. Right now it ignores ANSI code pages and expects ASCII or the "local 8-bit charset", but I could add an option to select the source/destination charset for SMBX64 levels/worlds to properly convert between modern and legacy formats.
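
As a rough sketch of what a converter with a selectable source charset could look like (this is not Moondust code, and SourceCharset, toCodePoint, appendUtf8 and legacyToUtf8 are all hypothetical names): each legacy byte is mapped to a code point according to the chosen charset and then re-encoded as UTF-8. Only ISO-8859-1 and the letter range 0xC0..0xFF of Windows-1251 are covered here. A real tool would need complete tables from an encoding library.

```cpp
#include <cassert>
#include <string>

// Hypothetical selector for the charset the legacy file was written in.
enum class SourceCharset { Latin1, Cp1251Letters };

// Maps one legacy byte to a Unicode code point (U+FFFD when unmapped).
char32_t toCodePoint(unsigned char b, SourceCharset cs)
{
    if (b < 0x80)
        return b; // ASCII is shared by all of these charsets
    switch (cs)
    {
    case SourceCharset::Latin1:
        return b; // ISO-8859-1 is the identity mapping
    case SourceCharset::Cp1251Letters:
        // Only the contiguous Cyrillic letters (U+0410..U+044F) are handled.
        return b >= 0xC0 ? static_cast<char32_t>(0x0410 + (b - 0xC0)) : 0xFFFD;
    }
    return 0xFFFD;
}

// Appends a code point below U+10000 to `out` as UTF-8.
void appendUtf8(std::string& out, char32_t cp)
{
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a legacy single-byte string to UTF-8.
std::string legacyToUtf8(const std::string& in, SourceCharset cs)
{
    std::string out;
    for (unsigned char b : in)
        appendUtf8(out, toCodePoint(b, cs));
    return out;
}
```

The same byte gives different results depending on the selected charset, which is why the source charset has to be chosen by the user rather than guessed from the machine's locale.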