cheahjs / palworld-save-tools

Tools for converting Palworld .sav files to JSON and back
MIT License

not correctly handling special characters in nicknames #30

Closed Alexander3a closed 7 months ago

Alexander3a commented 7 months ago

I named my player \udb40\udc21\udb40\udc21 \u2067Alex, and it is not handled correctly: the tool builds the dict without any problem but crashes when it calls json.dump.

To reproduce, set your player name to some special characters (for example, a right-to-left override) and the conversion will crash. I don't think the problem is in uesave, since it handles the name correctly and just doesn't escape it, but I'm not sure.

cheahjs commented 7 months ago

Can you post a log of the error? It's meant to handle unicode characters just fine.

Alexander3a commented 7 months ago

    Traceback (most recent call last):
      File "\palworld-save-tools\convert-single-sav-to-json.py", line 42, in <module>
        main()
      File "\palworld-save-tools\convert-single-sav-to-json.py", line 38, in main
        json.dump(json_blob, f, indent=1, cls=CustomEncoder, ensure_ascii=False)
      File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 180, in dump
        fp.write(chunk)
    UnicodeEncodeError: 'utf-8' codec can't encode characters in position 1-4: surrogates not allowed

Alexander3a commented 7 months ago

> Can you post a log of the error? It's meant to handle unicode characters just fine.

Yeah, but the name is parsed by the uesave tool, so it's handled differently there. I think it is supposed to be handled in the custom JSON encoder.

cheahjs commented 7 months ago

OK, I see what you mean now. Python is attempting to decode/encode the Unicode characters when the tool should really treat them as arbitrary bytes. I'll have a think about how to work around this.
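
The failure can be reproduced outside the tool. The sketch below is only an illustration, not the tool's code: it assumes the nickname reaches Python as a `str` whose UTF-16 surrogates are stored as separate code points, and the `NickName` key is a hypothetical stand-in for the real save structure.

```python
import json

# The nickname from the report: four surrogate code points, a space,
# U+2067, then "Alex".
name = "\udb40\udc21\udb40\udc21 \u2067Alex"

# With ensure_ascii=False the encoder passes the surrogates through unchanged,
# so building the JSON text itself succeeds.
text = json.dumps({"NickName": name}, ensure_ascii=False)

# The crash only happens when that text has to become UTF-8 bytes,
# which is what writing to a file opened with encoding="utf8" does.
try:
    text.encode("utf-8")
    error = None
except UnicodeEncodeError as exc:
    error = exc.reason

print(error)
```

This matches the report: the dict and the JSON text are built without complaint, and the error surfaces in `fp.write`.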

toriato commented 7 months ago

Dropping every character that UTF-8 cannot encode roughly solves the problem. GVAS seems to use UTF-16 (LE), though that is only a rough guess. This is the workaround I use; it may lose unsupported Unicode characters, but it doesn't seem to cause major issues with the save file.

    # convert-single-sav-to-json.py (line 35)
    with open(output_path, "w", encoding="utf8") as fp:
        for chunk in CustomEncoder(ensure_ascii=False, indent=2).iterencode(json_blob):
            fp.write(chunk.encode("utf-8", "ignore").decode("utf-8"))
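
To make the trade-off of the `"ignore"` handler concrete, here is a small sketch applied to the nickname from the report. The second half is an alternative that is not part of the workaround above: it assumes the surrogates really are halves of valid UTF-16 pairs and re-joins them with the standard `surrogatepass` error handler.

```python
name = "\udb40\udc21\udb40\udc21 \u2067Alex"

# The workaround: anything UTF-8 cannot encode is silently dropped.
cleaned = name.encode("utf-8", "ignore").decode("utf-8")
print(ascii(cleaned))

# Alternative sketch (an assumption, not the project's fix): round-trip through
# UTF-16-LE so each high/low surrogate pair becomes one real code point.
repaired = name.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
print(ascii(repaired))

# The repaired string is now valid for UTF-8 output.
repaired_bytes = repaired.encode("utf-8")
```

With `"ignore"` the four surrogate code points vanish and only the space, U+2067, and `Alex` remain, which is why the name has to be "fixed" before converting back.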

Alexander3a commented 7 months ago

Applied as a diff, it's:

diff --git a/convert-single-sav-to-json.py b/convert-single-sav-to-json.py
index d2f763d..8af3bd6 100755
--- a/convert-single-sav-to-json.py
+++ b/convert-single-sav-to-json.py
@@ -32,8 +32,10 @@ def main():
     gvas_file = GvasFile.read(raw_gvas, PALWORLD_TYPE_HINTS, PALWORLD_CUSTOM_PROPERTIES)
     output_path = save_path + ".json"
     print(f"Writing JSON to {output_path}")
-    with open(output_path, "w", encoding="utf8") as f:
-        json.dump(gvas_file.dump(), f, indent=2, cls=CustomEncoder, ensure_ascii=False)
+    with open(output_path, "w", encoding="utf8") as fp:
+        for chunk in CustomEncoder(ensure_ascii=False, indent=2).iterencode(gvas_file.dump()):
+            fp.write(chunk.encode("utf-8", "ignore").decode("utf-8"))
+        #json.dump(gvas_file.dump(), f, indent=2, cls=CustomEncoder, ensure_ascii=False)

 if __name__ == "__main__":

but I can't convert it back without "fixing" my name.

cheahjs commented 7 months ago

As a temporary fix, you should be able to change ensure_ascii=False to ensure_ascii=True. Python should then escape all non-ASCII characters and be able to read and save your nickname. ensure_ascii=False was chosen because some of the stat properties are named in Japanese, and I wanted to print the actual Japanese characters rather than escape sequences so that users could easily identify them.

This might cause other issues that I haven't foreseen, however. I'm looking into a proper fix.

https://github.com/cheahjs/palworld-save-tools/blob/84778abf0803b8fc7ac28aeda7fb97368a75ed4c/convert.py#L79-L83
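
A minimal sketch of why `ensure_ascii=True` avoids the crash, using only the standard `json` module. The `NickName` key is hypothetical, and the sketch says nothing about how the tool writes the value back into a save file.

```python
import json

name = "\udb40\udc21\udb40\udc21 \u2067Alex"

# Every non-ASCII code point, surrogates included, becomes a \uXXXX escape,
# so the JSON text is plain ASCII and always encodable.
text = json.dumps({"NickName": name}, ensure_ascii=True)
payload = text.encode("utf-8")

# Reading it back: Python's JSON decoder joins each high/low escape pair
# into a single code point, so the nickname survives the round trip.
restored = json.loads(payload.decode("utf-8"))["NickName"]
print(ascii(restored))
```

The cost is the one described above: Japanese property names are written as escape sequences too, so the JSON is harder to read by eye.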

cheahjs commented 7 months ago

https://github.com/cheahjs/palworld-save-tools/pull/56 (and thus v0.13.0) should have fixed this. Let me know if you are still running into Unicode errors.