TeamSpen210 / HammerAddons

Useful tweaks and content for Source Engine Games

Exception occurs if the map has non-ASCII characters in a keyvalue #272

Open dounai2333 opened 3 months ago

dounai2333 commented 3 months ago

When the map contains any non-ASCII text in a keyvalue, the postcompiler throws an exception with the following traceback:

[I] postcompiler.main(): Mounting BSP packfile...
[E] logger.except_handler(): Uncaught Exception:
Traceback (most recent call last):
  File "hammeraddons\postcompiler.py", line 381, in <module>
  File "trio\_core\_run.py", line 2010, in run
  File "hammeraddons\postcompiler.py", line 201, in main
  File "srctools\bsp.py", line 1303, in __get__
  File "srctools\bsp.py", line 2917, in _lmp_read_ents
  File "src\\srctools\\_tokenizer.pyx", line 449, in srctools._tokenizer.Tokenizer.__init__
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 275215-275222: surrogates not allowed
[13476] Failed to execute script 'postcompiler' due to unhandled exception!

How to reproduce:

  1. Create a "game_text" entity.
  2. Set its "Message Text" to any non-ASCII text, for example "--我是中国人--".
  3. Compile the map with postcompiler and you'll see the exception.

If you remove the minus signs and leave only non-ASCII characters, vbsp automatically removes the text and leaves the keyvalue empty.
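
For reference, the `surrogates not allowed` message in the traceback can be reproduced in plain Python. The sketch below assumes (it is not confirmed in this thread) that the entity data was decoded with `errors="surrogateescape"` and later re-encoded strictly as UTF-8; the GBK encoding is also an assumption, standing in for the "ANSI" code page on a Chinese-locale Windows system.

```python
# Assumption: the keyvalue was saved in GBK (a possible "ANSI" code page).
raw = '"message" "--我是中国人--"'.encode("gbk")

# Assumption: the bytes are decoded as UTF-8 with surrogateescape, so each
# byte that is not valid UTF-8 becomes a lone surrogate (U+DC80..U+DCFF).
text = raw.decode("utf-8", errors="surrogateescape")

error = None
try:
    # A strict UTF-8 encode refuses lone surrogates.
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    error = str(exc)

print(error)
```

Encoding the same string with `errors="surrogateescape"` instead gives back the original bytes unchanged.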

vrad-exe commented 3 months ago

This might be a limit of the VMF format itself? I was pretty sure it's encoded as ASCII.

dounai2333 commented 3 months ago

> This might be a limit of the VMF format itself? I was pretty sure it's encoded as ASCII.

Hammer saves the VMF in an ANSI encoding, but that doesn't stop non-ASCII characters from working. If you compile the map without the postcompiler, it succeeds with no problems.

Edit: I tried compiling the map with the VMF saved as UTF-8, and I still get the same error.

dounai2333 commented 3 months ago

I have added a game_text entity to one of my test maps. You can try it with the postcompiler and see how it goes: https://www.mediafire.com/file/2k86k7mssrxt5rq/test_map4.vmf/

TeamSpen210 commented 3 months ago

The proper encoding for Source file formats is not really well specified. VMFs aren't really decoded at all; the code just deals directly with bytes and assumes they behave like ASCII, so any ASCII-compatible encoding works. I'm really not sure how my postcompiler should try to be compatible...
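
One way to mimic that "treat it as bytes" behaviour in Python is a `surrogateescape` round trip. This is only a minimal sketch of the idea, not how srctools or the postcompiler actually work; `decode_ents` and `encode_ents` are hypothetical helper names made up for illustration.

```python
def decode_ents(data: bytes) -> str:
    """Decode as UTF-8, carrying undecodable bytes through as lone surrogates."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_ents(text: str) -> bytes:
    """Reverse decode_ents(), turning escaped surrogates back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# The same kind of keyvalue written in three different encodings.
samples = [
    ("utf-8", "--我是中国人--"),
    ("gbk", "--我是中国人--"),
    ("cp1252", "--café--"),
]

for codec, message in samples:
    raw = b'"message" "' + message.encode(codec) + b'"'
    text = decode_ents(raw)
    # The ASCII structure stays readable whatever the original encoding was.
    assert text.startswith('"message" "--')
    # Unmodified text is written back byte-for-byte.
    assert encode_ents(text) == raw
```

The catch is that every step that turns the string back into bytes must use the same error handler; a strict UTF-8 encode anywhere along the way raises `UnicodeEncodeError`. Multi-byte code pages such as GBK can also use bytes in the ASCII range as the second byte of a character, so "ASCII-compatible" holds only approximately for them.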