AmyrAhmady / samp-node

a SA-MP plugin to run nodejs scripts

Non-Latin symbols issues #2

Open BonhommeG opened 4 years ago

BonhommeG commented 4 years ago

Hello,

There are issues with UTF-8 encoding, because SA is an ANSI-based game (https://forum.sa-mp.com/showpost.php?p=3417389&postcount=10). As a result, text containing non-Latin characters becomes unreadable when it passes through text functions and callbacks, such as the SendClientMessage function or the OnDialogResponse callback.

I tried to implement an encoding conversion to win1251 (https://github.com/BonhommeG/samp-node/commit/c1f2c471afd8b5963f6cf88d1decc5be69235896), but ANSI covers many code pages (https://en.wikipedia.org/wiki/Windows_code_page#List) and I didn't find a solution that works for all cases.

But I suppose we can find something useful in @ikkentim's solution to a similar problem (https://github.com/ikkentim/SampSharp/tree/master/env/codepages).
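
To make the win1251 conversion above concrete, here is a minimal sketch of the idea. It is not the code from the linked commit: `encodeWin1251` is a hypothetical helper name, and it only covers ASCII and the basic Russian alphabet, replacing everything else with `?`. A full solution would need a complete table per code page.

```typescript
// Hypothetical helper (not part of samp-node): encode a JavaScript string
// as Windows-1251 bytes. Only ASCII and the Russian alphabet are mapped;
// any other UTF-16 code unit is replaced with '?' (0x3F).
function encodeWin1251(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      out[i] = code; // ASCII is identical in Windows-1251
    } else if (code >= 0x0410 && code <= 0x044f) {
      out[i] = code - 0x0410 + 0xc0; // U+0410..U+044F map to 0xC0..0xFF
    } else if (code === 0x0401) {
      out[i] = 0xa8; // Cyrillic capital letter Io
    } else if (code === 0x0451) {
      out[i] = 0xb8; // Cyrillic small letter Io
    } else {
      out[i] = 0x3f; // unmapped character
    }
  }
  return out;
}

// Usage: "Привет" becomes the bytes cf f0 e8 e2 e5 f2.
const helloBytes = encodeWin1251("Привет");
```

Because the loop works on UTF-16 code units, a character outside the Basic Multilingual Plane (an emoji, for example) turns into two `?` bytes in this sketch.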

AmyrAhmady commented 4 years ago

Hey!

Thanks for your report! I just took a look at the commit in your fork, and my question is: does your method for encoding conversion actually work (in that codepage only, obviously)? If it does, I can try to make a more optimized version of it and add other codepages as well, though I wonder how much it is going to hurt the performance of calling natives and firing events on public calls.

BonhommeG commented 4 years ago

does your method for encoding conversion actually work (in that codepage only, obviously)?

Yes, this works properly for my case

AmyrAhmady commented 4 years ago

Glad to hear that, I'll look into it then

TheEVolk commented 3 years ago

The iconv-lite npm package fixes this, but I don't think it is a permanent solution. A native implementation would be much better.

dockfries commented 2 years ago

Same problem.

More precisely, the problem is not limited to one script: it affects any non-ASCII language, e.g. Chinese or Russian, because the character set of SA-MP chat boxes and dialogs depends on the character encoding of the Windows system, while the SA game itself displays textdraws and gametext as ASCII (maybe).

TheEVolk's solution is a good one and may be the only one. The principle should be to convert the string into an array of bytes in the appropriate encoding (it looks like an array of decimal numbers), then run it through a JavaScript ES6 syntax conversion and add a 0 at the end, which I think marks the end of the byte sequence. Ultimately the system decides how to turn those bytes back into text that people can read. This solves the display of characters. @AmyrAhmady
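
The byte-array step described above can be sketched roughly as follows. This is an illustration only: `toNullTerminated` and `fromNullTerminated` are hypothetical names, the "ES6 syntax" in the comment is read here as spread syntax (an assumption), and whether samp-node accepts such an array in place of a string argument is not confirmed by this thread.

```typescript
// Hypothetical helpers (not part of samp-node or iconv-lite).

// Turn encoded bytes into a plain array of decimal numbers and append a
// terminating 0. With an ES2015+ target this is the same as [...bytes, 0].
function toNullTerminated(bytes: Uint8Array): number[] {
  const out: number[] = [];
  for (let i = 0; i < bytes.length; i++) {
    out.push(bytes[i]);
  }
  out.push(0); // terminator marking the end of the byte sequence
  return out;
}

// Inverse direction: keep everything before the first 0.
function fromNullTerminated(values: number[]): Uint8Array {
  let end = values.indexOf(0);
  if (end === -1) end = values.length; // no terminator: take all values
  const out = new Uint8Array(end);
  for (let i = 0; i < end; i++) {
    out[i] = values[i];
  }
  return out;
}

// Usage: two Windows-1251 bytes become [207, 240, 0].
const terminated = toNullTerminated(new Uint8Array([0xcf, 0xf0]));
```

The input bytes would come from whichever encoder is in use, for example the Buffer returned by iconv-lite's `encode`.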

So I think all actionable chat and dialog events should take an additional parameter specifying the character set (its default value being the character set configured in samp-node.json), with the text converted internally by iconv.encode; for events like onPlayerText, the incoming text would likewise be converted back to UTF-8 by iconv.decode.
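
A rough sketch of that proposal, under several assumptions: the `config.charset` field stands in for a value that would be read from samp-node.json (no such key is confirmed here), `encodeOutgoing` and `decodeIncoming` are hypothetical names, and a small built-in ISO-8859-1 codec replaces iconv so the example runs on its own.

```typescript
// All names below are hypothetical; none of this is samp-node's actual API.
interface Codec {
  encode(text: string): Uint8Array;
  decode(bytes: Uint8Array): string;
}

// Minimal ISO-8859-1 codec: code points below 0x100 map 1:1 to bytes.
const latin1: Codec = {
  encode(text: string): Uint8Array {
    const out = new Uint8Array(text.length);
    for (let i = 0; i < text.length; i++) {
      const code = text.charCodeAt(i);
      out[i] = code < 0x100 ? code : 0x3f; // '?' for unmappable characters
    }
    return out;
  },
  decode(bytes: Uint8Array): string {
    let text = "";
    for (let i = 0; i < bytes.length; i++) {
      text += String.fromCharCode(bytes[i]);
    }
    return text;
  },
};

// Registry of available codecs; more code pages would be added here.
const codecs: Record<string, Codec> = { "iso-8859-1": latin1 };

// Stand-in for the configurable default from samp-node.json (assumed key).
const config = { charset: "iso-8859-1" };

function getCodec(charset: string): Codec {
  const codec = codecs[charset.toLowerCase()];
  if (!codec) throw new Error("Unsupported charset: " + charset);
  return codec;
}

// Outgoing text (chat messages, dialogs): string to game-side bytes.
function encodeOutgoing(text: string, charset: string = config.charset): Uint8Array {
  return getCodec(charset).encode(text);
}

// Incoming text (events like onPlayerText): game-side bytes to string.
function decodeIncoming(bytes: Uint8Array, charset: string = config.charset): string {
  return getCodec(charset).decode(bytes);
}
```

Keeping the charset as an optional last parameter means existing calls keep working with the configured default, while a per-player choice could be passed explicitly.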


A better solution would detect the game version and let each player choose the character set used for conversion.

The remakes of the trilogy are developed for the next generation and use the more standardized UTF-8, so if a future open.mp release provides a client that lets players connect with the remake version, conversion would usually not be required, while for the old San Andreas the character set should be chosen by the player.

Finally, thank you for giving us a better option to use javascript / typescript to program.