BadgerCode / tttdamagelogs

Damagelogs and RDM Manager for Trouble in Terrorist Town (a Garry's Mod gamemode)
GNU General Public License v3.0

Report reason/response do not support unicode characters #95

Closed BadgerCode closed 1 month ago

BadgerCode commented 1 year ago

Steps to reproduce

  1. Report a player
  2. Make the reason ㄱ ㄴ ㄷ ㄹ ㅁ ㅂ ㅅ ㅇ ㅈ ㅊ ㅋ
  3. Make the reported player's response ㄱ ㄴ ㄷ ㄹ ㅁ ㅂ ㅅ ㅇ ㅈ ㅊ ㅋ
  4. View the RDM manager reports

Expected outcome

The reason and response are displayed exactly as entered.

Actual outcome

The characters are not displayed correctly.

GordonFrohman11 commented 2 months ago

So you said that this is related to the database issue described in https://github.com/BadgerCode/tttdamagelogs/issues/112. Is there a temporary workaround? And how long will I have to wait for it to actually be resolved? This is a serious problem on a non-English server such as mine.

BadgerCode commented 2 months ago

Heyo! Unfortunately no workaround I know of. As far as I know, this issue has always existed in the addon.

I agree this is a major issue for non-English-speaking servers.


Having thought about it some more, I don't think this is actually an issue with the database. Reports are not stored in the database; they are sent between the server and client using the net messages library. I'll need to do more testing.

BadgerCode commented 2 months ago

Ok I've done a quick investigation and found the cause.

https://github.com/BadgerCode/tttdamagelogs/blob/a927f636c16e4ef5b2bd4d6a92d48cb28bba47e9/lua/damagelogs/server/rdm_manager.lua#L304

Workaround

If you remove the line mentioned above from the code, the special characters ㄱ ㄴ ㄷ ㄹ ㅁ ㅂ ㅅ ㅇ ㅈ ㅊ ㅋ show up correctly.

Permanent fix

Now I need to work out what this pattern is supposed to do:

    message = string_gsub(string_gsub(message, "[^%g\128-\191\194-\197\208-\210 ]+", ""), "%s+", " ")

I've tracked this back to https://github.com/Tommy228/tttdamagelogs/issues/326. It seems it was possible to crash players using some special characters.

I can see the pattern was changed a few times.

Apparently %g means "Printable characters (not including space)", and the uppercase version, %G, is the opposite: "The uppercase variant represents the inverse of the set". Source: https://cheatography.com/ambigious/cheat-sheets/lua-string-patterns/

It looks like it does two things:

  1. Find and remove undesirable characters
    1. Any byte that is not a printable ASCII character, a space, or in one of three byte ranges (128-191, 194-197, 208-210), which appear to have been added to allow Cyrillic and Polish characters
  2. Replace consecutive whitespace (spaces, tabs, etc.) with a single space
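
To see why this pattern mangles the test string from the steps to reproduce, here is a minimal sketch in plain Lua. The sanitize wrapper is a hypothetical name used for illustration, not a function from the addon, and the sketch assumes an interpreter whose patterns support %g (Lua 5.2+ or LuaJIT). ㄱ is U+3131, which UTF-8 encodes as the bytes 227, 132, 177. The lead byte 227 is outside every allowed range, while the two continuation bytes fall inside 128-191, so only part of the character is removed.

```lua
-- Hypothetical wrapper around the pattern quoted in this issue.
-- Assumes %g is supported in patterns (Lua 5.2+ or LuaJIT).
local function sanitize(message)
    -- Step 1: drop every byte that is not printable ASCII, a space,
    -- or inside the byte ranges 128-191, 194-197, 208-210.
    message = string.gsub(message, "[^%g\128-\191\194-\197\208-\210 ]+", "")
    -- Step 2: collapse runs of whitespace into a single space.
    message = string.gsub(message, "%s+", " ")
    return message
end

local hangul = "\227\132\177"     -- UTF-8 bytes of the character U+3131
local cyrillic = "\208\159"       -- UTF-8 bytes of the character U+041F

-- The lead byte 227 is stripped, leaving two orphaned continuation bytes,
-- which is no longer valid UTF-8.
print(#hangul, #sanitize(hangul))

-- The Cyrillic lead byte 208 is in an allowed range, so nothing is removed.
print(sanitize(cyrillic) == cyrillic)

-- Plain ASCII text only has its repeated spaces collapsed.
print(sanitize("hello   world"))
```

Because the surviving bytes are continuation bytes with no lead byte, anything that later decodes the string as UTF-8 cannot reconstruct the original character, which would be consistent with the garbled output in the report.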

Proposed fix

I have re-tested the original issue (https://github.com/Tommy228/tttdamagelogs/issues/326). It no longer causes the game to crash.

I'm going to remove this check, both here and in the other places in the code where it appears. The Unicode character set is far too large to try to whitelist valid characters.

If there are any problematic characters, we can find & remove those specific characters.
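
A rough sketch of what removing specific characters could look like, in plain Lua. Everything here is hypothetical: the blocked table, the helper names, and the two example entries are placeholders chosen for illustration, not characters known to cause problems in the addon.

```lua
-- Hypothetical blocklist: each entry is the UTF-8 byte sequence of one
-- character to strip. These entries are examples only.
local blocked = {
    "\226\128\174", -- U+202E RIGHT-TO-LEFT OVERRIDE
    "\239\187\191", -- U+FEFF ZERO WIDTH NO-BREAK SPACE
}

-- Prefix every punctuation character with % so that a blocklist entry
-- is always matched literally rather than as a pattern.
local function escape_pattern(s)
    return (string.gsub(s, "%p", "%%%0"))
end

-- Remove each blocked sequence and leave every other byte untouched.
local function remove_blocked(message)
    for _, seq in ipairs(blocked) do
        message = string.gsub(message, escape_pattern(seq), "")
    end
    return message
end

-- A blocked character is removed from the middle of a message.
print(remove_blocked("a\226\128\174b"))

-- Text that contains no blocked character passes through unchanged.
local hangul = "\227\132\177"
print(remove_blocked(hangul) == hangul)
```

Matching whole byte sequences rather than individual bytes means a multi-byte character is either removed completely or left intact, so the result stays valid UTF-8.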

BadgerCode commented 1 month ago

This will be fixed in the next release