Open ParzivalWolfram opened 4 years ago
This is a good idea. It may be somewhat difficult to figure out if a bot's name is intended to be visually similar to your own though.
They tend to merely append to the name, so I'd just check a slice of the same length as your name, starting from the first character, for a match. This would catch most of them...
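A minimal sketch of that slice check, in Python for brevity. The function name and the sample names are made up for illustration; nothing here is taken from the bot detector's code.

```python
def starts_with_name(candidate: str, own_name: str) -> bool:
    """Return True if candidate is own_name with something appended."""
    if not own_name or len(candidate) <= len(own_name):
        return False
    # Compare a slice of the same length as our name, from the first character.
    return candidate[: len(own_name)] == own_name


if __name__ == "__main__":
    # A zero-width space (U+200B) appended to the name is still caught.
    print(starts_with_name("Alice\u200b", "Alice"))
    print(starts_with_name("Bob", "Alice"))
```

This only covers the "append" case; a name with characters inserted in the middle would slip through.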
I've seen a bunch with zero-width characters in the middle of the name, but maybe you can just remove those? Compare only letters and numbers and remove everything else from the name. If it's identical, then prioritize that bot.
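Roughly, that idea could look like the sketch below. It assumes Python's `str.isalnum()` as the definition of "letters and numbers"; the helper names are hypothetical.

```python
def alnum_key(name: str) -> str:
    """Keep only the characters that str.isalnum() accepts."""
    return "".join(ch for ch in name if ch.isalnum())


def names_collide(a: str, b: str) -> bool:
    """True if two names are identical once everything else is removed."""
    key = alnum_key(a)
    # An empty key means the name had no letters or digits at all; ignore it.
    return bool(key) and key == alnum_key(b)


if __name__ == "__main__":
    # U+200B (zero-width space) in the middle of the name is dropped.
    print(names_collide("Al\u200bice", "Alice"))
```

Note that `str.isalnum()` also accepts non-Latin letters, so a lookalike such as Cyrillic "і" (U+0456) in place of "i" is kept and the names would not match.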
This might be a better way to go for the kick rules in general, as I have seen bots using new Unicode characters that aren't included in the default rule set. Once you add one to the kick rules, they will just use another. If there were a better way of checking whether they are trying to copy the name of someone else on the team, that might be more robust.
This would end up being abusable to hell, as Unicode shenanigans are very common when trying to remove or change characters and a very similar thing led to hacking PSPs a long time ago.
@ParzivalWolfram can you give an example? Couldn't you just convert from Unicode to ASCII and then remove all bytes that aren't letters and numbers?
Examples: pretty much any string that has been found to crash Apple devices when sent to them. (There are more instances of this than just these two, but there's no good explanation of the others.) Most of them work in similar ways: something small, some edge case, is tailored to always happen. https://youtu.be/hJLMSllzoLA and https://youtu.be/jC4NNUYIIdM
Not everyone on Steam sticks to ASCII, either. From what I've seen, a decent number of Steam users use Unicode characters in their names (this includes emoji!), so you'd have to apply the conversion to both players' names. There's also no good way to ensure that a single multi-byte Unicode character doesn't become several "valid" one-byte ASCII characters in your programming language's standard conversion method. This is due to how Unicode works (https://youtu.be/MijmeoH9LT4) and because every language deals with the conversion differently. Some languages, like Python, just give up (raise an error) if any non-ASCII text is present in the string to convert; some will just take the bytes and reinterpret them; and some will convert the string as best they can, removing any character they can't convert.
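To make those three behaviours concrete, here is a small Python sketch. The sample name is made up, and other languages will behave differently, which is the point being made above.

```python
# "Zoë" followed by a grinning-face emoji, written with escapes.
name = "Zo\u00eb\U0001F600"

# 1. Strict conversion: Python refuses instead of guessing.
try:
    name.encode("ascii")
    strict_result = "converted"
except UnicodeEncodeError:
    strict_result = "error"

# 2. Lossy conversion: unconvertible characters are dropped or replaced.
dropped = name.encode("ascii", errors="ignore").decode("ascii")
replaced = name.encode("ascii", errors="replace").decode("ascii")

# 3. Reinterpretation: the two UTF-8 bytes of "ë" read as two one-byte
#    Latin-1 characters instead of one character.
reinterpreted = "\u00eb".encode("utf-8").decode("latin-1")

if __name__ == "__main__":
    print(strict_result, repr(dropped), repr(replaced), len(reinterpreted))
```

The third case is the dangerous one for name matching: one character silently turns into two different, perfectly valid-looking characters.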
Text is really really hard to deal with, especially when working with it in the way the TF2 bots work (which can change anytime, possibly even into a potential exploit in the game or bot detector if either dev team slips up.)
I don't see how that would create any problems with something like:
Convert names of everyone to alphanumeric string.
When someone joins, convert their name and compare it to all the others.
If there is a match, warn the user and mark the player as suspicious.
All we are doing is removing all characters that are not within the subset of most commonly used characters in people's nicknames.
Yes, some people use Unicode characters in their name, but it's quite uncommon for their name to exactly match someone else's with those characters removed. 99% of the time this would flag a bot trying to mimic someone else.
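The three steps above can be sketched like this. The `Roster` class and its method are hypothetical names for illustration, and "alphanumeric" is assumed to mean whatever Python's `str.isalnum()` accepts.

```python
from typing import Dict, Optional


def alnum_key(name: str) -> str:
    """Convert a name to its alphanumeric-only form."""
    return "".join(ch for ch in name if ch.isalnum())


class Roster:
    """Tracks the converted names of everyone currently on the server."""

    def __init__(self) -> None:
        self._by_key: Dict[str, str] = {}  # converted name -> original name

    def join(self, name: str) -> Optional[str]:
        """Register a joining player.

        Returns the existing player's name if the newcomer's converted name
        matches it (the newcomer should be marked suspicious), else None.
        """
        key = alnum_key(name)
        if not key:
            # Names with no letters or digits would all collide on "", so skip them.
            return None
        existing = self._by_key.get(key)
        if existing is None:
            self._by_key[key] = name
        return existing


if __name__ == "__main__":
    roster = Roster()
    roster.join("Alice")
    print(roster.join("Al\u200bice"))  # matches the existing "Alice"
```

Whoever joined first is treated as the original here, which is an assumption; a real check would probably also look at connection time or Steam ID.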
My points were as follows:
I'd like to, however, add an additional point:
More needs to be done in the check than "just drop all non-English characters."
A very resource intensive but unfailable approach would be to take the text rendering logic from TF2's 2017 leak and use that to visually compare names. This way, any new characters that they try to use will be detected the same way. But I don't think the benefit from this could possibly justify the cost.
Besides uh... copyright and all that
I'm pretty sure they just use DirectWrite or some other Windows-specific text rendering scheme; that's why fonts look bad on Linux/Mac. I don't think it would be impossible to get working; it's one of the options I've been considering.
I had a little look at the ICU API for Unicode strings. I think there is a way to do what I said previously. You need to do something like:
Normalize string to NFKD form
(this replaces characters that look identical and splits up all the combining characters)
Filter out all non-printing and diacritic marks
Use uspoof_areConfusable() to check if the resulting string looks similar to any of our other results
I can't figure out how to get the filtering to work, though. I'm very inexperienced with C++. I think you need to use transliteration, but reading the docs is like reading Greek to me. There doesn't seem to be a simple list of character categories that can be easily filtered.
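For what it's worth, the first two steps can be sketched with Python's standard `unicodedata` module, which exposes the Unicode general category of each character. This is not ICU and it does not reproduce `uspoof_areConfusable()`; the function name `skeleton` is just a label chosen for this example.

```python
import unicodedata


def skeleton(name: str) -> str:
    """NFKD-normalize a name, then drop marks and non-printing characters."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = []
    for ch in decomposed:
        category = unicodedata.category(ch)
        # M* = combining marks (diacritics); C* = control, format
        # (e.g. zero-width characters), unassigned, private use, surrogates.
        if category[0] in ("M", "C"):
            continue
        kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    print(skeleton("Zo\u00eb"))       # precomposed e-with-diaeresis
    print(skeleton("\uff21lice"))     # fullwidth Latin capital A
    print(skeleton("Al\u200bice"))    # zero-width space in the middle
```

Lookalikes from other scripts (Cyrillic, Greek and so on) survive this step unchanged, which is the part the confusables check in step three would have to handle.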
This is dependent on #218
A very resource intensive but unfailable approach would be to take the text rendering logic from TF2's 2017 leak and use that to visually compare names. This way, any new characters that they try to use will be detected the same way. But I don't think the benefit from this could possibly justify the cost.
Besides uh... copyright and all that
about this: I think it's worth having a talk with mastercoms, since I remember her benchmarking the impact of net_graph, so she might have insights as to how TF2 draws text
Like most games, TF2 uses a font page system, so drawing characters that have previously been drawn is very very cheap.
I wrote a little bit of test code to see if I could get this working.
Here it is : https://github.com/andy013/unicode-string-compare-test/blob/main/Unicode%20test2.cpp
You can run it in Visual Studio as long as you add "icu.lib" to your additional dependencies in Project Properties > Linker > Input. (I think this is included in most recent versions of Windows 10)
You can change the strings at the top of the file and see if you can break it. The code is probably really bad but this is just a proof of concept to see if it's something worth doing.
This won't be able to detect if someone copies your name and then adds on a small symbol or punctuation mark to the end. You could be more aggressive with the filtering but then you potentially let through more substitutions for letters.
I updated it to just filter out a bunch of punctuation now. I think it's probably better to be more aggressive with the filtering as substitutions are weaker than blank characters since they can only be used to copy people who use a specific character in their name.
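The trade-off between mild and aggressive filtering can be illustrated with a small Python sketch. The category sets below are an assumption chosen for this example, not the list used in the linked test code.

```python
import unicodedata

MILD = "MC"            # combining marks and control/format characters only
AGGRESSIVE = "MCPSZ"   # plus punctuation, symbols and separators (spaces)


def strip_categories(name: str, prefixes: str) -> str:
    """Drop characters whose Unicode general category starts with a letter in prefixes."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(
        ch for ch in decomposed if unicodedata.category(ch)[0] not in prefixes
    )


if __name__ == "__main__":
    # A trailing full stop survives the mild filter but not the aggressive one.
    print(strip_categories("Alice.", MILD))
    print(strip_categories("Alice.", AGGRESSIVE))
```

The cost of the aggressive setting is more false positives: two genuinely different names such as "Mr. T" and "MrT" reduce to the same string.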
Something like this might be useful as an early warning, notifying the user that it looks as if someone is trying to copy another player's name. Then the user can double-check whether it's a bot and add them to their ban list if it is.
Btw, does anyone know if there is a way to change your name in TF2 without going through steam (possibly with some cheat command)? I tried to use the name console command but it just seems to be overwritten by your steam name. It would be useful to be able to change this while testing. EDIT: I figured out you can change the name of bots by using "bot -name "bots name"" when you are adding them. This is useful for testing as you can paste a bunch of unicode characters into the bots name and then see how they appear in-game.
This is unnecessary now due to this update: https://www.teamfortress.com/post.php?id=85643
* Added a ConVar to control players changing their name during a match
  - tf_allow_player_name_change: default is 1
  - Matchmaking servers will set this to 0
The change is still needed. Before the cvar was added, players were already unable to change names while inside a match by normal means; this cvar was just added so community servers could allow the functionality back if they so desired.
You are still able to steal a player's name easily by using the SteamAPI to look up the Valve casual server you are about to connect to, which lists the in-game usernames of the players currently connected, and then telling the game your name is the same as one of them (apart from an invisible character) once you are connecting. This lets you steal a player's name without ever having to change your name while connected, and it's the way name-stealing bots have been doing it for a long time.
If possible, detect if a bot has stolen YOUR name and prioritize kicking that bot.
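As a rough sketch of the prioritization itself, assuming a list of already-suspected bot names and the simple alphanumeric comparison discussed in the thread (all names here are hypothetical):

```python
from typing import List


def alnum_key(name: str) -> str:
    """Keep only the characters that str.isalnum() accepts."""
    return "".join(ch for ch in name if ch.isalnum())


def kick_order(suspects: List[str], own_name: str) -> List[str]:
    """Return suspects with impersonators of own_name moved to the front."""
    own_key = alnum_key(own_name)
    # sorted() is stable, so the original order is kept within each group.
    # False sorts before True, so matching names come first.
    return sorted(
        suspects,
        key=lambda s: not (own_key and alnum_key(s) == own_key),
    )


if __name__ == "__main__":
    print(kick_order(["xXbotXx", "Al\u200bice", "spam"], "Alice"))
```

Any stronger comparison (normalization, confusable detection) could be swapped in for `alnum_key` without changing the ordering logic.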