FreeFeed / freefeed-server

FreeFeed server
https://freefeed.net
MIT License

Do not count invisible characters in displaynames #383

Open indeyets opened 5 years ago

indeyets commented 5 years ago

We enforce a minimum number of characters in the displayname check, but the problem is that the check treats invisible characters as meaningful data too. It should not.

I suggest we remove all invisible characters from the displayname before checking its length. This way, only visible characters would be counted, and we would not end up in a situation where a user specifies 3 invisible characters and gets an empty displayname as a result.

See https://stackoverflow.com/questions/11598786/how-to-replace-non-printable-unicode-characters-javascript for some inspiration.
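
A minimal sketch of this suggestion in JavaScript, not the project's actual code. The names `INVISIBLE`, `visibleLength`, and `isLongEnough`, and the minimum of 3, are hypothetical. It also assumes that "invisible" means the Unicode general categories Cc (control), Cf (format, which includes U+200B ZERO WIDTH SPACE), and Z (separators, including the ordinary space); the issue itself does not define the set.

```javascript
// Assumption: "invisible" = control (Cc), format (Cf), and separator (Z) characters.
// The issue does not say which characters are meant, so this set is illustrative only.
const INVISIBLE = /[\p{Cc}\p{Cf}\p{Z}]/gu;

// Count only the code points that remain after discounting invisible characters.
// The original string is not modified, it is only measured.
function visibleLength(displayName) {
  return Array.from(displayName.replace(INVISIBLE, "")).length;
}

// Hypothetical check; the real minimum used by the server is not stated in this thread.
function isLongEnough(displayName, min = 3) {
  return visibleLength(displayName) >= min;
}
```

With this check, a displayname made of three zero-width spaces has a visible length of 0 and is rejected, while the same characters embedded in a longer name are accepted and simply not counted.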

abbra commented 5 years ago

Does it mean filtering them completely or accepting but discounting?

indeyets commented 5 years ago

@abbra "accepting but discounting"

abbra commented 5 years ago

Thanks. So, that means we can use a shorter table, http://jkorpela.fi/chars/spaces.html, to filter such characters out before supplying the string to countBreaks().

davidmz commented 5 years ago

No, this is a table of spaces, not of invisible characters.

abbra commented 5 years ago

What you call 'invisible characters' are actually called 'whitespace characters' in Unicode. For example, https://en.wikipedia.org/wiki/Whitespace_character#Unicode lists 25 of those, in addition to 6 characters that have no WS property in the current standard but effectively represent a (potentially zero-width) white space, i.e. are invisible.

Do you have anything on top of those 31?

davidmz commented 5 years ago

I made a mistake: I meant non-printable characters, not invisible ones. Unfortunately, the issue description does not specify which characters are meant or what kind of username is wrong. But whitespace characters in screennames are perfectly acceptable (except for leading and trailing spaces), although we probably should not count zero-width spaces.

But besides zero-width spaces, there are also other non-printable characters, see https://en.wikipedia.org/wiki/C0_and_C1_control_codes and https://en.wikipedia.org/wiki/Unicode_control_characters
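
One possible reading of this distinction, sketched in JavaScript. The names `NON_PRINTABLE` and `countedLength` are hypothetical, and the choice of categories is an assumption: Cc covers the C0 and C1 control codes, and Cf covers format characters such as the zero-width space. Ordinary inner spaces are counted, and leading and trailing whitespace is ignored.

```javascript
// Assumption: non-printable = Unicode categories Cc (C0/C1 controls) and Cf (format
// characters, including U+200B ZERO WIDTH SPACE). Ordinary whitespace is NOT included.
const NON_PRINTABLE = /[\p{Cc}\p{Cf}]/gu;

// Discount non-printable characters first, then ignore leading and trailing whitespace.
// Inner spaces still count toward the length.
function countedLength(screenName) {
  const withoutNonPrintable = screenName.replace(NON_PRINTABLE, "");
  return Array.from(withoutNonPrintable.trim()).length;
}
```

Note that Cf also contains U+200D ZERO WIDTH JOINER, which is used inside some emoji sequences, so a rule like this is better suited to counting than to rewriting the stored name.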

abbra commented 5 years ago

Ok, thanks for the confirmation. I'll look into it.

davidmz commented 5 years ago

I'd like to point out that invisible symbols aren't the only thing. Words "ä̍̎̏" or, for example, "ёж" are also valid screennames.
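
To illustrate why such names complicate length checks, here is a hedged JavaScript sketch comparing three ways of measuring a string. It uses the built-in `Intl.Segmenter` (available in recent Node.js versions) as a stand-in; it is not the countBreaks() helper mentioned earlier in the thread, and the function name `measure` is hypothetical. The sample strings are written with escapes and are assumed to resemble the examples above: a base letter followed by several combining marks.

```javascript
// Segmenter that splits text into user-perceived characters (grapheme clusters).
const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Report the three common notions of "length" for a string.
function measure(text) {
  return {
    utf16Units: text.length,                  // what String.prototype.length reports
    codePoints: Array.from(text).length,      // Unicode code points
    graphemes: Array.from(graphemeSegmenter.segment(text)).length, // visible characters
  };
}

// "a" followed by four combining marks: one visible character, five code points.
const stacked = measure("a\u0308\u030D\u030E\u030F");

// Cyrillic "е" + combining diaeresis, then "ж": two visible characters, three code points.
const twoLetters = measure("\u0435\u0308\u0436");
```

Depending on which of these measures the minimum-length check uses, the same name can pass or fail, so the rule should state explicitly what it counts.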