This is the next step after implementing https://github.com/TryQuiet/quiet/issues/2299. It requires more sophisticated handling of non-Latin scripts in usernames and channel names.
As per the comment that Holmes left in #2299:
I'm adding this note on how to handle names in non-Latin scripts for users (and possibly channels) without allowing homograph attacks.
The idea is to have a list of tuples of "confusable" glyphs and let you use any of them, but block you from using one if another registered name is the same except for the confusable glyphs.
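A minimal sketch of that idea, under stated assumptions: the `CONFUSABLE_GROUPS` table, `skeleton`, and `canRegister` are hypothetical names invented for illustration, and the four groups are a tiny hand-picked sample, not a real confusables dataset. Each name is reduced to a "skeleton" in which every confusable glyph is replaced by one representative of its group, and a new name is rejected only when its skeleton matches that of an already registered name.

```typescript
// Hypothetical sample table: each group lists glyphs treated as interchangeable.
// A real deployment would need a far larger, maintained table.
const CONFUSABLE_GROUPS: string[][] = [
  ["a", "\u0430"],           // Latin a, Cyrillic a
  ["e", "\u0435"],           // Latin e, Cyrillic ie
  ["o", "\u043E", "\u03BF"], // Latin o, Cyrillic o, Greek omicron
  ["p", "\u0440"],           // Latin p, Cyrillic er
];

// Map every glyph to the first member of its group.
const CANONICAL = new Map<string, string>();
for (const group of CONFUSABLE_GROUPS) {
  for (const glyph of group) {
    CANONICAL.set(glyph, group[0]);
  }
}

// Reduce a name to a comparison-only form; the result is never shown to users.
function skeleton(name: string): string {
  const folded = name.normalize("NFKC").toLowerCase();
  return Array.from(folded, (glyph) => CANONICAL.get(glyph) ?? glyph).join("");
}

// Any glyph may be used, but a name is blocked when it differs from a
// registered name only by confusable glyphs.
function canRegister(candidate: string, registered: string[]): boolean {
  const target = skeleton(candidate);
  return !registered.some((existing) => skeleton(existing) === target);
}
```

With this sample table, `canRegister("p\u0430ypal", ["paypal"])` is rejected because the Cyrillic letter collapses onto the Latin one, while a name written entirely in Cyrillic stays available as long as no registered name shares its skeleton.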
Look at IDNA2008 and UTS46. As a general rule, Unicode is meant for display, not string comparison. Each protocol that supports Unicode tends to handle this differently, but they usually require conversion to "A-labels" before comparing presented and reference identifiers. For example, see https://datatracker.ietf.org/doc/rfc9525/ and https://datatracker.ietf.org/doc/draft-ietf-lamps-rfc8398bis/. Part of the reason for UTS46's popularity is the common library support for it.
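A rough sketch of the "convert to A-labels, then compare" step, with loud assumptions: `toALabel` and `sameIdentifier` are hypothetical helpers, and the sketch borrows the WHATWG `URL` host parser (available in browsers and Node) because it applies UTS46 processing to hostnames. Usernames and channel names are not hostnames, so this only illustrates the comparison pattern and is not a recommendation for how Quiet should do it.

```typescript
// Hypothetical helper: return the ASCII ("xn--" prefixed) form of a name,
// or null if the URL host parser will not accept it.
function toALabel(name: string): string | null {
  // Characters with special meaning in URL syntax would change how the host
  // is parsed, so this sketch rejects them up front.
  if (name.length === 0 || /[\s\/\\?#@:%\[\]]/.test(name)) {
    return null;
  }
  try {
    // Note: names that look like IPv4 addresses are treated specially by the parser.
    return new URL(`http://${name}`).hostname;
  } catch {
    return null; // the parser rejected the name
  }
}

// Compare the converted forms, never the raw Unicode strings.
function sameIdentifier(presented: string, reference: string): boolean {
  const a = toALabel(presented);
  const b = toALabel(reference);
  return a !== null && a === b;
}
```

For instance, `sameIdentifier("ESPAÑOL", "español")` holds because both convert to `xn--espaol-zwa`. Note that this conversion alone does not catch cross-script homographs, since Latin and Cyrillic look-alikes still produce different A-labels.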
https://unicode.org/reports/tr46/#Registries
Libraries like this one may help, but I'm not sure yet: https://github.com/oozcitak/uts46
Here was the response I received to my question: