Open sharpninja opened 2 years ago
Would creating a blacklist based on common emojis be a valid solution?
@sharpninja take a look at https://unicode.org/Public/emoji/13.1/emoji-test.txt and https://regex101.com/r/mxJYXs/1
If there is a mathematical relationship in the definition of emojis, that would make scanning faster.
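As far as I know there is no single formula, but many emojis sit in a handful of contiguous Unicode blocks, so a range check could serve as a cheap pre-filter before an exact lookup. A minimal sketch in Python, assuming a deliberately incomplete block list; `EMOJI_BLOCKS` and `in_emoji_block` are hypothetical names, not part of any existing package:

```python
from bisect import bisect_right

# A few Unicode blocks that are dense with emojis. This list is NOT
# exhaustive, and not every code point inside these blocks is an emoji.
EMOJI_BLOCKS = [
    (0x2600, 0x26FF),    # Miscellaneous Symbols
    (0x2700, 0x27BF),    # Dingbats
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F600, 0x1F64F),  # Emoticons
    (0x1F680, 0x1F6FF),  # Transport and Map Symbols
    (0x1F900, 0x1F9FF),  # Supplemental Symbols and Pictographs
]
# Block start values, kept sorted so they can be binary-searched.
_STARTS = [start for start, _ in EMOJI_BLOCKS]


def in_emoji_block(code_point: int) -> bool:
    """Return True if code_point lies inside one of the listed blocks."""
    # Find the last block whose start is <= code_point, then check its end.
    i = bisect_right(_STARTS, code_point) - 1
    return i >= 0 and code_point <= EMOJI_BLOCKS[i][1]


print(in_emoji_block(0x1F4CC))  # pushpin, inside 1F300-1F5FF
print(in_emoji_block(ord("A")))
```

A hit here only means "possibly an emoji"; the exact answer would still come from a table built from emoji-test.txt.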
https://unicode.org/Public/emoji/14.0/emoji-test.txt
This shows all the emojis, plus how to apply the modifiers to them. I may just strip it down to comma-separated lines to build a whitelist.
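Stripping the file down could look roughly like the sketch below. It assumes the `code points ; status # emoji name` line layout documented in the file header; the `SAMPLE` lines are hand-written in that layout for illustration, and `parse_emoji_test` and `to_csv_lines` are hypothetical helper names:

```python
# Hand-written sample in the emoji-test.txt layout (illustrative only).
SAMPLE = """\
# comment lines and blank lines are skipped

1F4CC                                      ; fully-qualified     # 📌 E0.6 pushpin
1F44B 1F3FB                                ; fully-qualified     # 👋🏻 E1.0 waving hand: light skin tone
263A                                       ; unqualified         # ☺ E0.6 smiling face
"""


def parse_emoji_test(text: str, statuses=("fully-qualified",)) -> list[tuple[int, ...]]:
    """Return one tuple of code points per data line with a wanted status."""
    sequences = []
    for line in text.splitlines():
        # Everything after '#' is a human-readable comment.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code_points, _, status = data.partition(";")
        if status.strip() not in statuses:
            continue
        sequences.append(tuple(int(cp, 16) for cp in code_points.split()))
    return sequences


def to_csv_lines(sequences: list[tuple[int, ...]]) -> list[str]:
    """Render each sequence as one comma-separated line of hex code points."""
    return [",".join(f"{cp:04X}" for cp in seq) for seq in sequences]


for row in to_csv_lines(parse_emoji_test(SAMPLE)):
    print(row)
```

Keeping whole sequences, rather than single code points, preserves the modifier combinations the file lists. Whether to include the other status values in the whitelist is a judgment call.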
I will close this as soon as it is on NuGet.
Some code I was examining kept producing hits on
\uD83D\uDCCC
which turns out to be the UTF-16 surrogate pair for the pushpin emoji (U+1F4CC), which is perfectly valid in source code. Before I can use this and report findings from it, we need a table of valid surrogate pairs to exclude.
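For reference, a surrogate pair maps back to its code point by fixed UTF-16 arithmetic, so the exclusion table could store code points and the scanner could combine each pair before the lookup. A minimal sketch in Python; `combine_surrogates`, `is_excluded_pair`, and the one-entry `ALLOWED` set are hypothetical stand-ins for whatever table ends up being used:

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid UTF-16 surrogate pair")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Hypothetical exclusion table holding a single entry for illustration.
ALLOWED = {0x1F4CC}  # pushpin


def is_excluded_pair(high: int, low: int) -> bool:
    """Return True if the pair decodes to a code point in ALLOWED."""
    return combine_surrogates(high, low) in ALLOWED


print(hex(combine_surrogates(0xD83D, 0xDCCC)))  # 0x1f4cc
print(is_excluded_pair(0xD83D, 0xDCCC))         # True
```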