Closed roderickhsiao closed 7 years ago
That’s the expected behavior. E.g. 1 is a text emoji. See http://unicode.org/Public/emoji/5.0/emoji-data.txt:
0030..0039 ; Emoji # 1.1 [10] (0️..9️) digit zero..digit nine
Per spec, they’re only supposed to be rendered in emoji form when followed by a variation selector (see http://unicode.org/Public/emoji/5.0/emoji-sequences.txt). But since you’re using the text regex, you opt in to matching them anyway.
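To illustrate the distinction, here is a minimal sketch that uses JavaScript's built-in Unicode property escapes as a stand-in for emoji-regex (it does not call the library itself). The helper names `hasEmojiProperty` and `hasEmojiPresentation` are made up for this example. It assumes a runtime that supports `\p{…}` with the `u` flag.

```javascript
// Hypothetical helpers for illustration; not part of emoji-regex.
// `\p{Emoji}` covers every code point listed with the Emoji property in
// emoji-data.txt, which includes the digits, "#" and "*".
const hasEmojiProperty = (ch) => /^\p{Emoji}$/u.test(ch);

// `\p{Emoji_Presentation}` covers only code points that render as emoji
// by default, without needing a variation selector.
const hasEmojiPresentation = (ch) => /^\p{Emoji_Presentation}$/u.test(ch);

console.log(hasEmojiProperty('1'));      // true
console.log(hasEmojiPresentation('1'));  // false
console.log(hasEmojiPresentation('😀')); // true
```

So a digit counts as an emoji per the data file, but it is text by default, which matches what the text regex reports.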
Thanks @mathiasbynens, I guess we will need to handle this on our side then 👍
cheers
emoji-regex matches emoji according to the Unicode Standard. It sounds like you want to do something else — how do you determine what’s an emoji and what isn’t?
Yes, we are parsing a string and trying to extract the emoji. Currently, after parsing, we also get digits, which probably shouldn't be presented as emoji in our case.
We basically just split the sentence and check each character with emojiRegex().test(c) to get the emoji in the sentence.
You didn’t answer the question — for your use case, how do you decide what constitutes an emoji and what isn’t?
We add a flag for parsed emoji that match the Unicode spec, and check whether the browser actually renders an emoji (icon) for them.
Checked the spec; we probably want to exclude:
0023 ; Emoji # 1.1 [1] (#️) number sign
002A ; Emoji # 1.1 [1] (*️) asterisk
0030..0039 ; Emoji # 1.1 [10] (0️..9️) digit zero..digit nine
But you are absolutely correct, those are considered valid emoji.
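One way to handle this on the caller's side could look like the sketch below. It is only an assumption about what such a filter might be: `extractEmoji` is a hypothetical helper, and it uses the built-in `\p{Emoji}` property escape rather than emoji-regex so that it runs on its own. It skips `#`, `*` and the digits unless they are directly followed by the emoji variation selector U+FE0F.

```javascript
// Hypothetical helper for illustration; not part of emoji-regex.
const TEXT_DEFAULT = /^[#*0-9]$/; // the three entries quoted above
const VS16 = '\uFE0F';            // emoji variation selector

function extractEmoji(text) {
  const chars = Array.from(text); // split by code point, not by UTF-16 unit
  const found = [];
  for (let i = 0; i < chars.length; i++) {
    const ch = chars[i];
    // Keep only code points that carry the Unicode Emoji property.
    if (!/^\p{Emoji}$/u.test(ch)) continue;
    // Drop "#", "*" and digits unless an emoji variation selector follows.
    if (TEXT_DEFAULT.test(ch) && chars[i + 1] !== VS16) continue;
    found.push(ch);
  }
  return found;
}

console.log(extractEmoji('I have 2 cats 😀 #1')); // [ '😀' ]
```

Note that this works one code point at a time, so multi-code-point sequences (keycaps, ZWJ sequences, skin-tone modifiers) are not kept together. Swapping the property check for emojiRegex() matches would be the natural next step if whole sequences are needed.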
When the string has a number inside, emoji-regex/text matches it.
Version: 6.4.0