Closed rodneyrehm closed 3 years ago
I think it's a bug, because a Unicode character without a variation selector is just text. Only when a variation selector follows it should it be rendered as an image, and only then should it be matched as an emoji.
I have another issue: what about the pizza symbol?
🍕️
"\ud83c\udf55\ufe0f"
https://emojipedia.org/slice-of-pizza/
U+1F355 U+FE0F
Is this emoji with a variation selector a valid emoji sequence? I ask because the match covers only the two code units of the surrogate pair and ignores the variation selector.
@jcubic Let's look up U+1F355 in https://unicode.org/Public/emoji/latest/emoji-data.txt. It contains:
1F337..1F37C ; Emoji # 6.0 [70] (🌷..🍼) tulip..baby bottle
...
1F337..1F37C ; Emoji_Presentation # 6.0 [70] (🌷..🍼) tulip..baby bottle
It has Emoji_Presentation=True, so it doesn't need the U+FE0F to be displayed as an emoji (per the spec).
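For a quick check without reading emoji-data.txt by hand, the same property can be queried with a Unicode property escape. This is only a sketch: `hasEmojiPresentation` is a hypothetical helper name, it assumes an engine with ES2018 property escapes, and the answer depends on the Unicode version that engine ships. U+2764 is my own contrasting example of an emoji that defaults to text presentation.

```javascript
// Hypothetical helper: true if the single code point in `ch` has
// Emoji_Presentation=Yes according to the engine's Unicode data.
// Property escapes (\p{...}) require the `u` flag.
const hasEmojiPresentation = (ch) => /^\p{Emoji_Presentation}$/u.test(ch);

console.log(hasEmojiPresentation("\u{1F355}")); // true  (slice of pizza)
console.log(hasEmojiPresentation("\u2764"));    // false (heavy black heart, text by default)
```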
Oh, thanks, I was not sure about this one.
macOS and iOS emoji input has recently improved to be more in line with the spec. I'm hoping Apple has solved (or will continue to solve) this problem so that we don't need to work around it in emoji-regex.
For your use case, you could check if the string ends with a variation selector, and remove it before further processing the string.
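A minimal sketch of that workaround, assuming only U+FE0F (the emoji-style selector) needs to be dropped. `stripTrailingVariationSelector` is a hypothetical helper name, not part of emoji-regex:

```javascript
// Hypothetical helper: remove one trailing U+FE0F (VARIATION SELECTOR-16)
// so the string ends with the emoji itself.
function stripTrailingVariationSelector(input) {
  // Without the `m` flag, `$` matches only at the very end of the string.
  return input.replace(/\uFE0F$/, "");
}

const pizza = "\ud83c\udf55\ufe0f"; // U+1F355 U+FE0F
const cleaned = stripTrailingVariationSelector(pizza);

console.log(cleaned === "\ud83c\udf55"); // true
console.log(cleaned.length);             // 2 (one surrogate pair)
```

The cleaned string can then be passed to whatever "does the string end with an emoji" check was already in place.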
Following up on this tweet:
I'm using emoji-regex to identify if the last symbol of a string is an emoji. While this works for most emojis, it does not for ⚽️. As it turns out this is happening because macOS inserts U+26BD followed by U+FE0F and that trailing variation selector is not part of the emoji-regex match.
While I don't think this is a bug in emoji-regex I do believe emoji-regex could help avoid this situation by including the unnecessary variation selector in the match.
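To see why an "ends with an emoji" check trips here, it can help to list the code points of the inserted string. A small sketch, where `codePoints` is a hypothetical helper and the string literal reproduces the macOS input described above:

```javascript
// The string as described in the report: U+26BD followed by U+FE0F.
const soccer = "\u26BD\uFE0F";

// Hypothetical helper: list the code points of a string as U+XXXX labels.
function codePoints(str) {
  return Array.from(str, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(codePoints(soccer));        // [ 'U+26BD', 'U+FE0F' ]
console.log(soccer.endsWith("\u26BD")); // false: the last code point is U+FE0F
```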