mathiasbynens / emoji-regex

A regular expression to match all Emoji-only symbols as per the Unicode Standard.
https://mths.be/emoji-regex
MIT License
1.72k stars, 175 forks

Add RGI_Emoji output #75

Closed: mathiasbynens closed this 3 years ago

mathiasbynens commented 3 years ago

In the same vein as #71, this patch reduces the number of hardcoded Unicode properties in the source code by leveraging the new RGI_Emoji property as much as possible.

This brought to light some more inconsistencies between the Unicode Standard's definitions and real-world implementations, similar to those that have previously been discussed in this repo's issue tracker.

For example, U+1F575 U+FE0F is a qualified emoji per emoji-sequences.txt:

1F575 FE0F    ; Basic_Emoji                  ; detective                                                      # E0.7   [1] (🕵️)

But without the trailing U+FE0F, U+1F575 on its own is not a qualified emoji, even though it still renders as an emoji on macOS, so folks might expect emoji-regex to match it. Our previous regex matched the bare code point, which is wrong per spec but debatably "right" per macOS.
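The distinction can be checked directly with Unicode property escapes. The snippet below is only a rough sketch and is not code from this patch: it assumes a JavaScript engine that supports `\p{…}` with the `u` flag, and it only tries the `v` flag's `\p{RGI_Emoji}` property of strings when the engine accepts it. All variable names are made up for illustration.

```javascript
// U+1F575 (detective), without and with the U+FE0F variation selector.
const bare = '\u{1F575}';
const qualified = '\u{1F575}\u{FE0F}';

// U+1F575 has the Emoji property but not Emoji_Presentation, which is
// why emoji-sequences.txt lists it under Basic_Emoji only with U+FE0F.
const isEmoji = /^\p{Emoji}$/u.test(bare);
const hasEmojiPresentation = /^\p{Emoji_Presentation}$/u.test(bare);

// Assumption: the engine may or may not support the `v` flag. Building the
// regex from a string lets older engines fall back instead of failing to parse.
let rgi = null;
try {
  rgi = new RegExp('^\\p{RGI_Emoji}$', 'v');
} catch (error) {
  // `v` flag (and therefore \p{RGI_Emoji}) is not available here.
}

// null means "could not be checked in this engine".
const rgiMatchesQualified = rgi ? rgi.test(qualified) : null;
const rgiMatchesBare = rgi ? rgi.test(bare) : null;

console.log({ isEmoji, hasEmojiPresentation, rgiMatchesQualified, rgiMatchesBare });
```

Where `\p{RGI_Emoji}` is available, the qualified sequence should match and the bare code point should not, which mirrors the spec-versus-macOS mismatch described above.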

mathiasbynens commented 3 years ago

For backwards compatibility, I've added a new output target for the new RGI_Emoji regular expression (instead of updating the “main” one).