mathiasbynens / emoji-regex

A regular expression to match all Emoji-only symbols as per the Unicode Standard.
https://mths.be/emoji-regex
MIT License

does not match 👁 and 🕶 on MacOS Mojave #59

Closed. vijaiendransv closed this issue 3 years ago

vijaiendransv commented 5 years ago

I tested the regex using http://regex101.com/. This can be closed if it is not an actual issue.

4ver commented 5 years ago

See an example here: https://runkit.com/4ver/issue-with-emoji-regex-and-matching

eramdam commented 4 years ago

FWIW, in our app we have to detect emojis by splitting a given string using a regex. The following code helped us work around the issue:

import * as baseEmojiRegex from 'emoji-regex';

// emoji-regex doesn't match some emojis
// https://github.com/mathiasbynens/emoji-regex/issues/59
const rsSurrPair = '[\\ud800-\\udbff][\\udc00-\\udfff]';
const rsEmoji = `(?:${rsSurrPair})`;

export const emojiSplitRegexp = new RegExp(
  `(${baseEmojiRegex.default().source}|${rsEmoji}?)`,
  'i'
);

However, I couldn't figure out how to make a version that works with .exec().
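A possible reason the pattern above is awkward with .exec() is that the trailing `?` lets it match the empty string and it has no `g` flag, so a loop over .exec() would not advance. The following is a minimal sketch of one way around that, not a tested fix for the app described above. It assumes the optional quantifier can be dropped, and `baseSource` is a hypothetical stand-in for `require('emoji-regex')().source` so that the snippet runs without the package. Note that the fallback matches any surrogate pair, not only emoji.

```javascript
// Hypothetical stand-in for require('emoji-regex')().source.
// It matches only U+1F600 (😀), written as a surrogate pair.
const baseSource = '\\uD83D\\uDE00';

// Fallback from the workaround above: any UTF-16 surrogate pair.
const surrogatePair = '[\\ud800-\\udbff][\\udc00-\\udfff]';

// Collects every match with its index by looping over .exec().
// The 'g' flag makes .exec() resume from lastIndex, and because the
// pattern can no longer match the empty string, the loop always advances.
function findEmojiLike(text) {
  const re = new RegExp(`${baseSource}|${surrogatePair}`, 'g');
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push({ match: m[0], index: m.index });
  }
  return results;
}

console.log(findEmojiLike('a👁b🕶'));
```

The regex is created inside the function so that each call starts with a fresh `lastIndex`.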

niftylettuce commented 4 years ago

The package emoji-patterns is a great alternative and works rather well.

> const emojiPatterns = require ('emoji-patterns');
undefined
> const emojiAllRegex = new RegExp (emojiPatterns["Emoji_All"], 'gu');
undefined
> '🕹'.match(emojiAllRegex)
[ '🕹' ]
> 'does not match 👁 and 🕶 on MacOS Mojave'.match(emojiAllRegex)
[ '👁', '🕶' ]

mathiasbynens commented 3 years ago

Generally, the philosophy behind emoji-regex is to avoid making decisions about which characters/sequences are emoji and which aren't, and instead let the Unicode Standard make those decisions.

As long as these symbols are not official emoji, we don't want to match them. (And if/when they become official, we’ll automatically start matching them once we update the library for the new Unicode version.)
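To see what the Unicode Standard itself says about a given symbol, one option is to query its emoji properties directly. The sketch below assumes a JavaScript engine with Unicode property escapes (ES2018, for example a recent Node.js); the helper name `emojiProperties` is hypothetical, and the thread does not state which Unicode data emoji-regex is generated from.

```javascript
// Reports two Unicode emoji properties for the first code point of `symbol`.
// Emoji: the code point is listed as an emoji character.
// Emoji_Presentation: it is rendered as emoji by default, without needing
// a following variation selector (U+FE0F).
function emojiProperties(symbol) {
  const first = String.fromCodePoint(symbol.codePointAt(0));
  return {
    emoji: /^\p{Emoji}$/u.test(first),
    emojiPresentation: /^\p{Emoji_Presentation}$/u.test(first),
  };
}

for (const symbol of ['👁', '🕶', '😀']) {
  console.log(symbol, emojiProperties(symbol));
}
```

In current Unicode emoji data, 👁 and 🕶 carry the Emoji property but not Emoji_Presentation, so on their own they default to text presentation. That may be relevant to the behavior reported here, though this is an inference and not something confirmed in the thread.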