vijaiendransv closed this issue 3 years ago
See the example here: https://runkit.com/4ver/issue-with-emoji-regex-and-matching
FWIW, in our app we have to detect emojis by splitting a given string using a regex. The following code helped us work around the issue:
import * as baseEmojiRegex from 'emoji-regex';
// emoji-regex doesn't match some emojis
// https://github.com/mathiasbynens/emoji-regex/issues/59
const rsSurrPair = '[\\ud800-\\udbff][\\udc00-\\udfff]';
const rsEmoji = `(?:${rsSurrPair})`;
export const emojiSplitRegexp = new RegExp(
`(${baseEmojiRegex.default().source}|${rsEmoji}?)`,
'i'
);
However, I couldn't figure out how to make a version that works with .exec().
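For the .exec() case, here is a minimal sketch of one way it could work. It is not a tested drop-in replacement. It assumes the optional `?` after the surrogate-pair fallback is dropped, so the pattern can never match an empty string, and that the `g` flag is set so `lastIndex` advances between calls. `baseSource` is a hypothetical stand-in for `baseEmojiRegex.default().source` so the snippet runs without the package, and `findEmojiLike` is just an illustrative name.

```javascript
// Hypothetical stand-in for emoji-regex's pattern source (assumption:
// in real code this would be baseEmojiRegex.default().source).
const baseSource = '[\\u2600-\\u27bf]';
// Fallback from the workaround above: any UTF-16 surrogate pair.
const rsSurrPair = '[\\ud800-\\udbff][\\udc00-\\udfff]';

function findEmojiLike(text) {
  // Build a fresh regex per call so lastIndex is not shared between calls.
  // No trailing `?` on the fallback: an empty match would not advance
  // lastIndex and the loop below would never terminate.
  const re = new RegExp(`${baseSource}|${rsSurrPair}`, 'g');
  const found = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    found.push({ match: m[0], index: m.index });
  }
  return found;
}

console.log(findEmojiLike('joystick \u{1F579} here'));
```

Note that the fallback matches any surrogate pair, so this also reports astral characters that are not emoji at all.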
The package emoji-patterns is a great alternative and works rather well.
> const emojiPatterns = require ('emoji-patterns');
undefined
> const emojiAllRegex = new RegExp (emojiPatterns["Emoji_All"], 'gu');
undefined
> '🕹'.match(emojiAllRegex)
[ '🕹' ]
> 'does not match 👁 and 🕶 on MacOS Mojave'.match(emojiAllRegex)
[ '👁', '🕶' ]
Generally, the philosophy behind emoji-regex is to avoid making decisions about which characters/sequences are emoji and which aren't, and instead let the Unicode Standard make those decisions.
As long as these symbols are not official emoji, we don't want to match them. (And if/when they become official, we’ll automatically start matching them once we update the library for the new Unicode version.)
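To see what the Unicode data says about a given symbol, one option is JavaScript's built-in Unicode property escapes. The sketch below is independent of emoji-regex; whether these particular properties are the ones the library keys on is an assumption, and `describeCodePoint` is an illustrative name.

```javascript
// Built-in Unicode binary properties (ES2018, require the `u` flag).
const hasEmoji = /^\p{Emoji}$/u;
const hasEmojiPresentation = /^\p{Emoji_Presentation}$/u;

// Report which emoji-related properties a single code point carries.
function describeCodePoint(ch) {
  return {
    emoji: hasEmoji.test(ch),
    emojiPresentation: hasEmojiPresentation.test(ch),
  };
}

// U+1F441 EYE: has Emoji, but is text-presentation by default.
console.log(describeCodePoint('\u{1F441}'));
// U+1F600 GRINNING FACE: has both properties.
console.log(describeCodePoint('\u{1F600}'));
```

A symbol that has `Emoji` but not `Emoji_Presentation` is rendered as text by default unless it is followed by U+FE0F, which may explain why the bare character is treated differently from the variant most keyboards insert.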
I tested the regular expression using http://regex101.com/. This can be closed if it is not an actual issue.