Closed ivanjaros closed 1 year ago
I suppose uniseg could help you do that. However, you would need to copy some code over to your own project, including the graphemeCodePoints and emojiPresentation tables (although graphemeCodePoints could be greatly reduced to only include the relevant emoji code points), because I'm not planning on making these internal functions and tables public.
You can take a look at FirstGraphemeClusterInString() and runeWidth(). These functions need to detect emojis to calculate a width of 2 for them. So this is what I would do:
Use uniseg to break the string into grapheme clusters.
[...] Look up the code point in the emojiPresentation table. If it gives you the "emoji presentation" flag, it's an emoji.
This procedure considers ♫ not an emoji. If you want to eliminate these, too, then it's a bit different (and simpler, because you wouldn't need the emojiPresentation table or the check for the "Variation Selector-16", and emojis could have a width of 1).
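For illustration, here is a minimal sketch of the per-cluster check as far as this comment describes it, assuming the grapheme cluster has already been split out (for example with uniseg) and is passed in as a slice of runes. The emojiPresentation map below is a tiny hypothetical stand-in for uniseg's internal table, not a copy of it, and the name looksLikeEmoji is made up for this example.

```go
package main

import "fmt"

// U+FE0F VARIATION SELECTOR-16 requests emoji presentation.
const variationSelector16 = 0xFE0F

// emojiPresentation is a hypothetical, deliberately incomplete stand-in for
// uniseg's internal table: code points that render as emoji by default.
var emojiPresentation = map[rune]bool{
	0x1F600: true, // 😀
	0x1F680: true, // 🚀
	0x2705:  true, // ✅
}

// looksLikeEmoji reports whether one grapheme cluster should be treated as an
// emoji: either its second code point is Variation Selector-16, or its first
// code point carries the "emoji presentation" flag.
// Assumption: any other checks the full procedure needs (for example flags)
// are not covered by this sketch.
func looksLikeEmoji(cluster []rune) bool {
	if len(cluster) == 0 {
		return false
	}
	if len(cluster) > 1 && cluster[1] == variationSelector16 {
		return true
	}
	return emojiPresentation[cluster[0]]
}

func main() {
	fmt.Println(looksLikeEmoji([]rune("😀")))       // true
	fmt.Println(looksLikeEmoji([]rune("❤\uFE0F"))) // true
	fmt.Println(looksLikeEmoji([]rune("♫")))       // false
}
```

With this table, ♫ on its own is rejected because it has neither the selector nor the presentation flag, which matches the behaviour described above.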
thanks, i'll give it a try.
Hey @rivo, I've stumbled upon this, and I'm trying to detect emojis without copying any code from uniseg, using the function below. The only thing I'm missing is a check for the extended pictographic property.
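// see https://github.com/rivo/uniseg/issues/27

// Code points assumed for the constants the function references.
const (
	regionalIndicatorA  = 0x1F1E6 // U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A
	regionalIndicatorZ  = 0x1F1FF // U+1F1FF REGIONAL INDICATOR SYMBOL LETTER Z
	variationSelector16 = 0xFE0F  // U+FE0F VARIATION SELECTOR-16
)

// isEmojiCluster reports whether a grapheme cluster of display width w is an emoji.
func isEmojiCluster(w int, runes []rune) bool {
	if w != 2 {
		return false
	}
	// Flags start with a regional indicator.
	if len(runes) > 0 && runes[0] >= regionalIndicatorA && runes[0] <= regionalIndicatorZ {
		return true
	}
	// range yields (index, rune); compare the rune, not the index.
	for _, r := range runes {
		if r == variationSelector16 {
			return true
		}
	}
	// TODO: detect extended pictographic property
	return false
}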
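One way the TODO above might be filled in is a lookup of the cluster's first rune in a sorted range table. The sketch below is only an illustration: the table is a hypothetical, heavily truncated excerpt (a real one would be generated from Unicode's emoji-data.txt), and the names pictographicRange and isExtendedPictographic are made up here, not part of uniseg.

```go
package main

import (
	"fmt"
	"sort"
)

// pictographicRange is an inclusive range of code points.
type pictographicRange struct{ lo, hi rune }

// extendedPictographic is a hypothetical, incomplete excerpt of the
// Extended_Pictographic property. Ranges must stay sorted and
// non-overlapping for the binary search below to work.
var extendedPictographic = []pictographicRange{
	{0x2764, 0x2764},   // ❤
	{0x1F600, 0x1F64F}, // emoticons
	{0x1F680, 0x1F6C5}, // part of the transport symbols
}

// isExtendedPictographic reports whether r falls inside one of the ranges.
func isExtendedPictographic(r rune) bool {
	// Find the first range whose upper bound is not below r.
	i := sort.Search(len(extendedPictographic), func(i int) bool {
		return extendedPictographic[i].hi >= r
	})
	return i < len(extendedPictographic) && extendedPictographic[i].lo <= r
}

func main() {
	fmt.Println(isExtendedPictographic('😀')) // true
	fmt.Println(isExtendedPictographic('❤'))  // true
	fmt.Println(isExtendedPictographic('A'))  // false
}
```

A range table with binary search keeps the data far smaller than a per-code-point map, which matters if the goal is to avoid loading a full emoji database.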
Would you be ok with adding IsEmoji(width int, b []byte) bool and IsEmojiInString(width int, str string) bool to uniseg? I can send a PR for this.
This would be a great addition, and I hope it might be considered for merge.
Since there is the emojiPresentation map, could this library be extended to detect emojis? I have a use case where I want to remove emojis from text, but for lack of other options it seems I have to use github.com/forPelevin/gomoji. It uses this library, but it also keeps the entire emoji database in a 1.25MB map that has to be loaded into memory, which I don't like. Hence my question.