Closed mastern2k3 closed 2 years ago
Hi mastern2k3!
Thank you for your contribution!
Yeah, I'll rename the FindAll
method with the following significant changes. However, it seems overkill to break backward compatibility to rename a method name right now.
it seems overkill to break backward compatibility to rename a method name right now.
Strongly agree, will add tests soon.
When I added the tests I noticed I'm getting duplicate results because some emojis were getting registered from both
gr := uniseg.NewGraphemes(s)
for gr.Next() {
if em, ok := emojiMap[gr.Str()]; ok {
emojis[gr.Str()] = em
}
}
and
for _, r := range s {
if em, ok := emojiMap[string(r)]; ok {
emojis[string(r)] = em
}
}
I didn't quite get the difference, it looks like uniseg.NewGraphemes(s)
picks up all emojis but some just don't appear correctly from gr.Str()
and don't hit the correct map keys.
I didn't quite get the difference, it looks like uniseg.NewGraphemes(s) picks up all emojis but some just don't appear correctly from gr.Str() and don't hit the correct map keys.
Yep, you're right. Some emojis are invisible for uniseg.NewGraphemes(s)
and some for for range
by runes. So that's why we use both approaches here.
hey @mastern2k3,
It's been a while since our last interaction. Is everything good? Does my suggestion work for you?
Hey @forPelevin, all is well.
Looks like your suggestion should work.
I'm investigating something different, namely - why doesn't emojiMap[gr.Str()]
hit for all emojis?
Looks like this happens only for a specific set of emojis.
They get picked up by the uniseg.NewGraphemes(s)
iterator, they also appear as emojis when you print them as gr.Str()
, but when put through emojiMap
it doesn't hit.
Hey @mastern2k3.
I'm investigating something different, namely - why doesn't emojiMap[gr.Str()] hit for all emojis? Looks like this happens only for a specific set of emojis.
If you print them by bytes, you'll see hidden bytes. That's why they cannot be found in emojiMap
. Unfortunately, there is no obvious way to pre-generate all the emojis, so we need to double-check everything in runtime.
Hi @mastern2k3
Could you make changes from previos message and I will merge the pull request, please?
@forPelevin I fixed it :+1:
Nice job! Thank you for the PR. I've merged it.
I really wanted to use gomoji but it's missing a way to collect all emojis in a string as-is, without ignoring repeating emojis.
I am building a service that analyzes emojis in strings and need a way to detect which emojis appear and how many times.
I also feel like this behavior is more suiting the name
FindAll
and the current behavior ofFindAll
should be relocated to aFindAllDistinct
.