milesj / emojibase

🎮 A collection of lightweight, up-to-date, pre-generated, specification compliant, localized emoji JSON datasets, regex patterns, and more.
https://emojibase.dev
MIT License

Mappings are wrong on Apple platform #48

Open Midar opened 4 years ago

Midar commented 4 years ago

I have been redirected here from http://github.com/vector-im/riot-web:

It seems the current mappings break Apple platforms. For example, :( is ☹︎, which is not rendered as an emoji on Apple platforms, while 🙁 is. :D is rendered as 😁, when it is more commonly rendered as 😀 (not really a breakage, just an inconsistency). Things like :thumbsup: result in 👍️, which is rendered as a thumbs up followed by a white box, while 👍 renders correctly.

Basically, look at the entire list on iOS or in Safari on a Mac, and most of them are broken.

milesj commented 4 years ago

@Midar This really depends on which version of emojibase is being used, and on the iOS/Safari versions. Do you have more information?

(We use this all the time on osx with no issues).

Midar commented 4 years ago

Latest version for everything. It even happens with Safari Technology Preview. Firefox, OTOH, renders everything correctly when a different emoji font is used.

The ☹︎ seems to be universally wrong outside of Firefox, though.

milesj commented 4 years ago

@Midar Well emojibase-data v5 came out last week (emoji v13 support). Is it that version or v4?

Are there screenshots or anything else? There's not much I can do to reproduce this on my end.

This is me testing :thumbsup: + SVG replacement on v4.

Screen Shot 2020-03-20 at 2 44 29 PM
Midar commented 4 years ago

Here's two examples of what I mean:

Bildschirmfoto 2020-03-21 um 17 16 28 Bildschirmfoto 2020-03-21 um 17 16 55
milesj commented 4 years ago

Do you know where that happens in the code? I can check their implementation.

Midar commented 4 years ago

Unfortunately I don't. Let's loop someone in: @t3chguy

milesj commented 4 years ago

I figured out the :( one, but I'm still not sure about the thumbs up.

t3chguy commented 4 years ago

So riot-web is currently using v4, I will get that updated to see if the mappings change.

milesj commented 4 years ago

This should fix the emoticons: https://github.com/milesj/emojibase/commit/a39d1a83946d2420342a54165383e252966510dd (I'll release it later)

Need more info for the thumbs up.

milesj commented 4 years ago

Actually, I found the root problem. Here's the source data for 2639.

Screen Shot 2020-03-21 at 12 04 33

The issue is that the default presentation type is text (0) instead of emoji (1), and that's why the non-emoji character is being used.

I'm assuming the compact dataset is being used? Maybe I'll just always force it to emoji since that's typically what people want.
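For illustration, here is a minimal TypeScript sketch of what the presentation flag amounts to at the string level. It assumes the 0/1 values mentioned above and the standard variation selectors U+FE0E (text) and U+FE0F (emoji). `withPresentation` is a hypothetical helper written for this example, not emojibase's generator code.

```typescript
// Presentation values as described above: 0 = text, 1 = emoji.
type Presentation = 0 | 1;

const TEXT_SELECTOR = '\uFE0E'; // VARIATION SELECTOR-15: requests text presentation
const EMOJI_SELECTOR = '\uFE0F'; // VARIATION SELECTOR-16: requests emoji presentation

// Hypothetical helper: append the selector that matches the requested presentation.
function withPresentation(baseCodePoint: number, presentation: Presentation): string {
  const base = String.fromCodePoint(baseCodePoint);
  return base + (presentation === 1 ? EMOJI_SELECTOR : TEXT_SELECTOR);
}

// U+2639 (frowning face) defaults to text presentation in the source data.
const asText = withPresentation(0x2639, 0); // U+2639 U+FE0E
const asEmoji = withPresentation(0x2639, 1); // U+2639 U+FE0F
```

Forcing the emoji variant corresponds to always taking the second form, regardless of the default recorded in the source data.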

t3chguy commented 4 years ago

Yes the project uses the compact dataset

milesj commented 4 years ago

Ok, I'm gonna do a retro breaking change on v5, so compact will always use emojis: https://github.com/milesj/emojibase/commit/14d8b9794d88dbfa1e765d1bf6e677654172e540#diff-8128864d03457e2abb445b92204a88feL25

When you update to v5, let me know if this is still an issue.

t3chguy commented 4 years ago

@Midar could you re-test on riot.im/develop, which should be on emojibase-data@5 and emojibase-regex@4?

Thanks

Midar commented 4 years ago

The sad emoji now has the same white box to its right that others like 👍 already had.

milesj commented 4 years ago

Hrmm, really can't help more unless I know which emoji hexcode is being rendered, and how, and what the actual dataset for that item looks like.

Midar commented 4 years ago

Finally got to test this now that there's a new Riot release:

So it seems :) is now fixed, but the other problems remain.

t3chguy commented 4 years ago

Sorry, should have updated here. Riot has had to roll back to emojibase-data@4 due to us not having the time to create an updated build of Twemoji to ship with the app for emoji 13 support. So how the app is now should match the state of this issue when it was created. :(

ajbura commented 2 years ago

Hi, I am using emojibase-data@6.2.0 in app.cinny.in and am having similar issues. Ping me if you need more info.

Cinny-emoji-issue

slugalisk commented 2 years ago

This is still an issue using the compact dataset from emojibase-data@7.0.1. There are 342 affected entries: https://gist.github.com/slugalisk/750e211687215a1dc7be528ed3004e91. For some reason the unicode field is generated with an extra FE0F even though the hexcode looks right.
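A rough way to find such entries, sketched in TypeScript: convert each `unicode` value back to a hex sequence and compare it with `hexcode`. The `CompactEntry` shape only models the two fields named above, the dash-separated uppercase hexcode format is an assumption, and the sample entries are made up for the example rather than taken from the real dataset.

```typescript
// Minimal model of a compact entry; only the two fields discussed here.
interface CompactEntry {
  hexcode: string; // assumed format: dash-separated uppercase hex, e.g. "1F44D"
  unicode: string;
}

// Convert a string to a dash-separated hex sequence of its code points.
function toHexcode(text: string): string {
  return Array.from(text)
    .map((ch) => {
      const hex = (ch.codePointAt(0) ?? 0).toString(16).toUpperCase();
      // Left-pad short values to four digits, e.g. "23" becomes "0023".
      return hex.length < 4 ? '0000'.slice(hex.length) + hex : hex;
    })
    .join('-');
}

// True when the unicode field is exactly the hexcode plus a trailing FE0F.
function hasExtraSelector(entry: CompactEntry): boolean {
  return toHexcode(entry.unicode) === `${entry.hexcode}-FE0F`;
}

// Illustrative sample data, not real dataset rows.
const sample: CompactEntry[] = [
  { hexcode: '1F44D', unicode: '\u{1F44D}\uFE0F' },
  { hexcode: '1F600', unicode: '\u{1F600}' },
];

const affected = sample.filter(hasExtraSelector).map((entry) => entry.hexcode);
```

Running the same filter over the loaded compact JSON would be one way to produce a list like the one in the gist.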

edit: It seems the reason this is happening is that the unicode samples in the source doc (e.g. https://unicode.org/Public/14.0.0/ucd/emoji/emoji-data.txt) for all the Extended Pictographic emoji end with an FE0F codepoint. For example:

extra-codepoint

This variation selector (https://unicode.org/faq/emoji_dingbats.html#5.1) is meant to indicate that the emoji should be colored rather than black and white. Unfortunately iOS renders this character incorrectly when using some fonts like the Twemoji font in the screenshot above from @ajbura.

So emojibase has valid unicode values for the Extended Pictographic emoji, but iOS renders them incorrectly in some cases. The iOS emoji keyboard also includes the variation selector when writing these emoji.

edit2: @ajbura The easiest solution for this is probably to fix the font. In my fork of twemoji-colr I renamed all the assets to include the variation selector (https://github.com/MemeLabs/twemoji-colr/blob/master/hack/rename_glyphs.sh) and everything renders correctly now.

milesj commented 2 years ago

@slugalisk Thanks for digging into this, I haven't had much time. Since this seems to affect compact, does non-compact work fine?

slugalisk commented 2 years ago

@milesj It's not an issue with emojibase at all IMO. The unicode values for Extended Pictographic emoji in emojibase don't match the hexcodes which confused me for a minute but they are the correct values for the colored variants of the emoji.

The issue the Riot devs and I had was that the font used only the base code point for those emoji, so the variation selector was treated as a separate character. On Windows it's discarded as nonprintable, but on iOS it's rendered with a broken character glyph or an empty space depending on the OS version.
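The effect can be pictured with a toy model in TypeScript. This is only a sketch: a "font" here is just a set of code point sequences it has glyphs for, and `shape` is a hypothetical greedy longest-match lookup. Real text shaping engines and font tables are considerably more involved, so treat this as an illustration of the mismatch, not of how iOS actually resolves glyphs.

```typescript
// Toy model: a font is the set of code point sequences it has glyphs for.
type ToyFont = Set<string>;

// Split a string into uppercase hex code points.
function codePoints(text: string): string[] {
  return Array.from(text).map((ch) => (ch.codePointAt(0) ?? 0).toString(16).toUpperCase());
}

// Greedy longest-match lookup; unmatched code points become "missing" markers.
function shape(text: string, font: ToyFont): string[] {
  const cps = codePoints(text);
  const out: string[] = [];
  let i = 0;
  while (i < cps.length) {
    let matched = false;
    for (let end = cps.length; end > i; end--) {
      const key = cps.slice(i, end).join('-');
      if (font.has(key)) {
        out.push(`glyph:${key}`);
        i = end;
        matched = true;
        break;
      }
    }
    if (!matched) {
      out.push(`missing:${cps[i]}`);
      i += 1;
    }
  }
  return out;
}

const thumbsUp = '\u{1F44D}\uFE0F'; // base code point plus variation selector
const baseOnlyFont: ToyFont = new Set(['1F44D']); // glyph keyed on the base only
const renamedFont: ToyFont = new Set(['1F44D-FE0F']); // glyph keyed on the full sequence

const before = shape(thumbsUp, baseOnlyFont); // ['glyph:1F44D', 'missing:FE0F']
const after = shape(thumbsUp, renamedFont); // ['glyph:1F44D-FE0F']
```

In this model the leftover `missing:FE0F` stands in for the white box or empty space, and keying the glyph on the full sequence, as the renamed font assets do, leaves nothing unmatched.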