iamcal / emoji-data

Easy to parse data and spritesheets for emoji
MIT License

Reorder Codes #170

Closed VVapiti closed 4 years ago

VVapiti commented 4 years ago

Is it possible to change the order of the emoji codes so that complex sequences come before simple ones? Or, alternatively, to change the map generation in https://github.com/iamcal/php-emoji/ to account for this problem?

Explanation: with the current order, php-emoji, js-emoji, and possibly other scripts convert the first code they find. For a complex emoji, that first match covers only part of the whole sequence.

Example:

👩‍👩‍👦‍👦 (woman-woman-boy-boy)

Part of the PHP code:

    ...
    "\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa6"=>"1f469200d1f469200d1f466",
    "\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa6\xe2\x80\x8d\xf0\x9f\x91\xa6"=>"1f469200d1f469200d1f466200d1f466",
    ...

With the current order of codes, only the first matching entry (the shorter woman-woman-boy sequence) is converted. The emoji is therefore split in two: woman-woman-boy and boy.

iamcal commented 4 years ago

That's a bug in php-emoji; it should sort the map so that longer matches come first to avoid this. Closing the bug here, since we always sort by lexical codepoint order here for consistency.
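
To illustrate the suggestion, here is a minimal Python sketch of the idea, not php-emoji's actual code. It uses the two map entries quoted earlier in this thread and assumes the converter applies replacements one after another in map order. The helper names `replace_in_order` and `longest_first`, and the bracket wrapping around each code, are hypothetical and exist only to make the output easy to read.

```python
# Two entries from the map quoted in the issue, as UTF-8 bytes, kept in the
# problematic order (the shorter sequence comes first).
EMOJI_MAP = {
    b"\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa6":
        b"1f469200d1f469200d1f466",
    b"\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa9\xe2\x80\x8d\xf0\x9f\x91\xa6"
    b"\xe2\x80\x8d\xf0\x9f\x91\xa6":
        b"1f469200d1f469200d1f466200d1f466",
}


def replace_in_order(text: bytes, mapping: dict) -> bytes:
    """Apply every replacement sequentially, in the mapping's iteration order.

    Assumption: this mimics a converter that walks the map entry by entry.
    The square brackets are a hypothetical stand-in for the real output markup.
    """
    for needle, code in mapping.items():
        text = text.replace(needle, b"[" + code + b"]")
    return text


def longest_first(mapping: dict) -> dict:
    """Return a copy of the mapping with the longest byte sequences first."""
    ordered = sorted(mapping.items(), key=lambda item: len(item[0]), reverse=True)
    return dict(ordered)


# woman-woman-boy-boy: U+1F469 ZWJ U+1F469 ZWJ U+1F466 ZWJ U+1F466
family = "\U0001F469\u200D\U0001F469\u200D\U0001F466\u200D\U0001F466".encode("utf-8")

# Original order: the shorter entry matches first and splits the emoji,
# leaving a zero-width joiner and a lone boy behind.
naive = replace_in_order(family, EMOJI_MAP)

# Longest-first order: the full sequence is matched as one emoji.
fixed = replace_in_order(family, longest_first(EMOJI_MAP))

print(naive)
print(fixed)
```

Sorting at lookup time leaves the published data in lexical codepoint order, so only the consuming library needs to change.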