Open Crissov opened 6 years ago
the existing mappings are just what i/we came up with manually. they are intended for mapping a user-entered emoticon to an emoji, so they are very conservative. for the opposite direction (representing a given emoji in ascii), the map would be a lot more liberal. might be worth adding a new property for that. they also don't need to be unique.
i've been planning (for some time) to add kaomoji as a distinct property, since there's a semi-official mapping as part of the unicode spec draft
ASCII ↦ Unicode can be n:1. The requirement for Unicode ↦ ASCII should be that the ASCII sequence maps back to the same Unicode emoji. In other words, I believe they need to be unique. Anyway, some of the current mappings don't make sense if expanded.
| ASCII | | Emoji | | ASCII | | Emoji | | ASCII | | Emoji |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 💔 | ↔ | </3 |
| | | ❤️ | ↔ | <3 | ↤ | š š š š |
| | | š | ↔ | :( | ↤ | š |
| ): :-( | ↦ | š | ↔ | :( |
| | | 😢 | ↔ | :'( | ↤ | š |
| ;-) | ↦ | 😉 | ↔ | ;) |
| :-p :-P :P :-b :b | ↦ | 😛 | ↔ | :p |
| ;-p ;-b ;b ;-P ;P | ↦ | š | ↔ | ;p |
| :\* :-\* | ↦ | š |
| 8) | ↦ | 😎 |
| :-\ :\ :-/ :/ | ↦ | š |
| :-\| :\| | ↦ | š |
| :-o :o :-O :O | ↦ | 😮 |
| >:-( >:( | ↦ | š |
| :o) | ↦ | 🐵 |
| D: | ↦ | 😧 |
| (: :-) | ↦ | š |
| | | š | ↦ | :) | ↦ | š |
| =-) =) C: c: :-D | ↦ | š | ↦ | :) | ↦ | š |
| | | š | ↦ | :D | ↦ | š | ↦ | :) | ↦ | š |
| :> :-> | ↦ | 😆 :laughing: = :satisfied: | ↦ | :D | ↦ | š | ↦ | :) | ↦ | š |
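The round-trip requirement stated above (ASCII ↦ Unicode may be n:1, but every Unicode ↦ ASCII entry must map back to the same emoji) can be checked mechanically. The following is only a minimal sketch: the two dictionaries are hypothetical sample data, not the contents of the real data file, and `round_trip_failures` is an illustrative helper name.

```python
# Hypothetical sample data for illustration; not taken from the real data file.
ASCII_TO_EMOJI = {
    ";)": "😉",
    ";-)": "😉",  # n:1 is fine in this direction
    ":D": "😃",
    ":)": "🙂",
}
EMOJI_TO_ASCII = {
    "😉": ";)",
    "😃": ":D",
    "😀": ":D",  # breaks the round trip: ":D" maps back to 😃, not 😀
}


def round_trip_failures(emoji_to_ascii, ascii_to_emoji):
    """Return (emoji, ascii, emoji_back) for each emoji whose ASCII form
    does not map back to that same emoji."""
    failures = []
    for emoji, ascii_seq in emoji_to_ascii.items():
        back = ascii_to_emoji.get(ascii_seq)  # None if the ASCII form is unknown
        if back != emoji:
            failures.append((emoji, ascii_seq, back))
    return failures


if __name__ == "__main__":
    for emoji, ascii_seq, back in round_trip_failures(EMOJI_TO_ASCII, ASCII_TO_EMOJI):
        print(f"{emoji} -> {ascii_seq} -> {back}")
```

Run against the real tables, a check like this would list exactly the chains that "don't make sense if expanded".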
Good documentation on existing replacement patterns in instant messengers and elsewhere is often surprisingly hard to find. While Skype, for instance, does list the ASCII emoticons it supports, these map to proprietary graphics whose correspondence to Unicode emoji is undocumented and not always unambiguous.
Skype name | ASCII |
---|---|
Angry | :@ :-@ :=@ x( x-( x=( X( X-( X=( |
Blush | :$ :-$ :=$ :"> |
Cheeky | :P :=P :-P :p :=p :-p |
Cool | 8=) 8-) B=) B-) |
Crying | ;( ;-( ;=( |
Dull | \|( \|-( \|=( |
Evil grin | ]:) >:) |
Kiss | :* :=* :-* |
Laugh | :D :=D :-D :d :=d :-d |
Lips Sealed | :x :-x :X :-X :# :-# :=x :=X :=# |
Nerd | 8-\| B-\| 8\| B\| 8=\| B=\| |
Puke | :& :-& :=& |
Sad | :( :=( :-( |
Sleepy | \|-) I-) I=) |
Smile | :) :=) :-) |
Speechless | :\| :=\| :-\| |
Surprised | :o :=o :-o :O :=O :-O |
Sweating | (:\| |
Wink | ;) ;-) ;=) |
Wondering | :^) |
Worried | :S :-S :=S :s :-s :=s |
As for Unicode emoji libraries, Gemoji and Twemoji do not have methods to convert from or to ASCII, whereas Emojione does, but it seems to have made random, uncoordinated choices as well.
Code point | Char | Name | Shortname | ASCII | Keywords |
---|---|---|---|---|---|
U+2764 | ❤️ | red heart | :heart: | <3 | heart |
U+1F494 | 💔 | broken heart | :broken_heart: | </3 | break, broken |
U+1F44D | 👍 | thumbs up | :thumbsup:, :+1:, :thumbup: | (y) | +1, hand, thumb, up |
U+1F602 | 😂 | face with tears of joy | :joy: | :') :'-) | face, joy, laugh, tear |
U+1F603 | 😃 | smiling face with open mouth | :smiley: | :D :-D =D | face, mouth, open, smile |
U+1F605 | 😅 | smiling face with open mouth & cold sweat | :sweat_smile: | ':) ':-) '=) ':D ':-D '=D | cold, face, open, smile, sweat |
U+1F606 | 😆 | smiling face with open mouth & closed eyes | :laughing:, :satisfied: | >:) >;) >:-) >=) | face, laugh, mouth, open, satisfied, smile |
U+1F607 | 😇 | smiling face with halo | :innocent: | O:-) 0:-3 0:3 0:-) 0:) 0;^) O:) O;-) O=) 0;-) O:-3 O:3 | angel, face, fairy tale, fantasy, halo, innocent, smile |
U+1F609 | 😉 | winking face | :wink: | ;) ;-) \*-) \*) ;-] ;] ;D ;^) | face, wink |
U+1F60E | 😎 | smiling face with sunglasses | :sunglasses: | B-) B) 8) 8-) B-D 8-D | bright, cool, eye, eyewear, face, glasses, smile, sun, sunglasses |
U+1F611 | 😑 | expressionless face | :expressionless: | -_- -__- -___- | expressionless, face, inexpressive, unexpressive |
U+1F613 | 😓 | face with cold sweat | :sweat: | ':( ':-( '=( | cold, face, sweat |
U+1F615 | 😕 | confused face | :confused: | >:\ >:/ :-/ :-. :/ :\ =/ =\ :L =L | confused, face |
U+1F618 | 😘 | face blowing a kiss | :kissing_heart: | :\* :-\* =\* :^\* | face, kiss |
U+1F61B | 😛 | face with stuck-out tongue | :stuck_out_tongue: | :P :-P =P :-Þ :Þ :-b :b | face, tongue |
U+1F61C | 😜 | face with stuck-out tongue & winking eye | :stuck_out_tongue_winking_eye: | >:P X-P | eye, face, joke, tongue, wink |
U+1F61E | 😞 | disappointed face | :disappointed: | >:[ :-( :( :-[ :[ =( | disappointed, face |
U+1F620 | 😠 | angry face | :angry: | >:( >:-( :@ | angry, face, mad |
U+1F622 | 😢 | crying face | :cry: | :'( :'-( ;( ;-( | cry, face, sad, tear |
U+1F623 | 😣 | persevering face | :persevere: | >.< | face, persevere |
U+1F628 | 😨 | fearful face | :fearful: | D: | face, fear, fearful, scared |
U+1F62E | 😮 | face with open mouth | :open_mouth: | :-O :O O_O >:O | face, mouth, open, sympathy |
U+1F633 | 😳 | flushed face | :flushed: | :$ =$ | dazed, face, flushed |
U+1F635 | 😵 | dizzy face | :dizzy_face: | #-) #) %-) %) X) X-) | dizzy, face |
U+1F636 | 😶 | face without mouth | :no_mouth: | :-X :X :-# :# =X =# | face, mouth, quiet, silent |
U+1F642 | 🙂 | slightly smiling face | :slight_smile:, :slightly_smiling_face: | :) :-) =] =) :] | face, smile |
U+1F646 | 🙆 | person gesturing OK | :person_gesturing_ok:, :ok_woman: | *\0/* \0/ *\O/* \O/ | OK, gesture, hand |
Kaomoji have evolved into something more like drawing, where a large number of character sequences essentially represent the same thing. For starters, while Westerners will usually type ^^ for one of the best-known and simplest ones, East Asians regularly end up with ＾＾ (U+FF3E) using full-width forms, and others might even input ˆˆ (U+02C6).
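Variants like these could be folded onto one canonical sequence before any lookup. The sketch below assumes Unicode NFKC normalization is acceptable for this purpose: it turns the full-width U+FF3E into `^`, but it leaves U+02C6 untouched, so that character needs an explicit extra rule. `EXTRA_FOLDS` and `normalize_kaomoji` are hypothetical names, and NFKC also rewrites other compatibility characters, which may or may not be wanted for kaomoji.

```python
import unicodedata

# Assumed extra folds for look-alikes that NFKC does not change; extend as needed.
EXTRA_FOLDS = str.maketrans({
    "\u02C6": "^",  # MODIFIER LETTER CIRCUMFLEX ACCENT
})


def normalize_kaomoji(text: str) -> str:
    """Fold full-width forms (via NFKC) and selected look-alikes to ASCII."""
    return unicodedata.normalize("NFKC", text).translate(EXTRA_FOLDS)


if __name__ == "__main__":
    for variant in ("^^", "\uFF3E\uFF3E", "\u02C6\u02C6"):
        print(repr(variant), "->", repr(normalize_kaomoji(variant)))
```

With such a step in front, a single `^^` entry in the mapping would cover all three spellings.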
Many apps and libraries support the replacement of ASCII emoticons by Unicode emojis or by proprietary graphics that relate to emojis. They do not all agree, to put it mildly.
Is there some authoritative source for the existing mappings in build/data_text_toemoji.txt? Should new ones be added there? Should kaomoji live in the same file?
Note that sometimes new Unicode versions bring emojis that are a better match for character line art.
I'd like to establish some conventions:
Hat or Hair or Horns or Forehead
d or q
>
=
{
8
'
~
=\|
c\|
*< or *<\|
O or o or 0
< or <\|
(
}
]
)
[
Eyes
:
=
8
B
;
p or b
X or x
!
?
%
#
+
&
9
6
3
Cheeks
' or ,
"
=
~
Nose
-
*
o
^
(:)
Upper Lip
' or ,
~
{
.
Mouth
)
C or c
D
O or o or 0
p or P or b
d or q
(
]
[
}
{
<
-
) or pouting>
/ or \
S or s or $
L
*
3
X
B
J
\| or I
<>
^
V or v or \/
#
&
@
()
(\|) or (I)
{}
{\|} or {I}
[]
[\|] or [I]
Neck or Chin
=
3
8
-
~
'
*
)
Horizontal
Mouth and Nose
_
.
3
m
Eyes and Cheeks
.
^
*
@
+
'
°
-
=
> and <
T
Q
;
:
o
O or 0
Ears, Arms and Accessories
\ and /
d and b
*\ and /*
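One possible reading of the vertical (sideways) conventions above is a sequence of optional and required slots in reading order. The sketch below builds a regular expression from that idea. It is only an illustration: the character sets are small subsets of the lists above, the choice of which slots are optional is an assumption, the upper-lip and neck slots are left out, and `SLOTS`, `build_pattern` and `parse_emoticon` are hypothetical names.

```python
import re

# Illustrative subsets of the per-slot characters listed above (not exhaustive).
# Each entry: (slot name, allowed characters, whether the slot is optional).
SLOTS = [
    ("hat", ">O0o<}]d", True),
    ("eyes", ":=8B;Xx%#", False),
    ("cheeks", "'\",~", True),
    ("nose", "-*o^", True),
    ("mouth", ")(][}{DOo0PpbCc/\\SsL*3X|#&@$<", False),
]


def build_pattern(slots):
    """Compile one regex with a named group per slot, in reading order."""
    parts = []
    for name, chars, optional in slots:
        char_class = "[" + "".join(re.escape(c) for c in chars) + "]"
        parts.append(f"(?P<{name}>{char_class})" + ("?" if optional else ""))
    return re.compile("".join(parts))


EMOTICON = build_pattern(SLOTS)


def parse_emoticon(text):
    """Split a sideways emoticon into its slots, or return None if it does not fit."""
    match = EMOTICON.fullmatch(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for sample in (":-)", ">:-(", ":'(", ":o", "8)"):
        print(sample, parse_emoticon(sample))
```

Because the optional slots rely on regex backtracking, an ambiguous character such as `o` is read as a nose in `:o)` but as a mouth in `:o`. A real convention would have to state explicitly how such overlaps are resolved.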