venomous0x / WhatsAPI

Interface to WhatsApp Messenger

Emoji decode when receiving a message #373

Open Terrorhawk opened 11 years ago

Terrorhawk commented 11 years ago

Does anyone know how to filter the emoji in a received message?

I'm creating a message board with this API to display received messages on a big screen. The problem is that I get special characters that are supposed to be emoji. Does anyone have a function to filter them out so that I can display an emoji from a GIF file (CSS or so) instead?

shirioko commented 11 years ago

I already have the codes and the full-size sprite image; I just need a clone of myself to find the time to implement this.

em66

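$emojis = array(
    "‼",
    "⁉",
    "ℹ",
    "↔",
    "↕",
    "↩",
    "↪",
    "⌚",
    "⌛",
    // ...
    // Note: PHP double-quoted strings do not expand "\ud83d\udeb4"-style
    // escapes. These are UTF-16 surrogate pairs as written in JSON or
    // JavaScript; PHP 7+ can write the same character as "\u{1F6B4}".
    "\ud83d\udeb4",
    "\ud83d\udeb5",
    "\ud83d\udeb7",
    "\ud83d\udeb8",
    "\ud83d\udebf",
    "\ud83d\udec1",
    "\ud83d\udec2",
    "\ud83d\udec3",
    "\ud83d\udec4",
    "\ud83d\udec5"
);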
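The `\ud83d\udeb4`-style entries at the end of the list are UTF-16 surrogate pairs, the form JSON and JavaScript use for code points above U+FFFF. As a minimal sketch (Python 3 rather than PHP, and the function name is made up for illustration), a pair can be folded back into a single code point with the standard UTF-16 arithmetic:

```python
def surrogate_pair_to_code_point(high, low):
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# First surrogate-pair entry in the list above.
code_point = surrogate_pair_to_code_point(0xD83D, 0xDEB4)
print(hex(code_point))  # 0x1f6b4
```

The resulting value is the code point that a sprite lookup would be keyed on, assuming the sprite is indexed by code point.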

jonnywilliamson commented 11 years ago

I'm on my phone at the moment so I can't tell for sure, but is that the full emoji sprite you're using, @shirioko?

If so I think you're missing loads of them. Check out my last commits to get another sprite. Hope that helps.

shirioko commented 11 years ago

I think so too, I remember having a lot more emojis on my Android device than on WP7 which I'm using right now.

Terrorhawk commented 11 years ago

I think it's per device.

http://i.imgur.com/2fDqO.jpg

Here is a link to the iOS 6 emoji.

koenk commented 11 years ago

The emoji are actually part of the Unicode standard, although most platforms only support a small selection of them (https://en.wikipedia.org/wiki/Emoji). You can easily extract all .png files (48x48 px) from the WhatsApp apk; each filename corresponds to the character code (e.g. e537.png). The only problem is copyright stuff...
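For the message board use case, the filename convention suggests a simple substitution. The following is only a sketch in Python 3, not part of WhatsAPI: the function name, the `emoji` directory, and the idea of passing in a set of available codes are all assumptions made for illustration. It swaps every character that has a matching image for an `<img>` tag and HTML-escapes everything else.

```python
import html


def emoji_to_img(text, available_codes, image_dir="emoji"):
    """Replace each character that has a sprite file with an <img> tag.

    available_codes: set of lowercase hex code strings, e.g. {"e537"},
    matching the extracted file names without the ".png" suffix.
    image_dir: hypothetical directory the images are served from.
    """
    parts = []
    for char in text:
        code = format(ord(char), "x")
        if code in available_codes:
            parts.append('<img src="{}/{}.png" alt="">'.format(image_dir, code))
        else:
            # Escape ordinary text so it is safe to put into an HTML page.
            parts.append(html.escape(char))
    return "".join(parts)


print(emoji_to_img("hi \ue537!", {"e537"}))
# hi <img src="emoji/e537.png" alt="">!
```

This only works if the message has already been decoded as UTF-8 into real characters; on raw, wrongly decoded bytes nothing would match.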

Terrorhawk commented 11 years ago

What I don't get is:

I receive this (well, it is printed in the shell):

This should be the pile of poo emoji (1F4A9). But how do I get from those 2 chars to that Unified code?

koenk commented 11 years ago

Encodings. Unicode text is often encoded as UTF-8. In plain old ASCII, and in its single-byte extensions such as Latin-1, every byte is one character. A single byte can only distinguish 256 values, though, so UTF-8 encodes each character as a sequence of one to four bytes. If you take a string (which is nothing more than a bunch of bytes) and apply the wrong encoding, such as a single-byte encoding like Latin-1 (ISO-8859-1) instead of UTF-8, you'll see every byte as an individual character.
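A small Python 3 sketch of that effect, using "é" as an arbitrary example character: decoding UTF-8 bytes as Latin-1 yields one character per byte, and because Latin-1 maps every byte to exactly one character, the mistake can be undone.

```python
original = "\u00e9"                 # "é", a single character
raw = original.encode("utf-8")      # two bytes: 0xC3 0xA9

# Wrong: a single-byte encoding turns every byte into its own character.
garbled = raw.decode("latin-1")
print(len(original), len(raw), len(garbled))  # 1 2 2
print(garbled)                                # Ã©

# Going back through Latin-1 recovers the original bytes, which can then
# be decoded with the correct encoding.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)                   # True
```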

The 'pile of poo' emoji has the code e05a here (Python 2 interactive session for demonstration):

>>> u"\ue05a"
u'\ue05a'
>>> u"\ue05a".encode('utf-8')
'\xee\x81\x9a'
>>> print "\xee"
î
>>> print "\x81"
 (note: this seems to be an invalid character)
>>> print "\x9a"
š

And when I look up e05a.png (extracted from apk) it is the pile of poo as expected. Anyway, you should use the correct encoding, but I have absolutely no idea how PHP works with that. If you see those characters in your browser the page should be set to utf-8 too, I think... I tend to avoid PHP and webdev as much as possible ;)
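Going the other way, from received bytes to an image name, might look like the Python 3 sketch below. The function and table names are invented for illustration. The thread gives both e05a and 1F4A9 for this emoji; since there is no arithmetic relation between the two, the sketch assumes a lookup table, filled here with only that one pairing.

```python
# The only pairing mentioned in this thread; a full table would need one
# entry per emoji and is not reproduced here.
PRIVATE_TO_UNIFIED = {0xE05A: 0x1F4A9}


def describe(raw_bytes):
    """Decode UTF-8 bytes; return (sprite file name, unified code or None) per character."""
    results = []
    for char in raw_bytes.decode("utf-8"):
        code = ord(char)
        unified = PRIVATE_TO_UNIFIED.get(code)
        results.append((
            "{:04x}.png".format(code),
            "{:X}".format(unified) if unified is not None else None,
        ))
    return results


print(describe(b"\xee\x81\x9a"))  # [('e05a.png', '1F4A9')]
```

Note that this names a file for every character, including ordinary letters, so a real filter would still need to check which files exist.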