pento / x1f4a9

Twitter Emoji for WordPress
https://wordpress.org/plugins/x1f4a9/

Backwards compatibility with non-emojified character sets #1

Closed pento closed 9 years ago

pento commented 9 years ago

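Some character sets in MySQL don't support emoji: `utf8`, for example, stores at most three bytes per character, while most emoji need four bytes in UTF-8. It's a fairly sad state of affairs. If we convert the emoji to their HTML entity equivalents before saving, we can make everyone happy!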

Here's a function I wrote to do it:

/**
 * Convert any 4-byte emoji in a string to their equivalent HTML entity.
 *
 * This allows us to store emoji in a DB using the utf8 character set.
 *
 * @since 4.2.0
 * @param  string $content The content to encode
 * @return string The encoded content
 */
function wp_encode_emoji( $content ) {
    if ( function_exists( 'mb_convert_encoding' ) ) {
        $regex = '/(
              \x23\xE2\x83\xA3               # Digits
            | [\x30-\x39]\xE2\x83\xA3
            | \xF0\x9F[\x85-\x88][\xB0-\xBF] # Enclosed characters
            | \xF0\x9F[\x8C-\x97][\x80-\xBF] # Misc
            | \xF0\x9F\x98[\x80-\xBF]        # Smilies
            | \xF0\x9F\x99[\x80-\x8F]
            | \xF0\x9F\x9A[\x80-\xBF]        # Transport and map symbols
            | \xF0\x9F\x9B[\x80-\x85]
        )/x';
        $matches = array();
        if ( preg_match_all( $regex, $content, $matches ) ) {
            if ( ! empty( $matches[1] ) ) {
                // Each distinct match only needs to be replaced once.
                foreach ( array_unique( $matches[1] ) as $emoji ) {
                    $unpacked = unpack( 'H*', mb_convert_encoding( $emoji, 'UTF-32', 'UTF-8' ) );
                    if ( isset( $unpacked[1] ) ) {
                        // UTF-32 gives 8 hex digits per code point; keycaps span two code points.
                        $entity = '';
                        foreach ( str_split( $unpacked[1], 8 ) as $code_point ) {
                            // ltrim(), not trim(): trailing zeros are significant (e.g. U+1F600).
                            $entity .= '&#x' . ltrim( $code_point, '0' ) . ';';
                        }
                        $content = str_replace( $emoji, $entity, $content );
                    }
                }
            }
        }
                }
            }
        }
    }

    return $content;
}
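
To make the conversion step concrete, here is a minimal standalone sketch of what happens to a single matched emoji. It assumes the mbstring extension is loaded, and `example_code_points_to_entities()` is a hypothetical name used only for illustration; it is not part of the plugin or of WordPress.

```php
<?php
// Hypothetical helper for illustration only (not part of the plugin).
function example_code_points_to_entities( $utf8 ) {
    // UTF-32BE uses exactly four bytes (eight hex digits) per code point.
    $hex = bin2hex( mb_convert_encoding( $utf8, 'UTF-32BE', 'UTF-8' ) );

    $entities = '';
    foreach ( str_split( $hex, 8 ) as $code_point ) {
        // Strip only the leading zero padding; trailing zeros belong to the code point.
        $entities .= '&#x' . ltrim( $code_point, '0' ) . ';';
    }

    return $entities;
}

$poo      = "\xF0\x9F\x92\xA9"; // U+1F4A9, four bytes in UTF-8
$grinning = "\xF0\x9F\x98\x80"; // U+1F600, ends in zeros

$poo_length      = strlen( $poo );                                // 4
$poo_entity      = example_code_points_to_entities( $poo );       // &#x1f4a9;
$grinning_entity = example_code_points_to_entities( $grinning );  // &#x1f600;
```

The second case is why the padding should be stripped from the left only: trimming zeros from both ends would turn `0001f600` into `1f6`, which is a different character.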

pento commented 9 years ago

There's currently no hook to apply this function to options. We may add an appropriate hook, pending discussion on this ticket.

In the meantime, this ticket is a good place to discuss any other fields that should support HTML-encoded emoji. The only rule is that the field must be used in HTML: I have no desire to decode the emoji at some later point.
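
A small sketch of why the HTML-only rule matters, using an assumed example value. Here `html_entity_decode()` merely stands in for what a browser does when it renders the field; nothing in the plugin calls it.

```php
<?php
// Assumed example: a blog name as it would be stored after encoding.
$stored = 'My blog &#x1f4a9;';

// HTML context: the browser resolves the entity for us.
// html_entity_decode() simulates that step here.
$rendered = html_entity_decode( $stored, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

// Non-HTML context (say, a plain-text email subject): nothing resolves
// the entity, so the reader sees the literal characters "&#x1f4a9;".
$plain_text = $stored;
```

Any field that is also output outside HTML would need an explicit decode step like the one simulated above, which is exactly the extra work the rule avoids.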

pento commented 9 years ago

I'm inclined to add support only for blog name and blog title. We can add others in the future, if there's demand.

pento commented 9 years ago

I'm fine with this. Let's do it.