iamcal / emoji-data

Easy to parse data and spritesheets for emoji
MIT License

Fix invalid utf8 json encode failure #196

Closed jwheare closed 3 years ago

jwheare commented 3 years ago

I noticed an issue while running the build script: the "rescue worker’s helmet" line parsed from emoji-sequences.txt was producing invalid UTF-8, which made json_encode fail (leaving emoji.json and the pretty files empty).

This fixes that by using mb_strtoupper instead of StrToUpper in build_map.php. The specific line that fixed it was:

https://github.com/iamcal/emoji-data/blob/045732a02b2efbc3ee24ce36d598caba28f7f909/build/build_map.php#L767
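
For illustration, a minimal sketch of why the multibyte call is the safer choice here. It assumes the mbstring extension is loaded; the variable names are hypothetical and not taken from build_map.php. In PHP 7, strtoupper converts byte by byte according to the current locale, so on some platforms it may alter bytes inside a multi-byte UTF-8 sequence (that is a plausible cause, not something confirmed in this thread). mb_strtoupper with an explicit encoding works on whole characters instead:

```php
<?php
// Illustrative input: the sequence name mentioned above, with a curly apostrophe (U+2019).
$name = "rescue worker\u{2019}s helmet";

// mb_strtoupper operates on characters in the given encoding, not on single bytes,
// so the three-byte apostrophe is passed through unchanged.
$upper = mb_strtoupper($name, 'UTF-8');

echo $upper, "\n";

// Sanity checks: the result is still valid UTF-8 and json_encode accepts it.
var_dump(mb_check_encoding($upper, 'UTF-8'));
var_dump(json_encode($upper) !== false);
```

Passing 'UTF-8' explicitly avoids depending on the mbstring internal encoding setting.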

Also added error logging for failed json_encode calls, and the JSON files are no longer blanked when that happens.
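
Roughly, the guard looks like the sketch below. This is a simplified illustration, not the exact code in the PR: the helper name write_json_or_keep and its signature are hypothetical. It relies on json_encode returning false on failure and on json_last_error_msg describing the reason:

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
// Encodes $data and writes it to $path, but leaves any existing file
// untouched when encoding fails.
function write_json_or_keep(string $path, $data, int $flags = 0): bool {
    $json = json_encode($data, $flags);

    if ($json === false) {
        // json_last_error_msg() reports why the last encode failed,
        // for example because of malformed UTF-8 in a string.
        error_log("json_encode failed for {$path}: " . json_last_error_msg());
        return false;
    }

    return file_put_contents($path, $json) !== false;
}

// Usage: a truncated UTF-8 sequence makes json_encode fail, so nothing is written.
$target = tempnam(sys_get_temp_dir(), 'emoji');
file_put_contents($target, '{"ok":true}');

$bad_result  = write_json_or_keep($target, ['name' => "\xE2\x9B"]);
$kept        = file_get_contents($target);
$good_result = write_json_or_keep($target, ['name' => 'HELMET']);
$written     = file_get_contents($target);
```

Checking the return value before writing is what keeps a bad build from replacing a good emoji.json with an empty file.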

iamcal commented 3 years ago

thanks for the catch and fix. i'm not getting the same result while building, which makes me suspect this was a PHP 7 change

jwheare commented 3 years ago

Ah quite possibly. FWIW:

php -v
PHP 7.3.11 (cli) (built: Jun  5 2020 23:50:40) ( NTS )

iamcal commented 3 years ago

$ php -v
PHP 7.4.3 (cli) (built: Oct  6 2020 15:47:56) ( NTS )

I can't find anything in the changelog that suggests why this is. Might be locale?

$ php -r 'echo setlocale(LC_CTYPE, 0)."\n";'
en_US.UTF-8

jwheare commented 3 years ago

en_GB.UTF-8

iamcal commented 3 years ago

hmm - no idea then ¯\_(ツ)_/¯