algolia / we-rate-tweets

A colorful timeline search that puts your top tweets first
https://we-rate-tweets.glitch.me/

Fix emoji display issues #3

Closed joshed-io closed 7 years ago

joshed-io commented 7 years ago

We get some weird unicode characters in the results, from emojis that were indexed. ���

The best thing to do for now might be to remove emojis from the tweets at indexing time. Anyway, it will make the rating emoji stand out :)

JessicaG commented 7 years ago

We could use this: https://www.npmjs.com/package/emoji-strip

Haroenv commented 7 years ago

Only strip the emoji that are 2+ bytes; the other ones work perfectly with Algolia.

joshed-io commented 7 years ago

Thanks for the tip Mr. @Haroenv! So then... how to detect those chars... is there a regex that matches >2 byte chars?

Haroenv commented 7 years ago

https://github.com/mathiasbynens/emoji-regex and strip those with a length over 2
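
A rough sketch of that idea, for illustration only. To stay dependency-free it does not use `emoji-regex`; it swaps in a simplified pattern built from Unicode property escapes, which covers fewer cases (no flags or keycaps, for example). The names `EMOJI_SEQUENCE`, `stripLongEmoji`, and `maxLength` are hypothetical and are not taken from the we-rate-tweets code.

```typescript
// Simplified stand-in for emoji-regex (an assumption, not the real package):
// one pictographic character, optionally followed by a variation selector or
// skin-tone modifier, optionally chained to more of the same with zero-width
// joiners (U+200D).
const EMOJI_SEQUENCE = new RegExp(
  "\\p{Extended_Pictographic}(?:\\uFE0F|\\p{Emoji_Modifier})?" +
    "(?:\\u200D\\p{Extended_Pictographic}(?:\\uFE0F|\\p{Emoji_Modifier})?)*",
  "gu"
);

// Remove only the emoji sequences whose string length exceeds maxLength,
// and leave shorter ones untouched.
function stripLongEmoji(text: string, maxLength: number = 2): string {
  return text.replace(EMOJI_SEQUENCE, (match: string) =>
    match.length > maxLength ? "" : match
  );
}

// Hypothetical usage at indexing time:
// a thumbs-up with a skin-tone modifier has length 4, so it is removed.
const cleaned = stripLongEmoji("nice \uD83D\uDC4D\uD83C\uDFFD work");
console.log(cleaned);
```

Note that `length` on a JavaScript string counts UTF-16 code units, not bytes, so the threshold of 2 keeps a single astral-plane emoji (length 2) and drops modifier and joiner sequences. Whether that cutoff matches what Algolia handles cleanly is an assumption to verify against real tweets.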

joshed-io commented 7 years ago

Stripping all of them for now using that regex, but I'd welcome a better solution for preserving the single-byte