mlomb / chat-analytics

Generate interactive, beautiful and insightful chat analysis reports
https://chatanalytics.app
GNU Affero General Public License v3.0
638 stars 47 forks

Handle Unicode escape sequences for emojis and symbols #115

Open ShortTimeNoSee opened 1 month ago

ShortTimeNoSee commented 1 month ago

Currently, when a data export contains emojis written as Unicode escape sequences (e.g., \u00f0\u009f\u0098\u00ad), they are not handled correctly during analysis, which produces incorrect emoji statistics.
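For context, the four escapes in that example are the UTF-8 bytes F0 9F 98 AD of a single emoji (U+1F62D), each written as its own code point. A minimal sketch of one way to undo this, assuming every character in the affected string is really one byte: `fixMojibake` is a hypothetical helper name, not an existing chat-analytics function, and where it would hook into the parsers is left open.

```typescript
// Hypothetical helper (not part of chat-analytics).
// Treats each char of `input` as one byte and decodes the bytes as UTF-8.
function fixMojibake(input: string): string {
    const bytes = new Uint8Array(input.length);
    for (let i = 0; i < input.length; i++) {
        const code = input.charCodeAt(i);
        // A char above 0xFF cannot be a single byte, so this string
        // is not byte-encoded: return it unchanged.
        if (code > 0xff) return input;
        bytes[i] = code;
    }
    try {
        // fatal: true makes decode() throw on invalid UTF-8
        // instead of silently inserting U+FFFD.
        return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    } catch {
        // Not valid UTF-8 (e.g. genuine Latin-1 text): keep the original.
        return input;
    }
}

// The escape sequence from this issue, as it looks after JSON parsing.
const raw = JSON.parse('"\\u00f0\\u009f\\u0098\\u00ad"') as string;
const fixed = fixMojibake(raw);
```

Here `raw` has four characters and `fixed` is the single emoji U+1F62D. Strings that are already correct pass through unchanged, because they either contain a character above 0xFF or fail strict UTF-8 decoding.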

ShortTimeNoSee commented 1 month ago

There is also an issue with symbols:

image

This comes back as "donâ" when it's supposed to be "don't" (except their apostrophe is the curly kind, so "don’t"). So we get stuff like this as a result lol:

image
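A likely explanation for "donâ", sketched under assumptions: the curly apostrophe U+2019 is the UTF-8 bytes E2 80 99, and when each byte is read as its own character the first becomes "â" while the other two become invisible control characters. The tokenizer below is hypothetical (a plain letters-only regex) and is not how chat-analytics necessarily splits words; `toMojibake` is likewise an illustrative name.

```typescript
// Illustration only: reproduces the suspected corruption.
// Encodes text as UTF-8, then maps each byte to one character.
function toMojibake(text: string): string {
    const bytes = new TextEncoder().encode(text);
    return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// "don’t" becomes "don" + U+00E2 U+0080 U+0099 + "t".
const broken = toMojibake("don\u2019t");

// Hypothetical tokenizer: keep runs of Unicode letters only.
// U+0080 and U+0099 are control characters, so the word is cut after "â".
const words = broken.match(/\p{L}+/gu) ?? [];
```

With this tokenizer `words` holds two entries, "donâ" and "t", which matches the symptom described above.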