VitorLuizC / normalize-text

šŸ“ Provides a simple functions to normalize texts, whitespaces, paragraphs & diacritics.
MIT License
63 stars 6 forks

Emoji Normalization Feature #22

Open mechamobau opened 10 months ago

mechamobau commented 10 months ago

I was thinking about this lib and there's a growing need to handle emojis effectively in text normalization. This feature would convert emojis into their corresponding textual descriptions, making the text more comprehensible and analyzable, especially when processing social media content or informal communications.

Use Case: Often, emojis are used in texts to convey emotions or actions that are not captured by plain text. Normalizing these into words can aid in sentiment analysis, text-to-speech applications, and in contexts where emojis are not supported or are less meaningful.

Implementation Idea: We could create a mapping of commonly used emojis to their respective descriptive phrases. The normalization function should then detect these emojis in the text and replace them with the mapped phrases.
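The mapping-and-replace idea above could look roughly like the sketch below. It is only an illustration: `EMOJI_DESCRIPTIONS` and `describeEmojis` are hypothetical names, not part of normalize-text, and the three-entry table stands in for a much larger mapping.

```typescript
// Hypothetical emoji-to-description table (assumption: a real one would be far larger).
const EMOJI_DESCRIPTIONS = new Map<string, string>([
  ["\u{1F47D}", "alien"], // alien emoji
  ["\u{1F680}", "rocket"], // rocket emoji
  ["\u{1F41B}", "bug"], // bug emoji
]);

// Replaces each mapped emoji with its description.
// Limitation: only single code point emojis are handled; ZWJ sequences,
// skin-tone modifiers and variation selectors would need extra work.
function describeEmojis(text: string): string {
  // Array.from iterates by code point, so emojis outside the BMP stay intact.
  return Array.from(text)
    .map((char) => {
      const description = EMOJI_DESCRIPTIONS.get(char);
      // Pad with spaces so adjacent emojis become separate words.
      return description === undefined ? char : ` ${description} `;
    })
    .join("")
    .replace(/ {2,}/g, " ") // collapse the padding added above
    .trim();
}

console.log(describeEmojis("Seek knowledge \u{1F47D}\u{1F680}"));
// "Seek knowledge alien rocket"
```

Unmapped characters pass through unchanged, so the function is safe to run on text that mixes known and unknown emojis.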

The Gitmoji project could serve as a reference, since it lists every emoji and code that can be used in commit messages. This feature can adapt that list to its own context (e.g. Gitmoji uses :bug: for commits that fix bugs; here :insect: or something similar could be used instead). GitHub also has its own text-to-emoji cheat sheet.

Potential Challenges:

Benefits:

I believe this feature would be a valuable addition to the 'normalize-text' project, helping people who maintain apps that receive emoji codes and need to handle the emoji accordingly.

mechamobau commented 10 months ago

What I was thinking for the API is something like this:

normalizeEmoji(`Seek knowledge :alien::rocket:`) // "Seek knowledge 👽🚀"

This is merely a suggestion and might not align perfectly with the project's objectives.
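A minimal sketch of how the proposed `normalizeEmoji` call might behave, assuming a small shortcode table. `SHORTCODE_TO_EMOJI` is a hypothetical name and its two entries are placeholders; a fuller list could be derived from Gitmoji or the GitHub cheat sheet.

```typescript
// Hypothetical shortcode table (assumption: not an existing normalize-text export).
const SHORTCODE_TO_EMOJI = new Map<string, string>([
  ["alien", "\u{1F47D}"], // alien emoji
  ["rocket", "\u{1F680}"], // rocket emoji
]);

// Replaces `:name:` shortcodes with the mapped emoji.
// Unknown shortcodes are left untouched, so text such as "10:30:45" survives.
function normalizeEmoji(text: string): string {
  return text.replace(
    /:([a-z0-9_+-]+):/g,
    (match: string, name: string) => SHORTCODE_TO_EMOJI.get(name) ?? match,
  );
}

console.log(normalizeEmoji("Seek knowledge :alien::rocket:"));
// "Seek knowledge 👽🚀"
```

Leaving unknown shortcodes as-is is a design choice in this sketch; stripping them or throwing would be equally valid alternatives, depending on what the project prefers.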