cbaziotis / ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
MIT License
660 stars 91 forks

If the hashtag or word is in all caps, lowercase it #23

Open mingfengwan opened 4 years ago

mingfengwan commented 4 years ago

Example: "IMUSTGO" becomes "imustgo". This allows the current segmenter to split hashtags that are in all caps.
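
A minimal sketch of the proposed behavior, not the library's actual implementation. The helper names `lower_if_all_caps` and `segment_hashtag` are hypothetical, and the segmenter is passed in as a plain callable so the example stays self-contained.

```python
from typing import Callable


def lower_if_all_caps(token: str) -> str:
    """Lowercase a token only when every cased character in it is uppercase."""
    # str.isupper() is True only if the token has at least one cased
    # character and all cased characters are uppercase, so "IMUSTGO"
    # qualifies while "iMustGo" and "123" are returned unchanged.
    return token.lower() if token.isupper() else token


def segment_hashtag(hashtag: str, segment: Callable[[str], str]) -> str:
    """Strip leading '#', normalize all-caps text, then call the segmenter.

    `segment` is any function mapping an unsegmented string to a
    segmented one (hypothetical stand-in for the real segmenter).
    """
    body = hashtag.lstrip("#")
    return segment(lower_if_all_caps(body))


if __name__ == "__main__":
    # Identity function used as a placeholder segmenter.
    print(segment_hashtag("#IMUSTGO", lambda s: s))
```

Mixed-case hashtags are left untouched here because their capitalization may carry word-boundary information. Assuming ekphrasis is installed, the `segment` method of its `Segmenter` class could presumably be passed as the `segment` argument.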