thisandagain / sentiment

AFINN-based sentiment analysis for Node.js.
MIT License
2.64k stars, 310 forks

Allow tokenizer to split emoji without space delimiter #113

Open thisandagain opened 7 years ago

thisandagain commented 7 years ago

PR #74 initially included a change that used spliddit's hasPair method to split emoji that are not separated by a space or other delimiter. While useful, that change had fairly serious performance consequences. I think the functionality is worth including, but it should be optional.
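For illustration, an opt-in version might look roughly like the sketch below. This is not the library's actual tokenizer or the code from PR #74: the `tokenize` function and the `splitEmoji` option are hypothetical names, and a plain regular expression stands in for spliddit's hasPair check. With the option off, the extra pass is skipped entirely, so callers who don't need it pay no additional cost.

```javascript
// Hypothetical sketch, not the sentiment library's real tokenizer.
// Matches one UTF-16 surrogate pair: a high surrogate followed by a low surrogate.
const SURROGATE_PAIR = /([\uD800-\uDBFF][\uDC00-\uDFFF])/g;

function tokenize(input, options = {}) {
  let text = input.toLowerCase();
  if (options.splitEmoji) {
    // Opt-in step: pad every surrogate pair with spaces so that it
    // becomes its own token even when no delimiter was present.
    text = text.replace(SURROGATE_PAIR, ' $1 ');
  }
  // Split on whitespace and drop the empty strings left by the padding.
  return text.split(/\s+/).filter(Boolean);
}
```

By default `tokenize('happy\u{1F600}\u{1F600}')` keeps the two emoji attached to the word; passing `{ splitEmoji: true }` yields the word and each emoji as separate tokens.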

thorbert commented 7 years ago

I think the mod in PR #74 needs a better implementation anyway. We won't want to split only on surrogate pairs; sometimes we'll probably want to split where there isn't a pair.
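To make the non-pair case concrete: a character such as U+2764 (heavy black heart) is a single UTF-16 code unit, so a surrogate-pair check never sees it. One possible approach, sketched below under the assumption that the runtime supports Unicode property escapes (recent Node.js versions do), is to match on the `Extended_Pictographic` property instead. The `splitPictographs` name is hypothetical, and the sketch deliberately ignores variation selectors and zero-width-joiner sequences, which a real implementation would need to handle.

```javascript
// Hypothetical sketch: split on pictographic code points whether or not
// they are encoded as a surrogate pair. The `u` flag makes the regex
// operate on whole code points rather than UTF-16 code units.
const PICTOGRAPH = /(\p{Extended_Pictographic})/gu;

function splitPictographs(text) {
  // Pad each pictograph with spaces, then split on whitespace and
  // discard the empty strings the padding leaves behind.
  return text.replace(PICTOGRAPH, ' $1 ').split(/\s+/).filter(Boolean);
}

// U+2764 occupies one code unit; U+1F600 occupies two (a surrogate pair).
const heartUnits = '\u2764'.length;
const faceUnits = '\u{1F600}'.length;
```

Both `splitPictographs('love\u2764you')` and `splitPictographs('ok\u{1F600}')` separate the symbol from the adjacent text, even though only the second involves a surrogate pair.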