GateNLP / gateplugin-Twitter

A suite of tools designed for processing Tweets
GNU Lesser General Public License v3.0

We don't currently support symbol entities #7

Closed: greenwoodma closed this issue 5 years ago

greenwoodma commented 5 years ago

Twitter annotates stock symbols (such as $APPL) in tweets in the same way as hashtags etc., but we currently ignore the symbol entities in the JSON and also fail to recognise them in plain text that is treated as a tweet.

JSON Documentation: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/entities-object#symbols

The regex seems to be \$[a-z]{1,6}([._][a-z]{1,2})? which we could easily convert to JAPE (see https://blog.twitter.com/developer/en_us/a/2013/symbols-entities-tweets.html)
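
For anyone wanting to sanity-check the pattern before writing the JAPE rule, here is a minimal Python sketch. It is not part of the plugin: the function name find_symbols is hypothetical, and case-insensitive matching is an assumption on my part, since the pattern as quoted only lists lower-case letters while tickers are normally written in upper case.

```python
import re

# Pattern as quoted in the issue. IGNORECASE is an assumption: the quoted
# pattern uses [a-z], but symbols such as $APPL are normally upper case.
SYMBOL_RE = re.compile(r"\$[a-z]{1,6}(?:[._][a-z]{1,2})?", re.IGNORECASE)


def find_symbols(text):
    """Return a list of (start, end, symbol) tuples for each candidate symbol.

    Hypothetical helper for illustration only; the offsets are character
    offsets into `text`, similar in spirit to annotation start/end offsets.
    """
    return [(m.start(), m.end(), m.group()) for m in SYMBOL_RE.finditer(text)]


if __name__ == "__main__":
    for start, end, symbol in find_symbols("Buying $APPL and $BRK.B today"):
        print(start, end, symbol)
```

Note that the pattern on its own has no boundary check, so a longer run of letters after the $ would be matched only up to the first six characters. A real rule would probably want to guard against that.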