GateNLP / gateplugin-Twitter

A suite of tools designed for processing Tweets
GNU Lesser General Public License v3.0

Should we process flags? #1

Open drj11 opened 5 years ago

drj11 commented 5 years ago

Flags, encoded as Unicode composite sequences, are getting quite popular on Twitter and might be useful to recognise as national or linguistic entities.

Example: https://twitter.com/wimlds/status/1135560925149351936

ianroberts commented 5 years ago

A very simple start on this could be to add a gazetteer of the standard flag Unicode sequences mapped to their ISO codes.
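As a rough illustration of that gazetteer idea, the sketch below builds flag sequences from ISO 3166-1 alpha-2 codes by mapping each letter to its regional indicator symbol (U+1F1E6 is the indicator for "A"). The function names, the three sample codes, the tab separator and the `iso` feature name are all assumptions made for the example, not anything defined by this plugin; the separator in particular should match whatever the target gazetteer is configured to use.

```python
# Sketch only: generate gazetteer-style lines mapping flag emoji to ISO codes.
# Names, separator and the "iso" feature are hypothetical choices for illustration.

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def iso_to_flag(code):
    """Turn a two-letter ISO code such as 'GB' into its regional indicator pair."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter code, got %r" % code)
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


def gazetteer_lines(codes, sep="\t"):
    """Return one 'flag<sep>iso=CODE' line per code."""
    return ["%s%siso=%s" % (iso_to_flag(c), sep, c.upper()) for c in codes]


if __name__ == "__main__":
    # A small sample; a real list would cover every code with a flag emoji.
    for line in gazetteer_lines(["GB", "US", "FR"]):
        print(line)
```

Generating the list from the codes, rather than pasting emoji by hand, keeps the entries consistent, but it also produces sequences for any two letters, so the input list of codes decides which flags are treated as real.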

greenwoodma commented 5 years ago

Turns out flags are actually a lot more complex than I thought: https://blog.emojipedia.org/emoji-flags-explained/

Should be easy enough to cover the basic cases, either with a gazetteer or with some JAPE that finds all valid combinations, so we can mark them as flags even if we can't determine what they represent.
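The "find all valid combinations" approach could look something like the sketch below, written in Python rather than JAPE purely for illustration. It recognises two structural forms: a pair of regional indicator symbols (U+1F1E6 to U+1F1FF), and a black flag (U+1F3F4) followed by tag letters or digits and a cancel tag (U+E007F), which is how flags such as England's are encoded. It deliberately ignores ZWJ-based flags such as the rainbow flag, and the names `FLAG_PATTERN`, `describe` and `find_flags` are hypothetical.

```python
# Sketch only: mark structurally valid flag sequences without a lookup table.
import re

FLAG_PATTERN = re.compile(
    # Two regional indicator symbols, e.g. U+1F1EB U+1F1F7 for "FR".
    "[\U0001F1E6-\U0001F1FF]{2}"
    # Black flag + tag digits/lowercase letters + cancel tag.
    "|\U0001F3F4[\U000E0030-\U000E0039\U000E0061-\U000E007A]+\U000E007F"
)


def describe(flag):
    """Return (kind, decoded) where decoded is the ASCII text the sequence spells."""
    if flag[0] == "\U0001F3F4":
        # Tag characters mirror ASCII at an offset of 0xE0000.
        return "tag", "".join(chr(ord(c) - 0xE0000) for c in flag[1:-1])
    return "pair", "".join(chr(ord(c) - 0x1F1E6 + ord("A")) for c in flag)


def find_flags(text):
    """Return (start, end, kind, decoded) for every flag-shaped sequence in text."""
    return [(m.start(), m.end()) + describe(m.group())
            for m in FLAG_PATTERN.finditer(text)]
```

Because the pattern only checks the shape of the sequence, it will also match pairs that do not correspond to any assigned code; the decoded string can then be checked against a gazetteer to decide what, if anything, the flag represents. Offsets here are in code points, which may differ from the offsets a Java-based pipeline would report.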