GateNLP / gateplugin-Twitter

A suite of tools designed for processing Tweets
GNU Lesser General Public License v3.0

Token offsets don't align with @mention #3

Closed: greenwoodma closed this issue 5 years ago

greenwoodma commented 5 years ago

If someone uses an @mention like a normal word and gives it an apostrophe ending, the tokenization gets messed up.

For example, @sadiqkhan't should have a token over the @sadiqkhan portion, but the token stops short and misses the final n, which instead becomes part of a separate n't token.
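To make the offsets concrete, here is a minimal Python sketch that reproduces the misalignment with a toy regex tokenizer. This is not the plugin's actual tokenizer or rule set: the patterns `NAIVE` and `MENTION_FIRST` and the `tokenize` helper are hypothetical names made up for illustration. It only shows that trying the n't contraction rule before the @mention rule steals the final n, while trying the @mention rule first keeps the whole handle in one token. How the leftover 't should be tokenized is an assumption here, not something this issue specifies.

```python
import re

# Hypothetical toy rules, NOT the GATE Twitter plugin's real tokenizer.
# NAIVE tries the "n't" contraction split first, so it also fires inside a handle.
NAIVE = re.compile(r"@?\w+?(?=n't)|n't|@?\w+|'\w+|\S")

# MENTION_FIRST tries the whole @mention before any contraction rule.
MENTION_FIRST = re.compile(r"@\w+|\w+?(?=n't)|n't|\w+|'\w+|\S")


def tokenize(pattern, text):
    """Return (token_text, start_offset, end_offset) for each match."""
    return [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]


text = "@sadiqkhan't"

# The contraction rule cuts the handle short: the final "n" lands in "n't".
print(tokenize(NAIVE, text))
# [('@sadiqkha', 0, 9), ("n't", 9, 12)]

# With the mention rule first, one token covers the whole handle.
print(tokenize(MENTION_FIRST, text))
# [('@sadiqkhan', 0, 10), ("'t", 10, 12)]

# Ordinary contractions are still split the usual way.
print(tokenize(MENTION_FIRST, "don't"))
# [('do', 0, 2), ("n't", 2, 5)]
```

The design point is rule ordering: the @mention rule has to take priority over the contraction rule so that the token offsets line up with the handle.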