There are new models available on their website that we might want to integrate:
model.20120919 (2 MB) -- the Twitter POS model with our coarse 25-tag tagset.
This is included with the tagger release and used by default.
model.ritter_ptb_alldata_fixed.20130723 (1.5 MB) -- a model that gives a Penn
Treebank-style tagset for Twitter. Trained from a fixed version of Ritter et
al. EMNLP 2011's annotated data. If you want PTB-style POS tags for Twitter,
use this model. We documented issues and changes here. Also, here is an
accuracy evaluation to compare with other work.
model.irc.20121211 (3 MB) -- a model trained on the NPSChat IRC corpus, with a
PTB-style tagset.
Original issue reported on code.google.com by richard.eckart on 18 Sep 2013 at 3:47