aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com
Other
2.3k stars 337 forks

PoS tagger on possessive marker not handled well #177

Open devikasondhi opened 5 years ago

devikasondhi commented 5 years ago

Hello, the PoS tagger does not seem to take into account the presence of the possessive ending ('s):

from polyglot.text import Text

blob = "John's big idea isn't all that bad."
text = Text(blob)
text.pos_tags

[(u"John's", u'NUM'), (u'big', u'ADJ'), (u'idea', u'NOUN'), (u"isn't", u'CONJ'), (u'all', u'DET'), (u'that', u'DET'), (u'bad', u'ADJ'), (u'.', u'PUNCT')]

"John" is a proper noun and "'s" is a possessive marker; both should have been tagged accordingly.
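One possible workaround, sketched here as an assumption rather than anything polyglot provides, is to split clitics such as "'s" and "n't" off their host word before tagging, in the style of Penn Treebank tokenization. The helper name `split_clitics` and the clitic list are hypothetical, and it is untested whether polyglot's tagger labels the resulting pieces correctly.

```python
import re

# Hypothetical helper, not part of polyglot: separates common English
# clitics from the word they attach to (Penn Treebank-style splitting).
# Only the straight apostrophe (') is handled in this sketch.
_CLITIC = re.compile(r"^(.+?)(n't|'s|'re|'ve|'ll|'d|'m)$", re.IGNORECASE)

def split_clitics(tokens):
    """Return a new token list with clitics split into their own tokens."""
    out = []
    for tok in tokens:
        m = _CLITIC.match(tok)
        if m:
            # Host word first, then the clitic as a separate token.
            out.extend([m.group(1), m.group(2)])
        else:
            out.append(tok)
    return out

tokens = ["John's", "big", "idea", "isn't", "all", "that", "bad", "."]
print(split_clitics(tokens))
# ['John', "'s", 'big', 'idea', 'is', "n't", 'all', 'that', 'bad', '.']
```

With the tokens split this way, "John" and "'s" reach the tagger as separate items, so each could in principle receive its own tag instead of the combined "John's" being tagged as a single unit.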