emorynlp / nlp4j-old

NLP tools developed by Emory University.

Errors in POS Tagging #6

Closed henryYHC closed 8 years ago

henryYHC commented 8 years ago

Consider the following two parses:

    1   You you PRP _   2   nsubj   _
    2   know    know    VBP _   7   parataxis   _   O
    3   what    what    WP  _   2   ccomp   _   O
    4   ,   ,   ,   _   7   punct   _   O
    5   you you PRP _   7   nsubj   _
    6   ’ve   ’ve   NNP _   7   nsubj   _   U-PERSON
    7   convinced   convince    VBD _   0   root    _   O
    8   me  me  PRP _   7   dobj    _
    9   ,   ,   ,   _   7   punct   _   O
    10  maybe   maybe   RB  _   14  advmod  _   O
    11  tonight tonight NN  _   14  npadvmod    _   U-TIME
    12  we  we  PRP _   14  nsubj   _   O
    13  should  should  MD  _   14  aux _   O
    14  sneak   sneak   VB  _   7   ccomp   _   O
    15  in  in  RP  _   14  prt _   O
    16  and and CC  _   14  cc  _   O
    17  shampoo shampoo VB  _   14  conj    _   O
    18  her her PRP$    _   19  poss    _
    19  carpet  carpet  NN  _   17  dobj    _   O
    20  .   .   .   _   7   punct   _   O

    1   You you PRP _   4   nsubj   _
    2   do  do  VBP _   4   aux _   O
    3   n’t   n’t   PRP _   4   nsubj   _
    4   think   think   VB  _   0   root    _   O
    5   that    that    IN  _   6   nsubj   _   O
    6   crosses cross   VBZ _   4   ccomp   _   O
    7   a   a   DT  _   8   det _   O
    8   line    line    NN  _   6   dobj    _   O
    9   ?   ?   .   _   4   punct   _   O

The tokens ’ve and n’t seem to have the wrong POS tags (NNP and PRP, respectively) and are both attached as nsubj. This error recurs in conversational data.

jdchoi77 commented 8 years ago

This is caused by the smart quote (’) used in place of the ASCII apostrophe ('), which should be handled correctly as of version 1.1.1. Thanks.

best,

Jinho
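
For anyone on a version earlier than 1.1.1, one possible workaround is to replace typographic quotes with their ASCII equivalents before passing text to the tagger. The sketch below is only an illustration of that preprocessing step: `normalize_quotes` and `SMART_QUOTES` are hypothetical names, not part of NLP4J, and this is not necessarily how the fix in 1.1.1 is implemented.

```python
# Hypothetical preprocessing helper (not part of NLP4J): map typographic
# quotes to ASCII so that contractions such as ’ve and n’t reach the
# tokenizer in the form 've and n't.
SMART_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (the one in ’ve and n’t)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def normalize_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ASCII quotes."""
    return text.translate(SMART_QUOTES)


if __name__ == "__main__":
    # The second sentence from the report, written with a smart quote.
    print(normalize_quotes("You don\u2019t think that crosses a line?"))
    # prints: You don't think that crosses a line?
```

The replacement is one character for one character, so token offsets computed on the normalized text still line up with the original input.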