Closed mallihee closed 9 years ago
Thanks for the report. I was able to reproduce this with some (but not all) parsing models. The bug comes down to the fact that some parsing models can't parse the one-word sentence "where" (though, for technical reasons, they can parse "where is"). The tag() method simply parses the text as if it were a full sentence and reads the tags off the first tree, so you'll get the best accuracy by giving it complete sentences. When the parse fails there is no first tree, which is what raises the IndexError. I've changed tag() so that it falls back to the most frequent POS tags in these cases, and added a flag (allow_failures=True) that disables the fallback, so you get an error when the parse fails in case you'd rather handle that differently.
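For anyone curious what that kind of fallback looks like, here is a rough, self-contained sketch of the idea. It is not the code from bllipparser: `tag_with_fallback`, `most_frequent_tags`, `toy_parse`, the `unknown_tag` default, and the toy corpus are all hypothetical names and data made up for illustration, and the choice of ValueError is an assumption. Only the `allow_failures` flag name and the "most frequent tag" idea come from the description above.

```python
from collections import Counter, defaultdict


def most_frequent_tags(tagged_sentences):
    """Map each lowercased token to the tag it carries most often."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for token, tag in sentence:
            counts[token.lower()][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def tag_with_fallback(parse, tokens, lexicon, allow_failures=False,
                      unknown_tag='NN'):
    """Tag tokens from the first parse, falling back when there is none.

    `parse` is a hypothetical callable returning a list of parses, each a
    list of (token, tag) pairs; an empty list means the parse failed.
    """
    parses = parse(tokens)
    if parses:
        return parses[0]
    if allow_failures:
        # The caller asked to see failures instead of getting fallback tags.
        raise ValueError('parse failed for %r' % (tokens,))
    # Fallback: most frequent tag per token, or unknown_tag if unseen.
    return [(token, lexicon.get(token.lower(), unknown_tag))
            for token in tokens]


# Toy stand-ins (hypothetical data, not taken from any real model).
corpus = [
    [('where', 'WRB'), ('is', 'VBZ'), ('it', 'PRP')],
    [('where', 'WRB'), ('are', 'VBP'), ('they', 'PRP')],
]
lexicon = most_frequent_tags(corpus)


def toy_parse(tokens):
    # Mimics the reported behavior: one-word input yields no parses.
    if len(tokens) < 2:
        return []
    return [[(t, lexicon.get(t.lower(), 'NN')) for t in tokens]]


print(tag_with_fallback(toy_parse, ['where'], lexicon))
print(tag_with_fallback(toy_parse, ['where', 'is'], lexicon))
```

With allow_failures=True the same one-word input raises instead of returning fallback tags, which is the behavior you'd want if a failed parse should be handled separately by the caller.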
In [3]: rrp.tag('where')
IndexError                                Traceback (most recent call last)
/home/DataSet/ in ()
----> 1 rrp.tag('where')

/usr/local/lib/python2.7/dist-packages/bllipparser/RerankingParser.py in tag(self, text_or_tokens)
    538         text_or_tokens can be either a string or a sequence of tokens."""
    539         parses = self.parse(text_or_tokens)
--> 540         return parses[0].ptb_parse.tokens_and_tags()
    541
    542     def _find_bad_tag_and_raise_error(self, tags):

IndexError: list index out of range
However, when I tag a phrase like 'where is', I get the expected result:
In [5]: rrp.tag('where is')
Out[5]: [('where', 'WRB'), ('is', 'VBZ')]