Closed thoppe closed 7 years ago
Trying to narrow it down, a much smaller single-threaded traceback is:
  File "/usr/local/lib/python2.7/dist-packages/nlpre/separated_parenthesis.py", line 92, in __call__
    text = self.paren_pop(tokens)
  File "/usr/local/lib/python2.7/dist-packages/nlpre/separated_parenthesis.py", line 110, in paren_pop
    content = self.paren_pop_helper(parsed_tokens)
  File "/usr/local/lib/python2.7/dist-packages/nlpre/separated_parenthesis.py", line 146, in paren_pop_helper
    sents = self.paren_pop_helper(tokens)
  File "/usr/local/lib/python2.7/dist-packages/nlpre/separated_parenthesis.py", line 136, in paren_pop_helper
    if token_words[-1] not in ['.', '!', '?']:
IndexError: list index out of range
Found it. Here is an MWE of the error. It's pathological, but it came up in real-world data (and absolutely shouldn't crash the program!).
import nlpre
doc = '''[[[ ]]]'''
nlpre.separated_parenthesis()(doc)
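For what it's worth, here is a minimal sketch of the kind of guard that would avoid the IndexError, assuming the crash happens because `token_words` ends up empty when the brackets contain only whitespace. The names `needs_terminal_period` and `SENTENCE_ENDINGS` are hypothetical and only for illustration; they are not part of nlpre, and the actual fix in `paren_pop_helper` may look different.

```python
# Hypothetical names for illustration only; not part of nlpre.
SENTENCE_ENDINGS = ('.', '!', '?')


def needs_terminal_period(token_words):
    """Return True if a non-empty token list does not end in punctuation."""
    # Guard first: an empty list (as '[[[ ]]]' presumably produces)
    # has no last element, so token_words[-1] would raise IndexError.
    if not token_words:
        return False
    return token_words[-1] not in SENTENCE_ENDINGS
```

Checking for the empty list before indexing means an empty parenthetical is simply skipped instead of crashing the whole run.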
This is a large traceback, but it's unfortunately all we get when running in parallel. It looks like the input to the function is truncated a bit too, so it's hard to tell what's going in.