BLLIP / bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
http://bllip.cs.brown.edu/

terminate called after throwing an instance of 'swig::stop_iteration' #42

Closed nahgnaw closed 8 years ago

nahgnaw commented 8 years ago

I ran into the following issue when trying to run the parser. Any idea what might cause this?

>>> from bllipparser import RerankingParser
>>> rrp = RerankingParser.fetch_and_load('WSJ-PTB3', verbose=True)
>>> rrp.simple_parse("It's that easy.")
terminate called after throwing an instance of 'swig::stop_iteration'
Aborted

Thanks!

dmcc commented 8 years ago

Thanks for the report! A couple questions: Are you using the latest version from GitHub? If so, which version of swig are you running?

Also, are you using Python 2 or 3?

nahgnaw commented 8 years ago

I tried installing bllipparser with pip, using the latest versions from both PyPI (version 2015/08/18) and GitHub (version 2015/10/15), but neither worked on my machine (Ubuntu, Python 2.7.6, swig 3.0.8) because of the swig::stop_iteration error above.

I also compiled from the source using the github version, and it worked with parse.sh. However, I'm using another tool (https://github.com/Juicechuan/AMRParsing) which installs bllipparser using pip.

Any suggestion how I can make this work?

dmcc commented 8 years ago

I'm using an older swig (2.0.12) on Ubuntu 14.10 with Python 2.7.8, so those are some potential differences. But, I'm wondering if SWIG isn't even the problem and the real problem is a parse failure (simple_parse might be trying to index into an empty list -- it should have some better checking). What happens when you try the following?

>>> from bllipparser import RerankingParser
>>> rrp = RerankingParser.fetch_and_load('WSJ-PTB3', verbose=True)
>>> nbest_list = rrp.parse("It's that easy.")
>>> len(nbest_list)
50

If the length doesn't match (len(nbest_list) might be 0), I'm also curious whether the same thing happens when you fetch_and_load other parsing models (try WSJ).
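The check above can be wrapped in a small guard. This is only a sketch: `first_parse_or_none` is a hypothetical helper, not part of bllipparser, and it assumes only that `parse()` returns a sequence-like n-best list, as in the session above. It cannot catch the `swig::stop_iteration` abort itself, because that terminates the process from C++ before Python sees any exception.

```python
def first_parse_or_none(parser, sentence):
    """Return the top entry of the n-best list, or None if nothing was parsed.

    `parser` is anything with a parse(sentence) method that returns a
    sequence, e.g. the RerankingParser instance `rrp` from the session above.
    (Hypothetical helper for illustration; not part of bllipparser.)
    """
    nbest_list = parser.parse(sentence)
    if len(nbest_list) == 0:
        # Parse failure: avoid indexing into an empty list.
        return None
    return nbest_list[0]
```

Calling `first_parse_or_none(rrp, "It's that easy.")` would then return `None` on a parse failure instead of raising an IndexError.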

nahgnaw commented 8 years ago

I got the same error (terminate called after throwing an instance of 'swig::stop_iteration', followed by Aborted) when using the parse method.

dmcc commented 8 years ago

I upgraded to swig version 3.0.2 but I'm still not getting the error. It looks like version 3.0.7 is the latest stable release for swig -- just to rule things out, could you try rebuilding with swig 3.0.7 (or earlier)? (You may need to run make real-clean first.)

If this doesn't help, it might be helpful to post your installed bllipparser egg directory so I can see how it differs from mine.

nahgnaw commented 8 years ago

I reinstalled swig 3.0.2, but I got the same error...

In my /usr/local/lib/python2.7/dist-packages/bllipparser directory, I have the following files:

52K -rw-r--r--  1 root staff  50K 2015-10-19 22:50 CharniakParser.py
108K -rw-r--r--  1 root staff 108K 2015-10-19 22:51 CharniakParser.pyc
7.9M -rwxr-xr-x  1 root staff 7.9M 2015-10-19 22:51 _CharniakParser.so
 16K -rw-r--r--  1 root staff  16K 2015-10-19 22:50 __init__.py
 16K -rw-r--r--  1 root staff  16K 2015-10-19 22:51 __init__.pyc
 12K -rw-r--r--  1 root staff 9.3K 2015-10-19 22:50 JohnsonReranker.py
 20K -rw-r--r--  1 root staff  19K 2015-10-19 22:51 JohnsonReranker.pyc
4.0M -rwxr-xr-x  1 root staff 4.0M 2015-10-19 22:51 _JohnsonReranker.so
4.0K -rw-r--r--  1 root staff  620 2015-10-19 22:50 __main__.py
4.0K -rw-r--r--  1 root staff  258 2015-10-19 22:51 __main__.pyc
8.0K -rw-r--r--  1 root staff 7.8K 2015-10-19 22:50 ModelFetcher.py
8.0K -rw-r--r--  1 root staff 7.9K 2015-10-19 22:51 ModelFetcher.pyc
 16K -rw-r--r--  1 root staff  13K 2015-10-19 22:50 ParsingShell.py
 12K -rw-r--r--  1 root staff  12K 2015-10-19 22:51 ParsingShell.pyc
 16K -rw-r--r--  1 root staff  14K 2015-10-19 22:50 RerankerFeatureCorpus.py
 20K -rw-r--r--  1 root staff  17K 2015-10-19 22:51 RerankerFeatureCorpus.pyc
 44K -rw-r--r--  1 root staff  42K 2015-10-19 22:50 RerankingParser.py
 44K -rw-r--r--  1 root staff  42K 2015-10-19 22:51 RerankingParser.pyc
4.0K -rw-r--r--  1 root staff 2.2K 2015-10-19 22:50 Utility.py
4.0K -rw-r--r--  1 root staff 2.5K 2015-10-19 22:51 Utility.pyc

In my /usr/local/lib/python2.7/dist-packages/bllipparser-2015.8.18-py2.7.egg-info directory, I have the following files:

4.0K -rw-r--r--  1 root staff    1 2015-10-19 22:51 dependency_links.txt
4.0K -rw-r--r--  1 root staff  703 2015-10-19 22:51 installed-files.txt
 20K -rw-r--r--  1 root staff  19K 2015-10-19 22:51 PKG-INFO
4.0K -rw-r--r--  1 root staff 1.8K 2015-10-19 22:51 SOURCES.txt
4.0K -rw-r--r--  1 root staff   12 2015-10-19 22:51 top_level.txt

Are these the directories you were talking about?

dmcc commented 8 years ago

Yes, those are the files. If you make a tar of both of those directories, I can compare them against other installs and hopefully figure out what's causing this a little better. Thanks and sorry for all the trouble!

nahgnaw commented 8 years ago

I sent those files to your Stanford email. Thank you for helping me with this!

dmcc commented 8 years ago

Thanks @nahgnaw, I got the files! I compared them against an install of bllipparser version 2015.08.18. The *.py files all match (binaries don't but that likely stems from different swig/compiler versions). Couldn't see anything obviously weird about the symbols in the .so files.

When I forced Python to use your bllipparser package, it didn't crash -- I think this means we can rule out the build system as the problem. Is it possible that you have multiple versions of bllipparser installed and they're getting scrambled? Or, maybe when I switched setup.py from distutils to setuptools it caused some problems.

One way to test is to make a virtual environment and install bllipparser there -- should isolate it from the rest of the installation. Please let me know if that helps. Sorry again for all the trouble...
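A quick way to look for multiple installed copies is to scan the import path. The sketch below is an assumption-laden illustration: `find_package_copies` is a hypothetical helper, and it only detects unpacked package directories (a directory containing `__init__.py`), not zipped eggs.

```python
import os
import sys


def find_package_copies(package_name, search_paths=None):
    """List every directory on the search path that holds a copy of the package.

    Hypothetical helper: only unpacked packages (a directory with an
    __init__.py) are detected; zipped eggs are ignored.
    """
    if search_paths is None:
        search_paths = sys.path
    copies = []
    for path in search_paths:
        # An empty entry on sys.path means the current directory.
        candidate = os.path.join(path or os.curdir, package_name)
        if os.path.isfile(os.path.join(candidate, '__init__.py')):
            real = os.path.realpath(candidate)
            if real not in copies:
                copies.append(real)
    return copies
```

If `find_package_copies('bllipparser')` returns more than one path, the first entry is the copy Python would normally import, and the others are candidates for removal.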

dmcc commented 8 years ago

Actually, there's one more sanity check which might be slightly easier than reinstalling bllipparser in a virtualenv -- python -v will trace imports. Tracing the example code might reveal if it's mixing files from multiple bllipparser versions.
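Reading a long `python -v` trace by eye is error-prone, so the directories can be pulled out mechanically. This is a minimal sketch, assuming the Python 2.7 trace format (`import X # precompiled from ...` and `import X # dynamically loaded from ...`); Python 3 prints these lines differently. `load_directories` is a hypothetical helper name.

```python
import posixpath
import re

# Matches Python 2.7 `python -v` lines such as
#   import pkg.mod # precompiled from /path/to/pkg/mod.pyc
#   import _ext # dynamically loaded from /path/to/pkg/_ext.so
LOADED_FROM = re.compile(
    r'^import (\S+) # (?:precompiled|dynamically loaded) from (.+)$')


def load_directories(trace_text, package_name):
    """Return the sorted directories that files of `package_name` were loaded from."""
    directories = set()
    for line in trace_text.splitlines():
        match = LOADED_FROM.match(line.strip())
        if not match:
            continue
        module, path = match.groups()
        path = path.replace('\\', '/')
        # Keep the package's own modules, plus any file inside a directory
        # named after the package (covers extension modules like _ext.so).
        in_package_dir = '/%s/' % package_name in path
        if module.split('.')[0] == package_name or in_package_dir:
            directories.add(posixpath.dirname(path))
    return sorted(directories)
```

A result with a single directory suggests that only one copy of the package is in use; two or more would point to files being mixed from different installs.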

nahgnaw commented 8 years ago

I don't think I have multiple versions of bllipparser installed. Below is the run trace with python -v:

>>> from bllipparser import RerankingParser
import bllipparser # directory /usr/local/lib/python2.7/dist-packages/bllipparser
# /usr/local/lib/python2.7/dist-packages/bllipparser/__init__.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/__init__.py
import bllipparser # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/__init__.pyc
# /usr/local/lib/python2.7/dist-packages/bllipparser/RerankingParser.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/RerankingParser.py
import bllipparser.RerankingParser # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/RerankingParser.pyc
# /usr/local/lib/python2.7/dist-packages/bllipparser/CharniakParser.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/CharniakParser.py
import bllipparser.CharniakParser # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/CharniakParser.pyc
import imp # builtin
dlopen("/usr/local/lib/python2.7/dist-packages/bllipparser/_CharniakParser.so", 2);
import _CharniakParser # dynamically loaded from /usr/local/lib/python2.7/dist-packages/bllipparser/_CharniakParser.so
# /usr/local/lib/python2.7/dist-packages/bllipparser/JohnsonReranker.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/JohnsonReranker.py
import bllipparser.JohnsonReranker # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/JohnsonReranker.pyc
dlopen("/usr/local/lib/python2.7/dist-packages/bllipparser/_JohnsonReranker.so", 2);
import _JohnsonReranker # dynamically loaded from /usr/local/lib/python2.7/dist-packages/bllipparser/_JohnsonReranker.so
# /usr/local/lib/python2.7/dist-packages/bllipparser/Utility.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/Utility.py
import bllipparser.Utility # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/Utility.pyc
import math # builtin
>>> rrp = RerankingParser.fetch_and_load('WSJ-PTB3', verbose=True)
# /usr/local/lib/python2.7/dist-packages/bllipparser/ModelFetcher.pyc matches /usr/local/lib/python2.7/dist-packages/bllipparser/ModelFetcher.py
import bllipparser.ModelFetcher # precompiled from /usr/local/lib/python2.7/dist-packages/bllipparser/ModelFetcher.pyc
# /usr/lib/python2.7/__future__.pyc matches /usr/lib/python2.7/__future__.py
import __future__ # precompiled from /usr/lib/python2.7/__future__.pyc
# /usr/lib/python2.7/urlparse.pyc matches /usr/lib/python2.7/urlparse.py
import urlparse # precompiled from /usr/lib/python2.7/urlparse.pyc
# /usr/lib/python2.7/collections.pyc matches /usr/lib/python2.7/collections.py
import collections # precompiled from /usr/lib/python2.7/collections.pyc
import _collections # builtin
import operator # builtin
# /usr/lib/python2.7/keyword.pyc matches /usr/lib/python2.7/keyword.py
import keyword # precompiled from /usr/lib/python2.7/keyword.pyc
# /usr/lib/python2.7/heapq.pyc matches /usr/lib/python2.7/heapq.py
import heapq # precompiled from /usr/lib/python2.7/heapq.pyc
import itertools # builtin
import _heapq # builtin
import thread # builtin
# /usr/lib/python2.7/urllib.pyc matches /usr/lib/python2.7/urllib.py
import urllib # precompiled from /usr/lib/python2.7/urllib.pyc
# /usr/lib/python2.7/string.pyc matches /usr/lib/python2.7/string.py
import string # precompiled from /usr/lib/python2.7/string.pyc
import strop # builtin
# /usr/lib/python2.7/socket.pyc matches /usr/lib/python2.7/socket.py
import socket # precompiled from /usr/lib/python2.7/socket.pyc
import _socket # builtin
# /usr/lib/python2.7/functools.pyc matches /usr/lib/python2.7/functools.py
import functools # precompiled from /usr/lib/python2.7/functools.pyc
import _functools # builtin
import _ssl # builtin
import cStringIO # builtin
import time # builtin
# /usr/lib/python2.7/base64.pyc matches /usr/lib/python2.7/base64.py
import base64 # precompiled from /usr/lib/python2.7/base64.pyc
# /usr/lib/python2.7/struct.pyc matches /usr/lib/python2.7/struct.py
import struct # precompiled from /usr/lib/python2.7/struct.pyc
import _struct # builtin
import binascii # builtin
# /usr/lib/python2.7/ssl.pyc matches /usr/lib/python2.7/ssl.py
import ssl # precompiled from /usr/lib/python2.7/ssl.pyc
# /usr/lib/python2.7/textwrap.pyc matches /usr/lib/python2.7/textwrap.py
import textwrap # precompiled from /usr/lib/python2.7/textwrap.pyc
Model directory: /home/nahgnaw/.local/share/bllipparser/WSJ-PTB3
Model directory already exists, not reinstalling
>>> rrp.simple_parse("It's that easy.")
terminate called after throwing an instance of 'swig::stop_iteration'
Aborted

dmcc commented 8 years ago

Well, I'm afraid I still can't say I have a good understanding of what's going on here.

If you're up for more remote debugging, we could try to see how far it gets in the Python code before it crashes. If you install the hunter module from Python, the output from this might help (might be quite long):

import bllipparser, hunter
rrp = bllipparser.RerankingParser.fetch_and_load('WSJ-PTB3')
hunter.trace(module='bllipparser.RerankingParser')
rrp.simple_parse("It's that easy.")

By the way, have you tried other sentences and/or disabling the reranker?
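Because the abort kills the whole interpreter, trying several sentences in one session stops at the first crash. One option is to run each attempt in a child process and look at its exit status. The following is a rough sketch under that assumption: `run_isolated` and `parse_snippet` are hypothetical helpers, and the generated snippet uses only the bllipparser calls already shown in this thread.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and return its exit status.

    A native abort (such as an uncaught C++ exception) kills only the child,
    so the calling script survives and can move on to the next case.
    """
    return subprocess.call([sys.executable, '-c', snippet])


def parse_snippet(sentence):
    """Build a snippet that loads the model and parses one sentence."""
    return (
        "from bllipparser import RerankingParser\n"
        "rrp = RerankingParser.fetch_and_load('WSJ-PTB3')\n"
        "rrp.simple_parse(%r)\n" % sentence
    )
```

Calling `run_isolated(parse_snippet(s))` for each sentence `s` returns 0 when the parse finishes and a nonzero status otherwise (negative on POSIX when the child was killed by a signal), which shows whether the crash depends on the input.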

nahgnaw commented 8 years ago

@dmcc, I switched to another Ubuntu machine and got it working quite easily. I guess there must be something wrong with the swig on the old machine. Anyway, thanks a lot for your help. I think I can close this issue now.