dwadden / dygiepp

Span-based system for named entity, relation, and event extraction.
MIT License
573 stars, 120 forks

Importing dygiepp models with allennlp python interface #104

Closed by serenalotreck 2 years ago

serenalotreck commented 2 years ago

I'm interested in getting the vocabulary of the pre-trained models and manipulating it with allennlp. My plan was to use the from_archive method of the Model class to read in the dygiepp models, and then get the vocabulary via the vocab attribute.

However, when I try to load the model with from_archive, I get a ConfigurationError saying that dygie is not a registered name for Model. The error message suggests a fix for when allennlp is run as a command line utility (the --include-package flag), but doesn't say how to do the equivalent from the Python interface.

Wondering if there's a fix for this?

Thanks!

Here's the code and the error message:

from allennlp.models.model import Model
import dygie
scierc_model = Model.from_archive('https://s3-us-west-2.amazonaws.com/ai2-s2-research/dygiepp/master/scierc.tar.gz')

---------------------------------------------------------------------------
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/home/lotrecks/anaconda3/envs/dygiepp2/lib/python3.7/site-packages/allennlp/models/model.py", line 411, in from_archive
    model = load_archive(archive_file).model
  File "/mnt/home/lotrecks/anaconda3/envs/dygiepp2/lib/python3.7/site-packages/allennlp/models/archival.py", line 191, in load_archive
    cuda_device=cuda_device,
  File "/mnt/home/lotrecks/anaconda3/envs/dygiepp2/lib/python3.7/site-packages/allennlp/models/model.py", line 360, in load
    model_class: Type[Model] = cls.by_name(model_type)  # type: ignore
  File "/mnt/home/lotrecks/anaconda3/envs/dygiepp2/lib/python3.7/site-packages/allennlp/common/registrable.py", line 137, in by_name
    subclass, constructor = cls.resolve_class_name(name)
  File "/mnt/home/lotrecks/anaconda3/envs/dygiepp2/lib/python3.7/site-packages/allennlp/common/registrable.py", line 185, in resolve_class_name
    f"{name} is not a registered name for {cls.__name__}. "
allennlp.common.checks.ConfigurationError: dygie is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.

conda list output:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             4.5                       1_gnu  
alembic                   1.7.7                    pypi_0    pypi
allennlp                  1.1.0                    pypi_0    pypi
allennlp-models           1.1.0                    pypi_0    pypi
attrs                     21.4.0                   pypi_0    pypi
autopage                  0.5.0                    pypi_0    pypi
beautifulsoup4            4.8.1                    pypi_0    pypi
blis                      0.7.6                    pypi_0    pypi
boto3                     1.21.21                  pypi_0    pypi
botocore                  1.24.21                  pypi_0    pypi
ca-certificates           2022.2.1             h06a4308_0  
cached-property           1.5.2                    pypi_0    pypi
catalogue                 2.0.6                    pypi_0    pypi
certifi                   2021.10.8        py37h06a4308_2  
charset-normalizer        2.0.12                   pypi_0    pypi
click                     8.0.4                    pypi_0    pypi
cliff                     3.10.1                   pypi_0    pypi
cmaes                     0.8.2                    pypi_0    pypi
cmd2                      2.4.0                    pypi_0    pypi
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
colorlog                  6.6.0                    pypi_0    pypi
conllu                    4.1                      pypi_0    pypi
cymem                     2.0.6                    pypi_0    pypi
en-core-sci-sm            0.5.0                    pypi_0    pypi
filelock                  3.0.12                   pypi_0    pypi
ftfy                      6.1.1                    pypi_0    pypi
future                    0.18.2                   pypi_0    pypi
greenlet                  1.1.2                    pypi_0    pypi
h5py                      3.6.0                    pypi_0    pypi
idna                      3.3                      pypi_0    pypi
importlib-metadata        4.11.3                   pypi_0    pypi
importlib-resources       5.4.0                    pypi_0    pypi
iniconfig                 1.1.1                    pypi_0    pypi
jinja2                    3.0.3                    pypi_0    pypi
jmespath                  1.0.0                    pypi_0    pypi
joblib                    1.1.0                    pypi_0    pypi
jsonlines                 2.0.0              pyhd3eb1b0_2  
jsonnet                   0.18.0                   pypi_0    pypi
jsonpickle                2.1.0                    pypi_0    pypi
langcodes                 3.3.0                    pypi_0    pypi
ld_impl_linux-64          2.35.1               h7274673_9  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 9.3.0               h5101ec6_17  
libgomp                   9.3.0               h5101ec6_17  
libstdcxx-ng              9.3.0               hd4cf53a_17  
lxml                      4.8.0                    pypi_0    pypi
mako                      1.2.0                    pypi_0    pypi
markupsafe                2.1.1                    pypi_0    pypi
murmurhash                1.0.6                    pypi_0    pypi
ncurses                   6.3                  h7f8727e_2  
nltk                      3.7                      pypi_0    pypi
numpy                     1.21.5                   pypi_0    pypi
openssl                   1.1.1m               h7f8727e_0  
optuna                    2.10.0                   pypi_0    pypi
overrides                 3.1.0                    pypi_0    pypi
packaging                 21.3                     pypi_0    pypi
pandas                    0.25.2                   pypi_0    pypi
pathy                     0.6.1                    pypi_0    pypi
pbr                       5.8.1                    pypi_0    pypi
pip                       21.2.2           py37h06a4308_0  
plac                      1.1.3                    pypi_0    pypi
pluggy                    1.0.0                    pypi_0    pypi
preshed                   3.0.6                    pypi_0    pypi
prettytable               3.2.0                    pypi_0    pypi
protobuf                  3.19.4                   pypi_0    pypi
py                        1.11.0                   pypi_0    pypi
py-rouge                  1.1                      pypi_0    pypi
pydantic                  1.8.2                    pypi_0    pypi
pyparsing                 3.0.7                    pypi_0    pypi
pyperclip                 1.8.2                    pypi_0    pypi
pytest                    7.1.0                    pypi_0    pypi
python                    3.7.11               h12debd9_0  
python-dateutil           2.8.2                    pypi_0    pypi
python-levenshtein        0.12.2                   pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytz                      2021.3                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.1.2                h7f8727e_1  
regex                     2022.3.15                pypi_0    pypi
requests                  2.27.1                   pypi_0    pypi
s3transfer                0.5.2                    pypi_0    pypi
sacremoses                0.0.49                   pypi_0    pypi
scikit-learn              1.0.2                    pypi_0    pypi
scipy                     1.7.3                    pypi_0    pypi
sentencepiece             0.1.96                   pypi_0    pypi
setuptools                58.0.4           py37h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
smart-open                5.2.1                    pypi_0    pypi
soupsieve                 2.3.1                    pypi_0    pypi
spacy                     3.2.3                    pypi_0    pypi
spacy-legacy              3.0.9                    pypi_0    pypi
spacy-loggers             1.0.1                    pypi_0    pypi
sqlalchemy                1.4.32                   pypi_0    pypi
sqlite                    3.38.0               hc218d9a_0  
srsly                     2.4.2                    pypi_0    pypi
stevedore                 3.5.0                    pypi_0    pypi
tensorboardx              2.5                      pypi_0    pypi
thinc                     8.0.15                   pypi_0    pypi
threadpoolctl             3.1.0                    pypi_0    pypi
tk                        8.6.11               h1ccaba5_0  
tokenizers                0.8.1rc1                 pypi_0    pypi
tomli                     2.0.1                    pypi_0    pypi
torch                     1.6.0                    pypi_0    pypi
tqdm                      4.63.0             pyhd8ed1ab_0    conda-forge
transformers              3.0.2                    pypi_0    pypi
typer                     0.4.0                    pypi_0    pypi
typing-extensions         3.10.0.2                 pypi_0    pypi
urllib3                   1.26.9                   pypi_0    pypi
wasabi                    0.9.0                    pypi_0    pypi
wcwidth                   0.2.5                    pypi_0    pypi
wheel                     0.37.1             pyhd3eb1b0_0  
word2number               1.1                      pypi_0    pypi
xz                        5.2.5                h7b6447c_0  
zipp                      3.7.0                    pypi_0    pypi
zlib                      1.2.11               h7f8727e_4  
dwadden commented 2 years ago

Hi Serena,

Unfortunately, I don't know how to do this. If you're willing to dig into the AllenNLP source code, maybe you can find out what the --include-package command line flag does under the hood, and replicate that behavior in your Python script? Sorry I can't help more.

Dave
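
For anyone landing here with the same error, here is a minimal sketch of one way to replicate the flag from Python. It assumes (not confirmed in this thread) that --include-package works by importing the named package and all of its submodules, so that the decorators registering names such as dygie with Model get a chance to run. A bare `import dygie` would then not be enough if the registration lives in a submodule. The helper names import_package_recursively and load_dygiepp_vocabulary are made up for illustration, and the sketch has not been run against the dygiepp archives. AllenNLP 1.x also appears to ship its own helper for this, import_module_and_submodules in allennlp.common.util, which is worth checking against your installed version.

```python
import importlib
import pkgutil


def import_package_recursively(package_name: str) -> list:
    """Import a package and every submodule beneath it.

    Returns the list of module names that were imported. Importing each
    module is what lets registration decorators inside it execute.
    """
    module = importlib.import_module(package_name)
    imported = [package_name]
    # Plain modules have no __path__; only packages can contain submodules.
    search_path = getattr(module, "__path__", [])
    for _, name, _ in pkgutil.walk_packages(search_path, prefix=package_name + "."):
        importlib.import_module(name)
        imported.append(name)
    return imported


def load_dygiepp_vocabulary(archive_path: str):
    """Hypothetical helper: import dygie fully, then load the archive.

    Assumes the dygie package is importable (for example, the script is
    run from the root of the dygiepp repository) and that allennlp is
    installed. Both imports happen lazily inside this function.
    """
    import_package_recursively("dygie")
    from allennlp.models.model import Model

    model = Model.from_archive(archive_path)
    return model.vocab
```

With something like this in place, calling load_dygiepp_vocabulary with the scierc.tar.gz URL from the original post should, under the assumptions above, return the model's vocabulary instead of raising the ConfigurationError.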

serenalotreck commented 2 years ago

No worries, just wanted to make sure there wasn't a quick answer I was missing!

I'll close for now and update when I figure out how to do it.

Thanks!