dwadden / dygiepp

Span-based system for named entity, relation, and event extraction.
MIT License

FileNotFoundError: file /var/folders/15/gkjwslh904x25gqrgyy07ndw0000gn/T/tmp85q5ara4/vocabulary not found #122

Open alex1xu opened 11 months ago

alex1xu commented 11 months ago

Hi, sorry if this has already been addressed elsewhere, but I didn't see anything related.

I am trying to use the joint entity and relation extraction DYGIE++ model released with the following paper: https://physionet.org/content/radgraph/1.0.0/ Their model.tar.gz file and a folder of schema vocabulary (short text files with one entity/relation class per line) are available for download.

My goal is to use the model to predict entities and relations on new data. I ran the following command after following the steps outlined in the dependencies section:

allennlp predict some_path/model.tar.gz \
    some_path/input_data \
    --predictor dygie \
    --include-package dygie \
    --use-dataset-reader \
    --output-file some_path/output.json

Here input_data is a folder containing a bunch of text files, and output.json is an empty JSON file. When I execute the command, I get the following output:

INFO - allennlp.common.plugins - Plugin allennlp_models available
INFO - allennlp.models.archival - loading archive file model_checkpoint/model.tar.gz
INFO - allennlp.models.archival - extracting archive file model_checkpoint/model.tar.gz to temp dir /var/folders/15/gkjwslh904x25gqrgyy07ndw0000gn/T/tmp85q5ara4
INFO - allennlp.common.params - type = from_instances
INFO - allennlp.data.vocabulary - Loading token dictionary from /var/folders/15/gkjwslh904x25gqrgyy07ndw0000gn/T/tmp85q5ara4/vocabulary.
Traceback (most recent call last):
  File "/Users/ax/opt/anaconda3/envs/new_env/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/__main__.py", line 34, in run
    main(prog="allennlp")
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 92, in main
    args.func(args)
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/commands/predict.py", line 211, in _predict
    predictor = _get_predictor(args)
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/commands/predict.py", line 110, in _get_predictor
    overrides=args.overrides,
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/models/archival.py", line 191, in load_archive
    cuda_device=cuda_device,
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/models/model.py", line 367, in load
    return model_class._load(config, serialization_dir, weights_file, cuda_device)
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/models/model.py", line 285, in _load
    vocab_dir, vocab_params.get("padding_token"), vocab_params.get("oov_token")
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/data/vocabulary.py", line 328, in from_files
    base_directory = cached_path(directory, extract_archive=True)
  File "/Users/ax/opt/anaconda3/envs/new_env/lib/python3.6/site-packages/allennlp/common/file_utils.py", line 175, in cached_path
    raise FileNotFoundError(f"file {url_or_filename} not found")
FileNotFoundError: file /var/folders/15/gkjwslh904x25gqrgyy07ndw0000gn/T/tmp85q5ara4/vocabulary not found
INFO - allennlp.models.archival - removing temporary unarchived model dir at /var/folders/15/gkjwslh904x25gqrgyy07ndw0000gn/T/tmp85q5ara4

Could someone please explain this error? Thank you so much!

dwadden commented 11 months ago

Hmm, I haven't seen this before. Can you check whether running one of the models in this repo (as opposed to a model associated with a different publication) causes a similar error? If you only get the error when running the radgraph model, I think the next thing to do is check in with the authors of that paper.
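
A further check that may help narrow this down: the log shows model.tar.gz being extracted to a temp dir and the loader then looking for a vocabulary entry inside it, so it could be worth listing what the archive actually contains. Below is a minimal sketch using only Python's standard tarfile module. The helper names has_vocabulary and archive_has_vocabulary are hypothetical (they are not part of AllenNLP or DyGIE++), and it assumes the vocabulary is expected at the top level of the archive, as the path in the error message suggests.

```python
import tarfile


def has_vocabulary(member_names):
    """Return True if any member is, or lives under, a top-level 'vocabulary' entry."""
    for name in member_names:
        # Archives built from "." often prefix every member with "./"; strip it.
        if name.startswith("./"):
            name = name[2:]
        if name == "vocabulary" or name.startswith("vocabulary/"):
            return True
    return False


def archive_has_vocabulary(archive_path):
    """Open a (possibly compressed) tar archive and check it for a vocabulary entry."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, ...) on its own.
    with tarfile.open(archive_path, "r:*") as tar:
        return has_vocabulary(tar.getnames())
```

Calling archive_has_vocabulary("some_path/model.tar.gz") and getting False would suggest that the archive itself ships without a vocabulary entry, which would match the FileNotFoundError above. What to do in that case is not something this thread confirms.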