Riccorl / transformer-srl

Reimplementation of a BERT-based model (Shi et al., 2019), currently the state of the art for English SRL. This model also implements predicate disambiguation.

Is there a pre-trained model that one can run? #2

Closed. logicReasoner closed this issue 4 years ago.

logicReasoner commented 4 years ago

@Riccorl Thanks for your continued effort on SRL on top of BERT! I've managed to run the SRL model described at https://demo.allennlp.org/semantic-role-labeling. How can one run your model locally? Can it run on CPU only?

Riccorl commented 4 years ago

Hi!

Unfortunately, I don't have a pretrained model with the PropBank inventory. You have to train it :(

However, training should be easy. It can run on a CPU, yes, but it's really slow. You can use Colab to train it with a GPU for free. To train it, you can clone this repo and run

export SRL_TRAIN_DATA_PATH="path/to/train"
export SRL_VALIDATION_DATA_PATH="path/to/development"
allennlp train training_config/bert_base_span.jsonnet -s path/to/model --include-package transformer_srl

where training_config/bert_base_span.jsonnet is the config file that I usually use.
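
If you prefer to drive it from Python (e.g. a Colab cell), a rough equivalent is something like this (a sketch, assuming AllenNLP 1.x, where train_model_from_file is available):

# Rough sketch of the same run from Python; assumes AllenNLP ~1.x.
# Importing transformer_srl registers the custom classes, which stands in
# for --include-package on the CLI.
import os
from allennlp.commands.train import train_model_from_file
from transformer_srl import dataset_readers, models, predictors

os.environ["SRL_TRAIN_DATA_PATH"] = "path/to/train"
os.environ["SRL_VALIDATION_DATA_PATH"] = "path/to/development"
train_model_from_file("training_config/bert_base_span.jsonnet", "path/to/model")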

logicReasoner commented 4 years ago

@Riccorl Thanks for your quick reply! I see. Can you provide some sample input/output pairs generated by this project, so that I can see if the format is suitable for my needs?

BTW, where can I download the PropBank inventory that you mentioned?

Riccorl commented 4 years ago

The output is a dictionary that contains the following keys: verb, description, tags, and frame.

For instance, the sentence

The keys, which were needed to access the building, were locked in the car.

produces the following output:

"verb": needed # the predicate token
"description":   [ARG1: The keys] , [R-ARG1: which] were [V: needed] [ARGM-PRP: to access the building] , were locked in the car . # the sentence with the predicate and args annotaded, 
"tags": [B-ARG1, I-ARG1, O, O ...] # list of args,
"frame": need.01 # the predicate label

This is the piece of code that produces the output.
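
For example, here is a minimal sketch of consuming that output (assuming the usual AllenNLP predictor structure with a top-level "verbs" list; key names may vary between versions):

# Minimal consumption sketch; assumes the standard AllenNLP SRL output
# layout (a top-level "verbs" list). Key names may vary between versions.
from transformer_srl import dataset_readers, models, predictors  # registers the custom classes
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("path/to/model.tar.gz")
result = predictor.predict(
  sentence="The keys, which were needed to access the building, were locked in the car."
)
for verb in result["verbs"]:
    print(verb["verb"], verb["frame"])  # predicate token and its PropBank frame
    print(verb["description"])          # sentence with bracketed arguments
    print(verb["tags"])                 # one BIO tag per token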

> BTW, where can I download the PropBank inventory that you mentioned?

You need the CoNLL-2012 dataset, which is not free, so I cannot distribute it. You can find more information here: https://conll.cemantix.org/2012/data.html

logicReasoner commented 4 years ago

Obtaining CoNLL-2012 is unfortunately outside my financial capabilities... ☹️

BTW, if you train a model on Colab for free using CoNLL-2012, you can then distribute the trained model without violating the CoNLL license. This is how AllenNLP does it.


Riccorl commented 4 years ago

Yeah, I will try to train it asap and publish it :)

logicReasoner commented 4 years ago

@Riccorl you're awesome! :)

Riccorl commented 4 years ago

I uploaded a model here. It's based on BERT base (not large); the F1 scores are 86 and 95.5 for argument identification and predicate disambiguation, respectively (on the dev set).

logicReasoner commented 4 years ago

@Riccorl so I eagerly tried your model srl_bert_base_conll2012.tar.gz as a drop-in replacement for AllenNLP's bert-base-srl-2020.03.24.tar.gz:

from allennlp.predictors.predictor import Predictor
predictor = Predictor.from_path("/home/user/srl_bert_base_conll2012.tar.gz")
predictor.predict(
  sentence="Did Uriah honestly think he could beat the game in under three hours?"
)

but I got the following error while loading it as a predictor in Python 3.6:

I0917 06:58:13.354965 139747665430336 archival.py:164] loading archive file ~/srl_bert_base_conll2012.tar.gz from cache at /home/user/srl_bert_base_conll2012.tar.gz
2020-09-17 06:58:13 INFO     allennlp.models.archival  - loading archive file ~/srl_bert_base_conll2012.tar.gz from cache at /home/user/srl_bert_base_conll2012.tar.gz
I0917 06:58:13.356019 139747665430336 archival.py:171] extracting archive file /home/user/srl_bert_base_conll2012.tar.gz to temp dir /tmp/tmpcmoyc42s
2020-09-17 06:58:13 INFO     allennlp.models.archival  - extracting archive file /home/user/srl_bert_base_conll2012.tar.gz to temp dir /tmp/tmpcmoyc42s
Traceback (most recent call last):
  File "/home/user/.local/bin/project", line 10, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.6/site-packages/project/__main__.py", line 76, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/user/.local/lib/python3.6/site-packages/project/cli/run.py", line 88, in run
    project.run(**vars(args))
  File "/home/user/.local/lib/python3.6/site-packages/project/run.py", line 33, in run
    import project.core.run
  File "/home/user/.local/lib/python3.6/site-packages/project/core/run.py", line 22, in <module>
    from project.server import add_root_route
  File "/home/user/.local/lib/python3.6/site-packages/project/server.py", line 70, in <module>
    srlPredictor = Predictor.from_path("~/srl_bert_base_conll2012.tar.gz")
  File "/home/user/.local/lib/python3.6/site-packages/allennlp/predictors/predictor.py", line 275, in from_path
    load_archive(archive_path, cuda_device=cuda_device),
  File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/archival.py", line 197, in load_archive
    opt_level=opt_level,
  File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/model.py", line 391, in load
    model_class: Type[Model] = cls.by_name(model_type)  # type: ignore
  File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 137, in by_name
    subclass, constructor = cls.resolve_class_name(name)
  File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 185, in resolve_class_name
    f"{name} is not a registered name for {cls.__name__}. "
allennlp.common.checks.ConfigurationError: transformer_srl_span is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.
I0917 06:58:16.335596 139747665430336 archival.py:205] removing temporary unarchived model dir at /tmp/tmpcmoyc42s
2020-09-17 06:58:16 INFO     allennlp.models.archival  - removing temporary unarchived model dir at /tmp/tmpcmoyc42s

It has something to do with transformer_srl_span not being a registered name for Model, so I guess I am missing some specific configuration setting?

Riccorl commented 4 years ago

You should import models, dataset_readers, and predictors from transformer_srl even if you don't explicitly use them. It's equivalent to adding --include-package from the CLI.
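
For example, something like this should work (using the archive path from your snippet above):

from transformer_srl import dataset_readers, models, predictors  # registers the custom classes with AllenNLP
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("/home/user/srl_bert_base_conll2012.tar.gz")
predictor.predict(
  sentence="Did Uriah honestly think he could beat the game in under three hours?"
)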

logicReasoner commented 4 years ago

Thanks to your tip, I've managed to resolve that particular error. Now Python cannot find the bert-base-cased model, due to a firewall configuration that does not allow remote connections. I've manually downloaded bert-base-cased-pytorch_model.bin and bert-base-cased-pytorch_model.json and managed to run the model. There are some warnings about missing bert-base-cased-related files:

Didn't find file /home/user/bert-base-cased/added_tokens.json. We won't load it.
Didn't find file /home/user/bert-base-cased/special_tokens_map.json. We won't load it.
Didn't find file /home/user/bert-base-cased/tokenizer_config.json. We won't load it.

Are those needed?

Riccorl commented 4 years ago

I don't know for sure, because the model uses the default configs from Hugging Face. I guess it doesn't matter (?), since they are only warnings.
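
If you want to get rid of them, one option (my guess, using the standard Hugging Face transformers API rather than anything specific to this repo) is to generate the missing tokenizer files once on a machine with internet access and copy them behind the firewall:

# Sketch, assuming the standard transformers API: save_pretrained writes
# tokenizer_config.json, special_tokens_map.json, vocab.txt, etc. (and
# added_tokens.json only if there are added tokens), which you can then
# copy to the offline machine.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # needs internet once
tokenizer.save_pretrained("/home/user/bert-base-cased")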

logicReasoner commented 4 years ago

@Riccorl After having played with the model for a bit, it seems to do a really decent job. So what are the next steps in raising the accuracy bar even higher? Would using bert-large-cased make a significant difference, or perhaps switching to another one of the HuggingFace models?

Thanks and keep up the good work!

Riccorl commented 4 years ago

bert-large-cased can indeed improve the results, but the improvements are marginal (as you can see in the paper here, on page 5). I guess that switching models has a better chance of improving the results. As of now, only models that accept 1 as a token_type_id work; I tried to generalize further, but it didn't work. I hope I can make it work soon, because I really need it 🤣
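
To illustrate what I mean (a sketch of the general idea, not this repo's actual code): the predicate is flagged through the segment embeddings, so the encoder has to accept a token_type_id of 1:

# Illustrative sketch only: AllenNLP-style BERT SRL marks the predicate by
# passing a 0/1 indicator as token_type_ids, so the underlying transformer
# must support a token_type_id of 1. Assumes a recent transformers version
# with a fast tokenizer (for word_ids()).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
words = ["Did", "Uriah", "honestly", "think", "he", "could", "beat", "the", "game", "?"]
enc = tokenizer(words, is_split_into_words=True)

# Segment id 1 on the predicate's wordpieces, 0 everywhere else
# (special tokens get word_id None, so they stay 0).
predicate_index = words.index("think")
token_type_ids = [1 if wid == predicate_index else 0 for wid in enc.word_ids()]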

logicReasoner commented 4 years ago

Don't worry! You'll make it work eventually! We have faith in you ;)

OanaIgnat commented 3 years ago

Hello, I tried the ideas from above and got the following errors. Do you know why they might be and how to solve them? Thank you so much!

from transformer_srl import dataset_readers, models, predictors
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("data/srl_bert_base_conll2012.tar.gz")
predictor.predict(
  sentence="Did Uriah honestly think he could beat the game in under three hours?"
)

File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 208, in load_archive model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device) File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 246, in _load_model cuda_device=cuda_device, File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 406, in load return model_class._load(config, serialization_dir, weights_file, cuda_device) File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 326, in _load missing_keys, unexpected_keys = model.load_state_dict(model_state, strict=False) File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict self.class.name, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for TransformerSrlSpan: size mismatch for frame_projection_layer.weight: copying a param with shape torch.Size([5497, 768]) from checkpoint, the shape in current model is torch.Size([5929, 768]). size mismatch for frame_projection_layer.bias: copying a param with shape torch.Size([5497]) from checkpoint, the shape in current model is torch.Size([5929]).

Riccorl commented 3 years ago

Yeah, there is a piece of code in 2.4 that breaks that model. If you try pip install transformer-srl==2.3.1, it should work. Let me know!

Riccorl commented 3 years ago

@OanaIgnat I uploaded a new version of the pretrained model, compatible with 2.4.4. It should fix your problem. Let me know!