studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

Convert .th model #152

Closed · luffycodes closed this 2 years ago

luffycodes commented 2 years ago

I ran the following command from the NER training script (link):

allennlp train examples/ner/configs/transformers_luke_with_entity_aware_attention.jsonnet -s results/ner/luke-large --include-package examples -o '{"trainer.cuda_device": 0, "trainer.use_amp": true}'

However, the model weights get saved in the file weights.th.

How do we convert it to a format that can be loaded with the Hugging Face pipeline?

Thanks,

ryokan0123 commented 2 years ago

Since I think a lot of people would want to use trained models with the Hugging Face pipeline, I have added conversion scripts for NER and Relation Classification. Pull the latest master branch and try the following command:

python examples/relation_classification/convert_allennlp_to_huggingface_model.py results/ner/luke-large SAVE-DIR
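
For reference, a minimal sketch of how the converted NER checkpoint could then be loaded with the transformers library. This assumes the script writes a standard LUKE entity-span-classification checkpoint into SAVE-DIR; the classes below are the stock transformers LUKE classes, not code from this repository:

```python
# Minimal sketch, assuming SAVE-DIR now contains a normal Hugging Face model
# directory (config, weights, tokenizer files) for entity span classification.
from transformers import LukeTokenizer, LukeForEntitySpanClassification

save_dir = "SAVE-DIR"  # output directory of the conversion script above
tokenizer = LukeTokenizer.from_pretrained(save_dir)
model = LukeForEntitySpanClassification.from_pretrained(save_dir)

text = "Beyoncé lives in Los Angeles."
# Candidate entity spans given as (start, end) character offsets into `text`.
entity_spans = [(0, 7), (17, 28)]
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
logits = model(**inputs).logits              # shape: (1, num_spans, num_labels)
predictions = logits.argmax(-1)[0].tolist()
print([model.config.id2label[p] for p in predictions])
```

If the converted model comes from the Relation Classification example instead, the corresponding transformers class would be LukeForEntityPairClassification.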

If you find any issues with this, please let me know!

luffycodes commented 2 years ago

Thanks a lot, it works for the LUKE models.

But I get the following error if my starting model is distilroberta-base:

ValueError: Only models that use TransformersLukeEmbedder (registered as transformers-luke) can be converted

Command executed (from link):

# you can also fine-tune models from the BERT family
export TRANSFORMERS_MODEL_NAME="distilroberta-base"; allennlp train examples/ner/configs/transformers.jsonnet -s results/ner/roberta-base --include-package examples

ryokan0123 commented 2 years ago

Ah, sorry, the current script does not support converting models other than LUKE, and supporting them would require a non-trivial amount of work.

The problem is that different pretrained models in HuggingFace usually require different downstream models even for a single task (e.g., BertForTokenClassification, RobertaForTokenClassification). So if we want to use distilroberta-base for relation classification, we need to implement something like RoBertaForTokenPairClassification in the HuggingFace style.

I think there are two options.

  1. Use the allennlp pipeline for trained models.
  2. Implement RoBertaForTokenPairClassification and convert the trained model to it (see the rough sketch below).
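
To make option 2 concrete, here is a rough, hypothetical sketch of a RoBERTa-style token-pair classification head written in the Hugging Face fashion. The class name, the constructor arguments, and the choice of the first subword token of each span as its representation are all assumptions for illustration, not code from this repository or from transformers; mapping the AllenNLP-trained weights onto such a module would still require a hand-written state-dict conversion.

```python
# Hypothetical sketch of option 2, under the assumptions stated above.
import torch
from torch import nn
from transformers import RobertaModel

class RobertaForTokenPairClassification(nn.Module):
    """Classify the relation between two token spans by concatenating the
    encoder hidden states at the two span start positions."""

    def __init__(self, model_name="distilroberta-base", num_labels=2, dropout=0.1):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained(model_name)
        hidden_size = self.roberta.config.hidden_size
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, input_ids, attention_mask, head_start, tail_start, labels=None):
        # hidden: (batch, seq_len, hidden_size)
        hidden = self.roberta(input_ids, attention_mask=attention_mask).last_hidden_state
        batch_idx = torch.arange(hidden.size(0), device=hidden.device)
        # Represent each entity by the hidden state of its first subword token.
        head_repr = hidden[batch_idx, head_start]   # (batch, hidden_size)
        tail_repr = hidden[batch_idx, tail_start]   # (batch, hidden_size)
        pair = self.dropout(torch.cat([head_repr, tail_repr], dim=-1))
        logits = self.classifier(pair)
        loss = nn.functional.cross_entropy(logits, labels) if labels is not None else None
        return loss, logits
```
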
luffycodes commented 2 years ago

Okay, thanks for the detailed solution. It helps a lot 👍