allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

BERT model I trained doesn't load #3129

Closed TheArowanaDude closed 4 years ago

TheArowanaDude commented 5 years ago

Hi, I successfully trained my own BERT model, but when I tried loading the model via the Python interface I got this error:

Model name (my directory) was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed (my directory) was a path or url but couldn't find any file associated to this path or url.

I managed to train the model successfully; I just don't understand why it's failing to load now. I would greatly appreciate any guidance and help on this!

kernelmachine commented 5 years ago

Just to clarify, you'd like to use a fine-tuned BERT model (fine-tuned via huggingface's pytorch-transformers) in allennlp's BERT token embedder? If this is the case, providing the directory where the BERT model was serialized as the model name should work. Can you provide the snippet of your BERT token embedder/indexer configuration here?
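As a quick sanity check (just a sketch; the directory below is a placeholder), a model serialized with pytorch-transformers should load directly from that directory:

```python
# Sketch with a placeholder path: verify the fine-tuned directory loads with
# pytorch-transformers before pointing allennlp's BERT embedder at it.
from pytorch_transformers import BertModel, BertTokenizer

model_dir = "/absolute/path/to/finetuned_bert"  # placeholder: your serialization directory

model = BertModel.from_pretrained(model_dir)          # needs config.json + pytorch_model.bin
tokenizer = BertTokenizer.from_pretrained(model_dir)  # needs vocab.txt
```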

TheArowanaDude commented 5 years ago

> Just to clarify, you'd like to use a fine-tuned BERT model (fine-tuned via huggingface's pytorch-transformers) in allennlp's BERT token embedder? If this is the case, providing the directory where the BERT model was serialized as the model name should work. Can you provide the snippet of your BERT token embedder/indexer configuration here?

Yes! I followed this template https://gist.github.com/joelgrus/7cdb8fb2d81483a8d9ca121d9c617514

```json
"token_indexers": {
    "bert": {
        "type": "bert-pretrained",
        "pretrained_model": "bert-large-cased-vocab.txt",
        "do_lowercase": false,
        "use_starting_offsets": true
    }
},

"token_embedders": {
    "bert": {
        "type": "bert-pretrained",
        "pretrained_model": "wwm_cased_L-24_H-1024_A-16"
    }
},
```

kernelmachine commented 5 years ago

The values to both "pretrained_model" keys should be the absolute path to the serialization directory of the fine-tuned BERT model (or one of the default BERT model names).
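Roughly, those two config entries correspond to the `PretrainedBertIndexer` and `PretrainedBertEmbedder` classes (a sketch assuming an allennlp 0.8/0.9-style setup; the directory path is a placeholder):

```python
# Sketch (placeholder path): Python counterparts of the two "bert-pretrained"
# config entries, both pointed at the fine-tuned model's serialization directory.
from allennlp.data.token_indexers import PretrainedBertIndexer
from allennlp.modules.token_embedders import PretrainedBertEmbedder

model_dir = "/absolute/path/to/finetuned_bert"  # placeholder

# "token_indexers" -> "bert" in the config
indexer = PretrainedBertIndexer(
    pretrained_model=model_dir,
    do_lowercase=False,
    use_starting_offsets=True,
)

# "token_embedders" -> "bert" in the config
embedder = PretrainedBertEmbedder(pretrained_model=model_dir)
```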

TheArowanaDude commented 5 years ago

> The values to both "pretrained_model" keys should be the absolute path to the serialization directory of the fine-tuned BERT model (or one of the default BERT model names).

Ah okay, should I edit the config file in the serialized model? I tried to unzip it, modify the config.json, and re-zip it, but that gave me another error:

FileNotFoundError: file /tmp/tmpj8o6jszw/config.json not found

kernelmachine commented 5 years ago

What do the contents of your serialization directory look like? It should contain unzipped config.json, vocab.txt, and pytorch_model.bin files; you shouldn't need to edit the config file.
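For example, a check along these lines (the path is a placeholder) should report all three files as present:

```python
# Sketch with a placeholder path: list the files the BERT embedder/indexer expects.
import os

model_dir = "/absolute/path/to/finetuned_bert"  # placeholder
for name in ("config.json", "vocab.txt", "pytorch_model.bin"):
    status = "found" if os.path.exists(os.path.join(model_dir, name)) else "MISSING"
    print(name, status)
```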

TheArowanaDude commented 5 years ago

> What do the contents of your serialization directory look like? It should contain unzipped config.json, vocab.txt, and pytorch_model.bin files; you shouldn't need to edit the config file.

It contains all those files: bert_config.json, vocab.txt, and pytorch_model.bin. This confuses me, since I was able to train successfully.

DeNeutoy commented 5 years ago

Please provide a full stack trace of the error you are getting (as requested in the issue template).

Nicozwy commented 5 years ago

I assume it was a path or URL problem. You can unzip the archive and edit the configuration file to make sure the path is correct. That worked for me.
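Something along these lines (paths are placeholders) lets you inspect the archived config.json and spot stale paths before re-packaging:

```python
# Sketch with placeholder paths: unpack a model.tar.gz and inspect its config.json
# so any stale "pretrained_model" paths can be corrected before re-archiving.
import json
import tarfile

with tarfile.open("path/to/model.tar.gz", "r:gz") as archive:
    archive.extractall("path/to/unpacked")

with open("path/to/unpacked/config.json") as f:
    config = json.load(f)

print(json.dumps(config, indent=2))  # fix any wrong paths, then re-create the archive
```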

adityathiru commented 5 years ago

I have a solution that worked for me. Unzipping the archive, editing the config file, and re-zipping it with tar -zcvf failed for me too.

Instead of zipping using tar -zcvf, do the following:

```python
from allennlp.models.archival import archive_model

archive_model(
    serialization_dir='path/to/model_tar_dir',
    archive_path='path/to/model.tar.gz',
    weights='name_of_your_weights_file.th',
)
```

This helps load the archive without any errors like FileNotFoundError: file /tmp/tmpj8o6jszw/config.json not found
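A quick way to confirm the rebuilt archive is readable (sketch; the path is a placeholder):

```python
# Sketch with a placeholder path: confirm the re-created archive loads cleanly.
from allennlp.models.archival import load_archive

archive = load_archive("path/to/model.tar.gz")
print(type(archive.model))
```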

Hope this helps!

DeNeutoy commented 4 years ago

Closing due to inactivity

lengocloi1805 commented 2 years ago

Sorry for commenting on a closed topic, but I'm having trouble interpreting my model. I have a model trained for an NER task with "xlm-roberta-base", and I have the weights file model.pt. I want to interpret my model using AllenNLP. The guide shows the example below, but I have no idea how to create the archive for my own model.

```python
from allennlp.interpret.saliency_interpreters import SimpleGradient
from allennlp.predictors import Predictor

inputs = {"sentence": "a very well-made, funny and entertaining picture."}
archive = (
    "https://storage.googleapis.com/allennlp-public-models/basic_stanford_sentiment_treebank-2020.06.09.tar.gz"
)
predictor = Predictor.from_path(archive)
interpreter = SimpleGradient(predictor)
interpretation = interpreter.saliency_interpret_from_json(inputs)
print(interpretation)
```

Looking forward to everyone's help, thanks in advance!