sabhi27 opened this issue 2 years ago
Hi, I think this problem is probably because t5-question-generator.pt is just the saved model weights. Loading a transformer model with from_pretrained() also requires several other files, such as config.json, tokenizer_config.json, and special_tokens_map.json. I should probably update the training script to use model.save_pretrained() instead of torch.save(); save_pretrained() generates all the files needed to reload the model with T5ForConditionalGeneration.from_pretrained().
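For reference, here's a minimal sketch of what the updated save step could look like inside the training script (the save_dir value is just an example, and model/tokenizer are whatever objects the script is training with):

# at the end of training, instead of torch.save(...)
save_dir = "./t5-base-question-generator"
model.save_pretrained(save_dir)      # writes config.json and the model weights
tokenizer.save_pretrained(save_dir)  # writes tokenizer_config.json, special_tokens_map.json, etc.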
In the meantime, you could try loading the base model from the hub and applying your saved weights on top:
import torch
from transformers import T5ForConditionalGeneration

# load the pretrained checkpoint so the architecture matches, then apply your weights
model = T5ForConditionalGeneration.from_pretrained("iarfmoose/t5-base-question-generator")
state_dict = torch.load("t5-question-generator.pt")
model.load_state_dict(state_dict)
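Note that the .pt file only contains weights, so if you also need the tokenizer you'd still load it separately, e.g. from the hub checkpoint (assuming the base checkpoint's tokenizer is what you trained with):

from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("iarfmoose/t5-base-question-generator")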
Thanks for your response @AMontgomerie. Could you please push the required changes to smooth out the flow?
Thank you for raising the issue @sabhi27.
When I load my trained weights using the code below, I get this error:

import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("iarfmoose/t5-base-question-generator")
state_dict = torch.load("/content/drive/MyDrive/question_BIOasq/new_type/t5-question-generator.pt")
model.load_state_dict(state_dict)
RuntimeError: Error(s) in loading state_dict for T5ForConditionalGeneration: Missing key(s) in state_dict: "shared.weight", "encoder.embed_tokens.weight", "encoder.block.0.layer.0.SelfAttention.q.weight", "encoder.block.0.layer.0.SelfAttention.k.weight", "encoder.block.0.layer.0.SelfAttention.v.weight", "encoder.block.0.layer.0.SelfAttention.o.weight", "encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight", "encoder.block.0.layer.0.layer_norm.weight", "encoder.block.0.layer.1.DenseReluDense.wi.weight", "encoder.block.0.layer.1.DenseReluDense.wo.weight", "encoder.block.0.layer.1.layer_norm.weight", "encoder.block.1.layer.0.SelfAttention.q.weight", "encoder.block.1.layer.0.SelfAttention.k.weight", "encoder.block.1.layer.0.SelfAttention.v.weight", "encoder.block.1.layer.0.SelfAttention.o.weight", "encoder.block.1.layer.0.layer_norm.weight", "encoder.block.1.layer.1.DenseReluDense.wi.weight", "encoder.block.1.layer.1.DenseReluDense.wo.weight", "encoder.block.1.layer.1.layer_norm.weight", "encoder.block.2.layer.0.SelfAttention.q.weight", "encoder.block.2.layer.0.SelfAttention.k.weight", "encoder.block.2.layer.0.SelfAttention.v.weight", "encoder.block.2.layer.0.SelfAttention.o.weight", "encoder.block.2.layer.0.layer_norm.weight", "encoder.block.2.layer.1.DenseReluDense.wi.weight", "encoder.block.2.layer.1.DenseReluDense.wo.weight", "encoder.block.2.layer.1.layer_norm.weight", "encoder.block.3.layer.0.SelfAttention.q.weight", "encoder.block.3.layer.0.SelfAttention.k.weight", "encoder.block.3.layer.0.SelfAttention.v.weight", "encoder.block... Unexpected key(s) in state_dict: "epoch", "model_state_dict", "optimizer_state_dict", "best_score".
Can you help me get past this, @AMontgomerie? Thank you.
Oh, it looks like the save function I wrote also saves the optimizer state and some other variables, so the whole checkpoint is a wrapper dict rather than a plain state dict. That's why load_state_dict() complains about unexpected keys (the wrapper's own keys) and about missing keys (the model weights are nested under "model_state_dict", so none of the expected parameter names are found at the top level).
Can you try this instead?
import torch
from transformers import T5ForConditionalGeneration
model = T5ForConditionalGeneration.from_pretrained("iarfmoose/t5-base-question-generator")
state_dict = torch.load("/content/drive/MyDrive/question_BIOasq/new_type/t5-question-generator.pt")
model.load_state_dict(state_dict["model_state_dict"]) # <-- try changing this line
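If you want to confirm what's actually in the checkpoint first, you can print its top-level keys (same example path as above):

import torch

checkpoint = torch.load("/content/drive/MyDrive/question_BIOasq/new_type/t5-question-generator.pt")
print(checkpoint.keys())  # should show: epoch, model_state_dict, optimizer_state_dict, best_score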
OK, I've replaced the old save function. The new one uses Hugging Face-style saving instead. Now when you train the model, it should create a directory called ./t5-base-question-generator (or wherever your save_dir points) containing all of the necessary files.
Then you can load your saved model like:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("./t5-base-question-generator")
model = AutoModelForSeq2SeqLM.from_pretrained("./t5-base-question-generator")
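As a quick sanity check after loading, here's a minimal generation sketch. The "<answer> ... <context> ..." prompt format is taken from the model card for iarfmoose/t5-base-question-generator; double-check it against this repo's own input formatting:

input_text = "<answer> the Eiffel Tower <context> The Eiffel Tower is a famous landmark in Paris."
inputs = tokenizer(input_text, return_tensors="pt")
# generate a question conditioned on the answer and context
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))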
Hi @AMontgomerie, I have successfully trained the model on my own dataset, and one file, "t5-question-generator.pt", got saved as the model file in the question_generator folder.
At inference time, when I run qg = QuestionGeneration(), I get the error below:

OSError: Couldn't reach server at '/content/question_generator/t5-question-generator.pt' to download configuration file or configuration file is not a valid JSON file. Please check network or file content here: /content/question_generator/t5-question-generator.pt.
Can you help me get this resolved?