ai-forever / ru-gpts

Russian GPT3 models.
Apache License 2.0

How to properly load a model and get a prediction for a specific text #71

Closed MuhammedTech closed 3 years ago

MuhammedTech commented 3 years ago

I completed the training and saved the model like this:

model.save_pretrained("/content/drive/MyDrive/gpt_sentiment/model_rugpt3-trainer", push_to_hub=False)
tokenizer.save_pretrained("/content/drive/MyDrive/gpt_sentiment/model_rugpt3-trainer", push_to_hub=False)

Then I load the model:

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device('cuda')
model_name_or_path = "/content/drive/MyDrive/gpt_sentiment/model_rugpt3-trainer"
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path,local_files_only=True)
model = GPT2LMHeadModel.from_pretrained(model_name_or_path, local_files_only=True)
model = model.to(device)

text = "Александр Сергеевич Пушкин родился в "
#input_ids = tokenizer.encode(text, return_tensors="pt").cuda()
input_ids = tokenizer(text, return_tensors="pt")['input_ids'].to(device)
out = model.generate(**input_ids,
                     max_length=1024, 
                     do_sample=True,
                     no_repeat_ngram_size=20,
                     pad_token_id = 50258)
generated = decode(out[0])
print(generated)

Here I am getting:

TypeError                                 Traceback (most recent call last)
<ipython-input-58-c57a5d1ecfe7> in <module>()
     17                      do_sample=True,
     18                      no_repeat_ngram_size=20,
---> 19                      pad_token_id = 50258)
     20 generated = decode(out[0])
     21 print(generated)

TypeError: generate() argument after ** must be a mapping, not Tensor

How can I properly load the model and get a prediction for my text?
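
The error itself points at the fix: tokenizer(text, return_tensors="pt") returns a BatchEncoding, which is a mapping, while indexing it with ['input_ids'] yields a bare torch.Tensor, which ** cannot unpack. A minimal sketch of the two valid calling conventions, using the model, tokenizer, and text defined above:

# Option 1: keep the BatchEncoding mapping and unpack it with **
inputs = tokenizer(text, return_tensors="pt").to(device)
out = model.generate(**inputs, max_length=1024, do_sample=True)

# Option 2: pull out the tensor and pass it positionally, without **
input_ids = tokenizer(text, return_tensors="pt")['input_ids'].to(device)
out = model.generate(input_ids, max_length=1024, do_sample=True)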

Artyrm commented 3 years ago

That looks strange to me:

input_ids = tokenizer(text, return_tensors="pt")['input_ids'].to(device)
out = model.generate(**input_ids,

Try maybe:

input_ids = tokenizer(text, return_tensors="pt").to(device)
out = model.generate(**input_ids, max_length=1024, do_sample=True,
                     no_repeat_ngram_size=20, pad_token_id=50258)
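
One more fix the snippet in the question will still need: decode is not defined on its own, decoding goes through the tokenizer. A sketch, assuming the out tensor produced by the call above:

generated = tokenizer.decode(out[0], skip_special_tokens=True)
print(generated)
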
king-menin commented 3 years ago

solved by https://github.com/sberbank-ai/ru-gpts/issues/71#issuecomment-894072321
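
For readers landing here later, a minimal end-to-end sketch combining the fixes from this thread; the checkpoint path and pad_token_id=50258 are taken from the question, and the CPU fallback in the device check is an addition:

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_name_or_path = "/content/drive/MyDrive/gpt_sentiment/model_rugpt3-trainer"

tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path, local_files_only=True)
model = GPT2LMHeadModel.from_pretrained(model_name_or_path, local_files_only=True).to(device)
model.eval()

text = "Александр Сергеевич Пушкин родился в "
# tokenizer(...) returns a BatchEncoding (a mapping), so ** unpacking works
inputs = tokenizer(text, return_tensors="pt").to(device)

with torch.no_grad():
    out = model.generate(**inputs,
                         max_length=1024,
                         do_sample=True,
                         no_repeat_ngram_size=20,
                         pad_token_id=50258)

# decode through the tokenizer; a bare decode() is undefined
print(tokenizer.decode(out[0], skip_special_tokens=True))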