huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

how to save and load fine-tuned model? #7849

Closed · wmathor closed this issue 4 years ago

wmathor commented 4 years ago

❓ Questions & Help

Details

import torch
import torch.nn as nn
from transformers import BertModel

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class MyModel(nn.Module):
  def __init__(self, num_classes):
    super(MyModel, self).__init__()
    self.bert = BertModel.from_pretrained('hfl/chinese-roberta-wwm-ext', return_dict=True).to(device)
    self.fc = nn.Linear(768, num_classes, bias=False)

  def forward(self, x_input_ids, x_type_ids, attn_mask):
    outputs = self.bert(x_input_ids, token_type_ids=x_type_ids, attention_mask=attn_mask)
    pred = self.fc(outputs.pooler_output)
    return pred

model = MyModel(num_classes).to(device)
# save 

# load

I have defined my model around a Hugging Face pretrained model, but I don't know how to save and load it. Hopefully someone can help me out, thanks!

NielsRogge commented 4 years ago

To save your model, first create a directory in which everything will be saved. In Python, you can do this as follows:

import os
os.makedirs("path/to/awesome-name-you-picked")

Next, you can use the model.save_pretrained("path/to/awesome-name-you-picked") method. This saves the model, with its weights and configuration, to the directory you specify. You can then load it back with the corresponding model class's from_pretrained method, for example model = BertModel.from_pretrained("path/to/awesome-name-you-picked").

Source: https://huggingface.co/transformers/model_sharing.html
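
For the MyModel class from the question, a minimal sketch of this suggestion might look like the following (the directory name is a placeholder; note that save_pretrained/from_pretrained only cover the Hugging Face submodule self.bert, so the custom nn.Linear head would still need to be saved separately, e.g. with torch.save):

import os
from transformers import BertModel

save_dir = "path/to/awesome-name-you-picked"   # placeholder directory name
os.makedirs(save_dir, exist_ok=True)

# save the weights and config of the BERT submodule
model.bert.save_pretrained(save_dir)

# later, load it back
bert = BertModel.from_pretrained(save_dir)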

wmathor commented 4 years ago

Should I save the model parameters separately, i.e. save the BERT part first and then save my own nn.Linear? Is that the only way to do this, or is there an easier way? Thank you for your reply.

wmathor commented 4 years ago

I validate the model as I train it, and I save the checkpoint with the highest score on the validation set using torch.save(model.state_dict(), output_model_file).

Then I trained again, loading the previously saved model instead of training from scratch, but it didn't perform well, which makes me think it wasn't saved or loaded successfully.
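
For reference, the "keep the best validation checkpoint" pattern described above usually looks something like this sketch; optimizer, train_one_epoch, evaluate, val_loader and num_epochs are assumed names here, and output_model_file is whatever path you choose:

import torch

best_score = float("-inf")

for epoch in range(num_epochs):                 # num_epochs is assumed
    train_one_epoch(model, optimizer)           # assumed training routine and optimizer
    score = evaluate(model, val_loader)         # assumed validation routine
    if score > best_score:
        best_score = score
        torch.save(model.state_dict(), output_model_file)

# to evaluate or resume later, rebuild the same architecture first, then load the weights
model = MyModel(num_classes).to(device)
model.load_state_dict(torch.load(output_model_file, map_location=device))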

fakerbrother commented 4 years ago

Hi, I'm also confused about this. Have you solved this problem? If yes, could you please show me your code for saving and loading the model in detail? Thanks! :)

wmathor commented 4 years ago

Are you Chinese? If so, I can reply to you in Chinese.

fakerbrother commented 4 years ago

Hahaha. I'd like to ask you: how should I save the model?

wmathor commented 4 years ago

I asked a friend in Taiwan, and he told me that Hugging Face's pretrained models are also written in PyTorch, so you can just save and load the model in the normal PyTorch way:

import os
import torch
from torch.optim import AdamW   # torch's AdamW; transformers.AdamW was also common at the time

model = MyModel(num_classes).to(device)
optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=1e-2)

os.makedirs('./models', exist_ok=True)   # make sure the output directory exists
output_model = './models/model_xlnet_mid.pth'

# save
def save(model, optimizer):
    torch.save({
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict()
    }, output_model)

save(model, optimizer)

# load
checkpoint = torch.load(output_model, map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
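
Saving the optimizer state alongside the model state, as above, is what makes it possible to resume training exactly where it left off rather than only restoring the weights. For the earlier question about saving the BERT part and the nn.Linear head separately, one possible (but not required) alternative is the hedged sketch below, with placeholder paths:

# save the Hugging Face submodule with the Transformers API and the custom head with torch
save_dir = './models/my-finetuned-bert'                         # placeholder directory
model.bert.save_pretrained(save_dir)                            # config + weights of self.bert
torch.save(model.fc.state_dict(), save_dir + '/fc_head.pth')    # custom classification head

# load
model = MyModel(num_classes).to(device)
model.bert = BertModel.from_pretrained(save_dir).to(device)
model.fc.load_state_dict(torch.load(save_dir + '/fc_head.pth', map_location=device))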

fakerbrother commented 4 years ago

Oh, I see. Thank you!

cedar33 commented 3 years ago

Bookmarking this for later.

GregRoq commented 3 years ago

Hi all,

I have saved a fine-tuned Keras model on my machine, but I would like to use it in an app I want to deploy.

I uploaded the model to GitHub and wondered whether I could load it directly from that GitHub directory.

That does not seem to be possible. Does anyone know where I could host this model so that anyone can use it? Hugging Face provides a hub which would be very useful for that, but this is not a Hugging Face model.

Let me know if you can help please :)

NielsRogge commented 3 years ago

I know the huggingface_hub library provides a utility class called ModelHubMixin to save and load any PyTorch model from the Hub (see original tweet). I wonder whether something similar exists for Keras models.

cc @julien-c
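
For the PyTorch side, a hedged sketch of the mixin mentioned above (its PyTorchModelHubMixin variant) is shown below; the class name and repository id are placeholders, and the exact configuration handling depends on the huggingface_hub version, so check its documentation:

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class MyHubModel(nn.Module, PyTorchModelHubMixin):   # placeholder model
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.fc = nn.Linear(768, num_classes)

model = MyHubModel(num_classes=2)
model.save_pretrained("my-awesome-model")            # save to a local directory
# model.push_to_hub("username/my-awesome-model")     # or share it on the Hub (placeholder repo id)
reloaded = MyHubModel.from_pretrained("my-awesome-model")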

GregRoq commented 3 years ago

That would be ideal. But I wonder: if there are no public hubs I can host this Keras model on, does this mean that no trained Keras models can be publicly deployed in an app?

julien-c commented 3 years ago

^Tagging @osanseviero and @nateraw on this!

osanseviero commented 3 years ago

Having an easy way to save and load Keras models is in our short-term roadmap and we expect to have updates soon!

if there are no public hubs I can host this Keras model on, does this mean that no trained Keras models can be publicly deployed in an app?

I'm not sure I fully understand your question. Using the Hugging Face Inference API, you can run inference with Keras models and easily share them with the rest of the community. Note that you can also share the model using the Hub and use other hosting alternatives, or even run your model on-device.

GregRoq commented 3 years ago

Thanks @osanseviero for your reply! What I'm wondering is whether I can have my Keras model hosted on the Hugging Face Hub (or another hub), like I have for my fine-tuned BertForSequenceClassification model (see the screenshot below)?

This allows the model to be deployed publicly, since anyone can load it from any machine. I would like to do the same with my Keras model. Does that make sense? If yes, do you know how? That would be awesome, since my model performs really well! It's for a summariser :)

[Screenshot: the fine-tuned BertForSequenceClassification model hosted on the Hugging Face Hub]
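
For context, a fine-tuned PyTorch Transformers model such as the BertForSequenceClassification one above is typically shared on the Hub along these lines (a hedged sketch; the checkpoint path and repository id are placeholders, and you need to be logged in with the Hugging Face CLI first):

from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("./my-local-checkpoint")   # placeholder path
tokenizer = BertTokenizer.from_pretrained("./my-local-checkpoint")

model.push_to_hub("username/my-finetuned-model")       # placeholder repo id
tokenizer.push_to_hub("username/my-finetuned-model")

# anyone can then load it from any machine
model = BertForSequenceClassification.from_pretrained("username/my-finetuned-model")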

Zhongli2002 commented 1 year ago

Hello, is it possible to save the entire model rather than just the parameters (state dict)?
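
For what it's worth, PyTorch itself can save the whole module rather than just its state dict, although the state_dict approach shown earlier in this thread is generally preferred, because a whole-model pickle is tied to the exact class definition and file layout. A hedged sketch, with a placeholder file name:

import torch

torch.save(model, 'whole_model.pth')    # pickles the entire nn.Module, not just the weights
# the class definition (MyModel) must still be importable when loading;
# on recent PyTorch versions you may also need torch.load(..., weights_only=False)
model = torch.load('whole_model.pth')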

Nikita-Shrma commented 1 year ago

model = MyModel(num_classes).to(device)
optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=1e-2)
output_model = './models/model_xlnet_mid.pth'

# save
def save(model, optimizer):
    # save
    torch.save({
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict()
    }, output_model)

save(model, optimizer)

# load
checkpoint = torch.load(output_model, map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

Hey, what is output_model here? What should its value be?