Closed: iamlockelightning closed this issue 3 years ago.
Please help. @LysandreJik @sgugger
A model that is not inside the transformers library won't work with the AutoModel API. To properly use the save/from_pretrained methods, why not subclass `PreTrainedModel` instead of `nn.Module`?
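A minimal sketch of that pattern (the class names and config fields here are illustrative, not from this thread):

```python
import torch
from transformers import PretrainedConfig, PreTrainedModel

# Hypothetical config: a PreTrainedModel subclass is initialized from a PretrainedConfig.
class MyConfig(PretrainedConfig):
    model_type = "my-custom-model"

    def __init__(self, hidden_size=768, num_labels=11, **kwargs):
        super().__init__(**kwargs)
        self.hidden_size = hidden_size
        self.num_labels = num_labels

class MyCustomModel(PreTrainedModel):
    config_class = MyConfig

    def __init__(self, config):
        super().__init__(config)
        self.classifier = torch.nn.Linear(config.hidden_size, config.num_labels)

    def forward(self, hidden_states):
        return self.classifier(hidden_states)

# save_pretrained/from_pretrained then work out of the box:
model = MyCustomModel(MyConfig())
model.save_pretrained("./my-custom-model")
reloaded = MyCustomModel.from_pretrained("./my-custom-model")
```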
Thanks for your reply! I will try.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
> A model that is not inside the transformers library won't work with the AutoModel API. To properly use the save/from_pretrained methods, why not subclass `PreTrainedModel` instead of `nn.Module`?
@sgugger Could you give an example of how to subclass `PreTrainedModel`? I would also like to integrate my model at https://huggingface.co/maxpe/twitter-roberta-base_semeval18_emodetection better with the transformers library:
```python
import torch
from transformers import AutoModel

def loss_fn(outputs, targets):
    # BCEWithLogitsLoss expects (logits, targets) in that order
    return torch.nn.BCEWithLogitsLoss()(outputs, targets)

class RobertaClass(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = AutoModel.from_pretrained("cardiffnlp/twitter-roberta-base", return_dict=False)
        self.l2 = torch.nn.Dropout(0.3)
        self.l3 = torch.nn.Linear(768, 11)

    def forward(self, input_ids, attention_mask, labels):
        _, output_1 = self.l1(input_ids=input_ids, attention_mask=attention_mask)
        output_2 = self.l2(output_1)
        output = self.l3(output_2)
        return (loss_fn(output, labels.float()), output)
```
```python
model = RobertaClass()
model.train()
...

# Reloading later requires re-instantiating the class and loading the state dict:
model = RobertaClass()
model.load_state_dict(torch.load(path))
model.eval()
```
My attempt with `PyTorchModelHubMixin` didn't work well.
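For reference, the mixin pattern from `huggingface_hub` looks roughly like this (a minimal sketch; the class body is illustrative):

```python
import torch
from huggingface_hub import PyTorchModelHubMixin

# Hypothetical minimal model: the mixin adds save_pretrained/from_pretrained
# (and push_to_hub) to a plain nn.Module.
class TinyClassifier(torch.nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(768, 11)

    def forward(self, x):
        return self.linear(x)

model = TinyClassifier()
model.save_pretrained("./tiny-classifier")
reloaded = TinyClassifier.from_pretrained("./tiny-classifier")
```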
@iamlockelightning Did you save the model properly?
Details
I am using the Trainer to train a custom model, like this:
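A minimal sketch of the model's shape (`bert_layer_1` and `bert_layer_2` are loaded with `from_pretrained` as described below; the checkpoint name and the head are illustrative stand-ins):

```python
import torch
from torch import nn
from transformers import BertModel

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # two pretrained encoders, each loaded directly with from_pretrained
        # ("bert-base-uncased" is a stand-in for the actual checkpoints)
        self.bert_layer_1 = BertModel.from_pretrained("bert-base-uncased")
        self.bert_layer_2 = BertModel.from_pretrained("bert-base-uncased")
        self.classifier = nn.Linear(768 * 2, 2)  # illustrative head

    def forward(self, input_ids, attention_mask):
        out_1 = self.bert_layer_1(input_ids=input_ids, attention_mask=attention_mask)
        out_2 = self.bert_layer_2(input_ids=input_ids, attention_mask=attention_mask)
        pooled = torch.cat([out_1.pooler_output, out_2.pooler_output], dim=-1)
        return self.classifier(pooled)
```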
When running `trainer.save_model()`, it will only save the model's state dict, because the custom model is not a `PreTrainedModel` (as the terminal output showed). And when reloading the saved model in production, I need to initialize a new `MyModel` and load its state dict, which is not so convenient. I would like to load this model with `transformers.AutoModel.from_pretrained('MODEL_PATH')` like any other `PreTrainedModel`.

I tried to change `class MyModel(nn.Module)` to `class MyModel(PreTrainedModel)`, but a `PreTrainedModel` needs a `PretrainedConfig` when initialized. I don't have one in the current implementation, and I don't know how to manage the config when composing multiple `PreTrainedModel`s. I want to keep `self.bert_layer_1` and `self.bert_layer_2` as simple as `from_pretrained(...)`, not `= BertModel(config)`.

Is there a way to do that?
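For illustration, the kind of pattern I'm after (a sketch, with names invented here: a custom `PretrainedConfig` that only stores checkpoint names, so the sub-encoders can still be loaded via `from_pretrained`):

```python
from transformers import BertModel, PretrainedConfig, PreTrainedModel

class MyConfig(PretrainedConfig):
    model_type = "my-dual-bert"

    def __init__(self, bert_name_1="bert-base-uncased",
                 bert_name_2="bert-base-uncased", **kwargs):
        super().__init__(**kwargs)
        self.bert_name_1 = bert_name_1
        self.bert_name_2 = bert_name_2

class MyModel(PreTrainedModel):
    config_class = MyConfig

    def __init__(self, config):
        super().__init__(config)
        # sub-encoders are still created with from_pretrained, driven by the config
        self.bert_layer_1 = BertModel.from_pretrained(config.bert_name_1)
        self.bert_layer_2 = BertModel.from_pretrained(config.bert_name_2)
```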
Environment info

- `transformers` version: 4.9.2