Closed HanGuo97 closed 3 years ago
Hey @HanGuo97,
We try to keep the GitHub issues for bug reports. Do you mind asking your question on the forum instead? Also there might already be similar questions on the forum, such as https://discuss.huggingface.co/t/create-a-custom-model-that-works-with-any-pretrained-transformer-body/4186. Thanks!
Got it, thanks for letting me know!
Environment info
transformers version: NA
Who can help
@patrickvonplaten
Information
Model I am using (Bert, XLNet ...):
The problem arises when using:
The tasks I am working on is:
To reproduce
Thanks for the amazing library!
I'm curious if there are instructions on creating a PreTrainedModel subclass or creating an nn.Module that behaves like a PreTrainedModel? Suppose I want to wrap the existing model with some simple additional capabilities inside an nn.Module, what are some of the methods that I need to implement/override -- so that they can work well with existing examples?
I'm aware of some tutorials on creating a new model, but that seems pretty complicated and involved -- whereas I'm interested in just adding a couple of simple features.
For example, in the Seq2Seq example, I have noticed that the function signature of model.forward determines what data will (not) be passed to the model (as in trainer._remove_unused_columns), and the existence of model.prepare_decoder_input_ids_from_labels also influences the input data (as in DataCollatorForSeq2Seq.__call__).
It'd be great if someone could point me to some guidance on tweaking the model to be compatible with the rest of the codebase. Thanks in advance for your time!
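To make the two observations above concrete, here is a minimal, dependency-free sketch of the mechanism as described in this issue. It is not the library's code: InnerSeq2Seq, OpaqueWrapper, TransparentWrapper and kept_columns are hypothetical stand-ins, plain Python classes are used instead of nn.Module, and the "shift right" logic is a toy. It only illustrates why a wrapper that hides its arguments behind **kwargs, or that does not re-expose prepare_decoder_input_ids_from_labels, could change what data reaches the model.

```python
import inspect


class InnerSeq2Seq:
    """Hypothetical stand-in for a pretrained seq2seq model (not a transformers class)."""

    def forward(self, input_ids=None, attention_mask=None,
                decoder_input_ids=None, labels=None):
        return {"input_ids": input_ids, "labels": labels}

    def prepare_decoder_input_ids_from_labels(self, labels):
        # Toy shift-right: prepend a start token (0) and drop the last label.
        return [0] + list(labels[:-1])


class OpaqueWrapper:
    """Wrapper whose forward hides the argument names behind **kwargs."""

    def __init__(self, model):
        self.model = model

    def forward(self, **kwargs):
        return self.model.forward(**kwargs)


class TransparentWrapper:
    """Wrapper that mirrors the inner signature and re-exposes the helper method."""

    def __init__(self, model):
        self.model = model

    def forward(self, input_ids=None, attention_mask=None,
                decoder_input_ids=None, labels=None):
        # Any extra behaviour would go here, before delegating to the inner model.
        return self.model.forward(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_input_ids=decoder_input_ids,
            labels=labels,
        )

    def prepare_decoder_input_ids_from_labels(self, labels):
        return self.model.prepare_decoder_input_ids_from_labels(labels)


def kept_columns(model, dataset_columns):
    """Imitate signature-based filtering: keep only columns named in forward's signature."""
    accepted = set(inspect.signature(model.forward).parameters)
    return [name for name in dataset_columns if name in accepted]


columns = ["input_ids", "attention_mask", "labels", "source_text"]
opaque = OpaqueWrapper(InnerSeq2Seq())
transparent = TransparentWrapper(InnerSeq2Seq())

# The opaque wrapper only advertises "kwargs", so every dataset column is filtered out.
print(kept_columns(opaque, columns))
# The transparent wrapper advertises the real argument names.
print(kept_columns(transparent, columns))
# A collator that checks for this attribute would find it only on the second wrapper.
print(hasattr(opaque, "prepare_decoder_input_ids_from_labels"))
print(hasattr(transparent, "prepare_decoder_input_ids_from_labels"))
```

Under these assumptions, the safest wrapper seems to be one that keeps the inner model's forward argument names explicit and forwards any helper methods that other components look up by name.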