Closed gabinguo closed 2 years ago
Very good question. Our experience shows that, especially for small datasets, you should load the full model with `AdaptiveModel.load`.
If your QA dataset is decently large (5k+ QA pairs), you should also be fine with just initializing the model. We haven't run tests to see whether initializing only the LM and not the prediction head helps in adapting the model to out-of-domain data. Maybe you could try it and report back here?
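The rule of thumb above can be sketched as a tiny helper. The 5k-pair threshold comes from this thread; the function name and structure are purely illustrative, not part of any library API:

```python
def should_reinit_prediction_head(num_qa_pairs: int, threshold: int = 5000) -> bool:
    """Rule of thumb from this thread: with a decently large dataset
    (~5k+ QA pairs), re-initializing the prediction head is fine;
    with a small dataset, load the full model (LM plus trained head)
    via AdaptiveModel.load instead."""
    return num_qa_pairs >= threshold

# ~2000 QA pairs: small dataset, so keep the SQuAD-trained head
print(should_reinit_prediction_head(2000))    # False
# 10k QA pairs: a freshly initialized head should also work
print(should_reinit_prediction_head(10_000))  # True
```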
Thanks for the reply : )
Glad to know the effect on small datasets, because I am experimenting with some small QA sets of ~2000 QA pairs.
> We haven't run tests to see whether initializing only the LM and not the prediction head helps in adapting the model to out-of-domain data. Maybe you could try it and report back here?
Sure, I will try to run an experiment on this.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 21 days if no further activity occurs.
Question
Hello. Regarding fine-tuning on the target QA task: do we need to re-instantiate the prediction head or not? For example, I have a Roberta-base model fine-tuned on SQuAD (Roberta-base-SQuAD), and I need to fine-tune it on my target QA task. For the prediction_head, should I load the head trained on SQuAD or just re-instantiate a new one?
Additional context