SpeeeedLee closed 3 weeks ago
In your code, when merging BERT, RoBERTa, etc., it seems that the classification heads are not merged, but are kept separately and added back to the merged checkpoint for each task. Am I correct? Thanks!
Yes. Your understanding is right.
We do not merge the classification heads for BERT and RoBERTa, since different tasks may have different numbers of classes and the head shapes therefore may not match.
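To illustrate the idea, here is a minimal sketch, not the repository's actual code. It assumes each checkpoint is a flat dict of parameter lists, that head parameters share a hypothetical `classifier.` key prefix, and it uses plain averaging as a stand-in for whichever merging method is applied to the shared weights.

```python
from typing import Dict, List

StateDict = Dict[str, List[float]]

# Assumption: head parameters are identified by this key prefix (hypothetical).
HEAD_PREFIX = "classifier."


def merge_without_heads(
    task_states: Dict[str, StateDict], head_prefix: str = HEAD_PREFIX
) -> Dict[str, StateDict]:
    """Average the shared (non-head) parameters, then re-attach each task's own head."""
    names = list(task_states)
    first = task_states[names[0]]
    shared_keys = [k for k in first if not k.startswith(head_prefix)]

    # Merge only the shared body. Plain averaging is a placeholder merging rule.
    merged_body: StateDict = {}
    for key in shared_keys:
        columns = zip(*(task_states[name][key] for name in names))
        merged_body[key] = [sum(col) / len(names) for col in columns]

    # Build one checkpoint per task: merged body plus that task's original head.
    per_task: Dict[str, StateDict] = {}
    for name in names:
        state = {k: list(v) for k, v in merged_body.items()}
        for key, value in task_states[name].items():
            if key.startswith(head_prefix):
                state[key] = list(value)
        per_task[name] = state
    return per_task


# Toy checkpoints: the heads have different sizes, so they cannot be averaged.
task_states = {
    "sst2": {"encoder.weight": [1.0, 3.0], "classifier.weight": [0.1, 0.2]},
    "mnli": {"encoder.weight": [3.0, 5.0], "classifier.weight": [0.3, 0.4, 0.5]},
}
merged = merge_without_heads(task_states)
```

In this toy example both tasks end up with the same merged `encoder.weight`, while each keeps its own `classifier.weight` with its original number of entries.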