Open misaki-taro opened 1 year ago
As described, since the model architecture is fixed, you can directly use the given parameters (except for the final layer) as the pre-trained model.
Best, Jiayu
If you do not know how to extract the parameters, you can directly train a new model without pre-training. The results are almost the same if you increase the number of epochs.
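For readers unsure how to reuse the parameters, here is a minimal PyTorch sketch of the general pattern described above: copy every parameter except the final layer from a trained binary model into a multi-class model, then freeze the encoder and fine-tune only the new classifier. The module names (`encoder`, `classifier`) and sizes are assumptions for illustration, not PhaVIP's actual code.

```python
import torch.nn as nn

# Hypothetical minimal model: a Transformer encoder plus a classifier head.
# Module names and hyperparameters are illustrative assumptions.
class PhageClassifier(nn.Module):
    def __init__(self, d_model=64, num_classes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, x):
        h = self.encoder(x)                     # (batch, seq, d_model)
        return self.classifier(h.mean(dim=1))   # mean-pool, then classify

# 1) Suppose this is the trained binary classification model.
binary_model = PhageClassifier(num_classes=2)

# 2) Build the multi-class model and copy every parameter
#    except the final classifier layer.
multi_model = PhageClassifier(num_classes=10)
pretrained = {k: v for k, v in binary_model.state_dict().items()
              if not k.startswith("classifier.")}
multi_model.load_state_dict(pretrained, strict=False)

# 3) Freeze the encoder so only the new classifier layer is fine-tuned.
for p in multi_model.encoder.parameters():
    p.requires_grad = False

trainable = [n for n, p in multi_model.named_parameters() if p.requires_grad]
print(trainable)  # only the classifier's weight and bias remain trainable
```

During fine-tuning you would then pass only the trainable parameters to the optimizer, e.g. `torch.optim.Adam(multi_model.classifier.parameters())`.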
Thank you for your advice. I will give it a try later.
Hi Jiayu,
I would like to express my appreciation for providing Phavip. However, I have encountered some confusion while working with it. In your documentation, you mention the following: "Thus, we first apply an end-to-end method to train the binary classification model. Then, we fix the parameters in the Transformer encoder and fine-tune a new classifier layer for the multi-class classification model. Binary cross-entropy (BCE) loss and L2 loss are employed for the binary classification and multi-class classification, respectively."
However, when I attempted to retrain the model, I couldn't find the fine-tune mode as described. Could you please assist me in reproducing the results using your code?
Best regards, Misaki