Do you fix the weights of the embeddings and attention blocks after loading the pretrained checkpoint for fine-tuning, or is the checkpoint just an initialization, with those weights further updated during fine-tuning?
I can't really find the answer in your code.
Hi Antoine,
Sorry for the late reply.
For fine-tuning, I do not freeze any parameters; I load the pretrained model and fine-tune it for very few epochs.
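For anyone landing here with the same question, here is a minimal PyTorch sketch of the approach Khaled describes. The model, checkpoint path, and data below are placeholders, not the repository's actual code: the point is only that the checkpoint is loaded purely as an initialization, no `requires_grad` flags are touched, and all parameters are handed to the optimizer.

```python
import torch
import torch.nn as nn

# Tiny stand-in model; the real architecture (embeddings + attention
# blocks) lives in the repository.
model = nn.Sequential(nn.Embedding(100, 32), nn.Flatten(), nn.Linear(32 * 8, 10))

# Stand-in for a pretrained checkpoint on disk ("pretrained.pth" is a
# placeholder path).
torch.save(model.state_dict(), "pretrained.pth")

# Load the checkpoint as an initialization only -- no freezing afterwards.
model.load_state_dict(torch.load("pretrained.pth", map_location="cpu"))

# Every parameter keeps requires_grad=True, so the optimizer updates the
# embeddings and attention blocks along with everything else.
assert all(p.requires_grad for p in model.parameters())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# Dummy fine-tuning batch, for illustration only.
inputs = torch.randint(0, 100, (16, 8))
targets = torch.randint(0, 10, (16,))

model.train()
for epoch in range(3):  # "very few epochs"
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()
```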