Hi! Thanks for your awesome work. I am trying to use your pretrained weights to train on another dataset.
However, my inputs consist of two different parts, and I need to apply an attention operation to them before feeding them into the pretrained UniVL for fine-tuning.
Could you please give me some suggestions on how to fine-tune the model with additional layers in front of UniVL? The pipeline would be: inputs -> additional attention module (randomly initialized) -> UniVL. I am confused about the training strategy since I have not done pre-training work before.
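Roughly, this is what I have in mind (just a sketch; the module names and the UniVL forward signature here are placeholders, not the real API):

```python
import torch.nn as nn

class FusionThenUniVL(nn.Module):
    """Hypothetical wrapper: fuse my two input streams with a new attention
    module, then feed the fused features into the pretrained UniVL."""

    def __init__(self, univl_model, feat_dim=768, n_heads=8):
        super().__init__()
        # randomly initialized attention module placed in front of UniVL
        self.cross_attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        self.univl = univl_model  # loaded from the pretrained checkpoint

    def forward(self, part_a, part_b, *univl_args, **univl_kwargs):
        # attend part_a over part_b (my two data parts), then pass the result on
        fused, _ = self.cross_attn(part_a, part_b, part_b)
        # placeholder call; the real UniVL forward takes its own arguments
        return self.univl(fused, *univl_args, **univl_kwargs)
```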
It would be my great pleasure if you could reply :)
Best wishes.
@CrystalSixone, it is an interesting question, but I do not have a good approach that avoids invalidating the pretrained weights. A direct method is to freeze the pretrained weights at the beginning. I found two related discussions that may be useful for you: Link1 and Link2.
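As a rough two-stage sketch of that idea (assuming a wrapper that exposes the new attention module and the pretrained UniVL as separate submodules; the classes below are only stand-ins so the snippet runs):

```python
import torch
import torch.nn as nn

# Stand-ins so the sketch is self-contained; replace with your attention
# module and the pretrained UniVL loaded from the checkpoint.
class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(768, 8, batch_first=True)  # new, random init
        self.univl = nn.Linear(768, 768)  # placeholder for the pretrained UniVL

model = Wrapper()

def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

# Stage 1: freeze the pretrained part and train only the new attention layers,
# so the random layers stop producing noise before they touch UniVL.
set_requires_grad(model.univl, False)
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
# ... train the new module for a few epochs ...

# Stage 2: unfreeze everything and fine-tune end to end, with a smaller
# learning rate for the pretrained weights.
set_requires_grad(model.univl, True)
optimizer = torch.optim.AdamW([
    {"params": model.cross_attn.parameters(), "lr": 1e-4},
    {"params": model.univl.parameters(), "lr": 1e-5},
])
# ... continue fine-tuning the whole model ...
```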