My understanding is that no pre-training is needed when we only fine-tune the last layer of the language model. At test time, we fine-tune the model starting from the original weights on each sample. However, when fine-tuning the self_attn, fc1, and fc2 modules of layer 31 of BLIP-2 this way, Reliability is only about 10% while Locality is about 100%. I would like to know whether I have misunderstood something. Thank you!
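For reference, here is a minimal sketch of how I select the trainable parameters. The module paths (`layers.31.self_attn`, `layers.31.fc1`, `layers.31.fc2`) are my assumption of the naming pattern; a toy decoder stands in for the real BLIP-2 language model, so please adjust the names to the actual model:

```python
import torch.nn as nn

# Toy stand-in for the language-model decoder: a stack of blocks whose
# parameter names mimic a "layers.<i>.self_attn / fc1 / fc2" pattern.
# (Hypothetical names -- the real BLIP-2 paths may differ.)
class Block(nn.Module):
    def __init__(self, d=8):
        super().__init__()
        self.self_attn = nn.Linear(d, d)
        self.fc1 = nn.Linear(d, d)
        self.fc2 = nn.Linear(d, d)

class ToyDecoder(nn.Module):
    def __init__(self, n_layers=32, d=8):
        super().__init__()
        self.layers = nn.ModuleList(Block(d) for _ in range(n_layers))

model = ToyDecoder()

# Freeze everything, then unfreeze only layer 31's self_attn, fc1, and fc2.
for name, p in model.named_parameters():
    p.requires_grad = any(
        name.startswith(f"layers.31.{m}") for m in ("self_attn", "fc1", "fc2")
    )

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(len(trainable))  # 6 tensors: weight + bias for each of the three modules
```

Between samples I restore the original weights (e.g. by re-loading a saved `state_dict`) before fine-tuning on the next edit, which is what I mean by "from original weights on each sample".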