thunlp / Prompt-Transferability

On Transferability of Prompt Tuning for Natural Language Processing
https://aclanthology.org/2022.naacl-main.290/
MIT License

Is T5 also learned during Prompt Tuning? #14

Closed jeewoo1025 closed 1 year ago

jeewoo1025 commented 1 year ago

Hello, thank you for the good work.

Is T5 also learned during Prompt Tuning? When I checked the learnable layers, I found that not only the prompt but also the T5 model appears to be trainable. I am confused about whether the PLM is frozen during training, because in the original prompt tuning paper (https://arxiv.org/abs/2104.08691) the PLM is frozen and only the soft prompt is updated.
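
For context, my understanding of the original setup is roughly the following sketch (generic Hugging Face T5 and made-up hyperparameters, not this repo's code):

    import torch
    from transformers import T5ForConditionalGeneration

    # What I expected from https://arxiv.org/abs/2104.08691:
    # every PLM parameter is frozen and only the soft prompt is optimized.
    model = T5ForConditionalGeneration.from_pretrained("t5-base")
    for param in model.parameters():
        param.requires_grad = False                     # PLM frozen

    prompt_len = 100                                    # prompt length used in the paper
    soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, model.config.d_model))

    optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)  # only the prompt is trainable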

▶ Command

gpus=0
BACKBONE=T5
DATASET=SST2

CUDA_VISIBLE_DEVICES=$gpus python3 train.py --config config/${DATASET}Prompt${BACKBONE}.config --gpu $gpus

▶ Added code — I added the following lines at line 159 of Prompt-Transferability-1.0/tools/train_tool.py (link)

    # Print requires_grad for every encoder/decoder parameter
    for name, param in model.named_parameters():
        print(name, param.requires_grad)

▶ Results

[Screenshot of the printed parameter names and their requires_grad values]

Sincerely, Jeewoo Sul

yushengsu-thu commented 1 year ago

@jeewoo1025

Hello Jeewoo, at line 197 you can find model.zero_grad(). It clears the backbone's gradients before the optimizer step, so the T5 parameters are not actually optimized even though requires_grad is True.

[Screenshot of train_tool.py around line 197 showing the model.zero_grad() call]
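
As a toy illustration of the mechanism (not the repo's code): even when requires_grad is True, clearing the backbone's gradients before optimizer.step() leaves its weights unchanged, while the prompt still gets updated.

    import torch

    # Toy example: requires_grad=True does not by itself mean a parameter is updated.
    backbone = torch.nn.Linear(4, 4)                     # stands in for the T5 backbone
    soft_prompt = torch.nn.Parameter(torch.randn(4))     # stands in for the soft prompt
    optimizer = torch.optim.SGD(list(backbone.parameters()) + [soft_prompt], lr=0.1)

    x = torch.randn(2, 4)
    loss = backbone(x + soft_prompt).sum()
    loss.backward()

    weights_before = backbone.weight.detach().clone()
    prompt_before = soft_prompt.detach().clone()

    backbone.zero_grad()     # analogous in spirit to model.zero_grad() at line 197:
                             # the backbone's gradients are wiped, the prompt's is kept
    optimizer.step()

    print(torch.equal(backbone.weight, weights_before))  # True: backbone unchanged
    print(torch.equal(soft_prompt, prompt_before))       # False: prompt was updated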

You can also try directly reusing the provided T5 prompts on the corresponding tasks with the T5 backbone. I believe these prompts will work (if I had trained T5's parameters, these prompts would not work on the unmodified T5).
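
If you want to try this, reusing a prompt roughly amounts to prepending the saved prompt embedding to the input embeddings. A rough sketch with a generic Hugging Face T5 and a hypothetical checkpoint path (the repo's actual loading code may differ):

    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # Hypothetical path to a saved task prompt, e.g. a tensor of shape (prompt_len, d_model).
    prompt_emb = torch.load("task_prompt/SST2PromptT5.pt")

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    inputs = tokenizer("sst2 sentence: a genuinely moving film", return_tensors="pt")
    token_emb = model.get_input_embeddings()(inputs.input_ids)        # (1, seq_len, d_model)
    inputs_embeds = torch.cat([prompt_emb.unsqueeze(0), token_emb], dim=1)

    # The attention mask must also cover the prepended prompt positions.
    prompt_mask = torch.ones(1, prompt_emb.size(0), dtype=inputs.attention_mask.dtype)
    attention_mask = torch.cat([prompt_mask, inputs.attention_mask], dim=1)

    outputs = model.generate(inputs_embeds=inputs_embeds, attention_mask=attention_mask)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))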

Besides, I recommend using Prompt-Transferability-2.0. We refactored the original experimental code and made it more readable.

Thanks

jeewoo1025 commented 1 year ago

Thank you for your kind answer.

yushengsu-thu commented 1 year ago

@jeewoo1025

No problem. Feel free to email me if you have any further questions :D