htyao89 / Textual-based_Class-aware_prompt_tuning

MIT License

About the selection of Prompt length M #1

Open Adong17 opened 2 months ago

Adong17 commented 2 months ago

Thanks for your great work. I have a question about the choice of prompt length M. In the paper's "Effect of Prompt length M" study, M=8 is the best hyperparameter setting, but I found that M is set to 4 in the code. I would like to know what effect M (the output of the TKD part) has.

[Screenshot: Snipaste_2024-05-08_11-26-19]

htyao89 commented 2 months ago

Firstly, most existing prompt tuning methods set the prompt length to 4, so setting M=4 gives a fair comparison with them.

Secondly, existing methods have shown that a longer prompt can yield higher performance. In our TCP, M=8 obtains the highest performance of 79.63%, but the gap to the 79.51% obtained with M=4 or M=16 is small (M=1/2/4/8/16, H=79.26/79.3/79.51/79.63/79.51). The proposed TCP is therefore relatively insensitive to the prompt length, which indicates that using the class-aware prompt to capture prior textual-level class knowledge matters more than the choice of prompt length.

Finally, all results reported in the paper were obtained with the prompt length set to 4.
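
As a quick check of the sensitivity argument, the H values quoted above can be compared directly. This is only a small arithmetic sketch; the dictionary and variable names are illustrative and do not come from the TCP code.

```python
# H (%) for each prompt length M, copied from the reply above.
h_by_m = {1: 79.26, 2: 79.3, 4: 79.51, 8: 79.63, 16: 79.51}

# Prompt length with the highest H.
best_m = max(h_by_m, key=h_by_m.get)

# Spread between the best and worst setting, in percentage points.
spread = round(max(h_by_m.values()) - min(h_by_m.values()), 2)

# Gap between the best setting and M=4, the value used in the released code.
gap_to_m4 = round(h_by_m[best_m] - h_by_m[4], 2)

print(f"best M = {best_m}, spread = {spread}, gap to M=4 = {gap_to_m4}")
```

Across a 16x change in prompt length, H moves by well under half a percentage point, which is the sense in which the method is described as insensitive to M.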

Adong17 commented 2 months ago

So, do I understand correctly that you ran the prompt length ablation experiment with the meta-net output fixed at 4 the whole time?

htyao89 commented 2 months ago

No. In the prompt length ablation experiment, the meta-net output is not fixed at 4; it is always set to match the prompt length M.

htyao89 commented 2 months ago

The standard formulation of Meta-Net is:

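    # Meta-Net: two-layer MLP whose output width n_ctx * ctx_dim scales with the prompt length n_ctx.
    self.meta_net = nn.Sequential(
        OrderedDict([
            ("linear1", nn.Linear(vis_dim, vis_dim // 4, bias=True)),
            ("relu", QuickGELU()),  # layer is named "relu" but applies QuickGELU
            ("linear2", nn.Linear(vis_dim // 4, n_ctx * ctx_dim, bias=True)),
        ])
    )
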
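
To see how the Meta-Net output follows the prompt length, the layer sizes can be worked out without PyTorch. The sketch below is only an illustration: `meta_net_shapes` is a hypothetical helper, and `vis_dim=512` and `ctx_dim=512` are assumed values, not settings confirmed in this thread.

```python
def meta_net_shapes(vis_dim, ctx_dim, n_ctx):
    """Sizes of the two linear layers in the Meta-Net shown above."""
    hidden = vis_dim // 4                    # output width of linear1
    out_features = n_ctx * ctx_dim           # output width of linear2
    linear1_params = vis_dim * hidden + hidden              # weights + bias
    linear2_params = hidden * out_features + out_features   # weights + bias
    return {
        "hidden": hidden,
        "out_features": out_features,
        "params": linear1_params + linear2_params,
    }

# Assumed dimensions, for illustration only.
VIS_DIM, CTX_DIM = 512, 512

shapes = {m: meta_net_shapes(VIS_DIM, CTX_DIM, m) for m in (1, 2, 4, 8, 16)}
for m, s in shapes.items():
    print(f"M={m:2d}: out_features={s['out_features']}, params={s['params']}")
```

Because `linear2` has `n_ctx * ctx_dim` outputs, changing the prompt length changes the Meta-Net output size with it, so the two cannot be set independently in this formulation.
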
Adong17 commented 2 months ago

Thanks for your reply.