muzairkhattak / multimodal-prompt-learning

[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
https://muzairkhattak.github.io/multimodal-prompt-learning/
MIT License
619 stars 43 forks

Hi, how do you set a different scale range for each dimension in the radar chart? #20

Closed: jingzhengli closed this issue 1 year ago

jingzhengli commented 1 year ago

Hi, nice code and figure! How do you set a different scale range for each dimension in the radar chart?

muzairkhattak commented 1 year ago

Hi @jingzhengli!

Thank you for showing interest in MaPLe.

Regarding your query, please note that:
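For readers with the same question: one common way (not necessarily the exact approach used for the MaPLe figure) to give each radar-chart axis its own range is to normalize every dimension onto a shared [0, 1] grid and then print each axis's real values as custom tick labels. A minimal sketch with made-up ranges:

```python
import numpy as np

def normalize_per_axis(values, ranges):
    """Map each dimension onto [0, 1] using its own (lo, hi) range,
    so axes with different scales can share one radar grid."""
    values = np.asarray(values, dtype=float)
    lo = np.array([r[0] for r in ranges], dtype=float)
    hi = np.array([r[1] for r in ranges], dtype=float)
    return (values - lo) / (hi - lo)

# Hypothetical dimensions: accuracy in [60, 80] %, throughput in [0, 1000] img/s
scores = [70.0, 250.0]
ranges = [(60, 80), (0, 1000)]
print(normalize_per_axis(scores, ranges))  # -> [0.5, 0.25]
```

The polygon is then drawn from the normalized values on a single [0, 1] polar grid, while each spoke gets tick labels computed from its own `(lo, hi)` range (e.g. via `ax.text` at fixed radii in matplotlib), so every dimension appears in its natural units.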

We hope your query is resolved now.

Thank you.

jingzhengli commented 1 year ago

Thanks for your quick and detailed reply; it completely resolves my confusion. I had spent a whole day drawing such a radar chart. Thanks again~

muzairkhattak commented 1 year ago

No problem. Best of luck for your work!

jingzhengli commented 1 year ago

Hi, I compared the time consumption of several methods and found that the forward- plus backward-pass time of MaPLe (0.281 s) is lower than that of CoOp (0.558 s) and CoCoOp. This is very interesting, and I wonder whether something is wrong with my experiment. My guess is that MaPLe only updates top- and middle-level parameters, while CoOp updates bottom-level parameters and therefore consumes more time.

muzairkhattak commented 1 year ago

Hi @jingzhengli,

In comparison with CoCoOp, both MaPLe and CoOp are much faster during training, because of the image-conditioned feedback signal in CoCoOp.

But in a comparison between MaPLe and CoOp, I believe MaPLe should take a bit more time during the backward pass, as it updates prompts in both the vision and language branches of CLIP, while CoOp only does so in the language branch.

So I think it is worth verifying the experiment you are performing. Could you take a look at the following items:

  • Are you keeping the batch size the same in the comparison? Note that the default batch size of MaPLe is 4, while for CoOp it is 32.
  • Are you timing only a single forward and backward pass, or running multiple passes and taking the average? Random delays can sometimes occur, so you can run a loop of many forward/backward passes and take the average time; this reduces the effect of any occasional delays.

I hope this is helpful.

Thank you and kind regards
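To make such timing comparisons robust, a common pattern is to warm up first and then average over many iterations; a generic sketch (`step_fn` is a placeholder for one forward+backward step, not a function from this repo):

```python
import time

def average_step_time(step_fn, warmup=3, iters=20):
    """Average wall-clock time of one training step.
    Warm-up iterations absorb one-off costs (allocator, autotuning);
    averaging over many iterations smooths out random delays."""
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
        # For GPU models, call torch.cuda.synchronize() here so the
        # asynchronous CUDA kernels finish before the clock is read.
    return (time.perf_counter() - start) / iters

# Placeholder "step": in practice this would run forward + loss.backward().
t = average_step_time(lambda: sum(i * i for i in range(10_000)))
print(f"avg step: {t:.6f}s")
```

When timing on GPU, the synchronization point matters: without it, `perf_counter` only measures kernel-launch time, which can make any method look misleadingly fast.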

jingzhengli commented 1 year ago

Sorry for the late reply; I was busy with the experiments. Thanks for your insightful perspective. You are right: MaPLe takes a bit more time than CoOp.

I also have some new findings. Although prompt-based methods (e.g., CoOp, CoCoOp) require only a small number of trainable parameters to be updated, they consume more time than other fine-tuning methods (e.g., full fine-tuning, linear probing). In other words, fewer trainable parameters can reduce training memory, but do not necessarily improve time efficiency. I don't know whether this is the correct view. Thanks for your kind consideration.
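A toy cost model may help ground this view: with layers numbered 1 (input) to N (output), the backward pass has to traverse every layer between the loss and the shallowest trainable parameter, no matter how few parameters are actually updated. This is an assumed simplification for illustration, not a measurement from the repo:

```python
def backward_cost(num_layers, trainable_layers):
    """Toy cost model: the backward pass must traverse every layer at or
    above the shallowest trainable one, regardless of how few parameters
    are actually updated."""
    shallowest = min(trainable_layers)  # layer closest to the input with trainable params
    return num_layers - shallowest + 1  # layers the gradient flows through

# A handful of prompt parameters at layer 1 (the input side) still forces
# gradients through the whole frozen 12-layer encoder ...
print(backward_cost(12, trainable_layers={1}))   # -> 12
# ... while linear probing only touches the head at layer 12.
print(backward_cost(12, trainable_layers={12}))  # -> 1
```

So prompt vectors injected near the input cost almost a full backward pass even though they add very few parameters, whereas linear probing stops the gradient at the head; this is consistent with the memory-versus-time observation above.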