muzairkhattak / ProText

[CVPRW 2024] Official repository of paper titled "Learning to Prompt with Text Only Supervision for Vision-Language Models".
https://muzairkhattak.github.io/ProText/
MIT License
86 stars 4 forks

A question about contextual mapping #6

Closed TsingpekTao closed 4 months ago

TsingpekTao commented 4 months ago

Dear Muhammad,

I've noticed how patiently you answer questions from many people, and your kindness encouraged me to ask my own questions.

  1. Does your modified Dassl library still support MaPLe and PromptSRC? If I modify Dassl, will it affect the training of MaPLe?
  2. In the protext.py file, I did not observe any handling of contextual mapping. Is this part of the processing done within Dassl?

If you could answer my questions, it would greatly help me in understanding the principles and code of the model more deeply. Thank you!

Wishing you a pleasant day!

Best regards,

muzairkhattak commented 4 months ago

Hi @TsingpekTao,

Thank you for showing interest in ProText!

Regarding your questions, kindly refer to the below answers:

  1. Does your modified Dassl library still support MaPLe and PromptSRC? If I modify Dassl, will it affect the training of MaPLe?

Yes, the modified Dassl library still supports the full functionality of MaPLe and PromptSRC. You can directly use this repository to train these image-based variants. We have modified the Dassl library so that it builds text-only dataset dataloaders for ProText, as shown below. https://github.com/muzairkhattak/ProText/blob/c779baa95d63617e0006464a6fb94fa3fa13a217/Dassl.pytorch/dassl/data/data_manager.py#L44

For trainers other than ProText (like MaPLe, PromptSRC, etc.), the Dassl code runs as normal with image-based training.
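
To illustrate the idea (this is a hypothetical sketch, not the actual Dassl/data_manager.py code; the class and config flag names here are assumptions), the data manager can simply branch so ProText receives a text-only dataloader while image-based trainers keep the usual image pipeline:

```python
# Hypothetical sketch of routing between a text-only and an image dataloader.
# Names like TextOnlyDataset and cfg.DATASET.TEXT_ONLY are illustrative only.
from torch.utils.data import DataLoader, Dataset


class TextOnlyDataset(Dataset):
    """Wraps text samples (e.g. prompt / LLM-description pairs); no images are loaded."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


def build_train_loader(cfg, dataset):
    batch_size = cfg.DATALOADER.TRAIN_X.BATCH_SIZE
    if getattr(cfg.DATASET, "TEXT_ONLY", False):
        # ProText-style training: iterate over text data only.
        return DataLoader(TextOnlyDataset(dataset.train_x),
                          batch_size=batch_size, shuffle=True)
    # MaPLe, PromptSRC, etc.: fall back to the standard image dataloader path.
    return DataLoader(dataset.train_x, batch_size=batch_size, shuffle=True)
```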

  2. In the protext.py file, I did not observe any handling of contextual mapping. Is this part of the processing done within Dassl?

Sorry for the confusion; it is implemented in the protext.py file. You can find the contextual mapping loss function at this line: https://github.com/muzairkhattak/ProText/blob/c779baa95d63617e0006464a6fb94fa3fa13a217/trainers/protext.py#L277

Here, the contextual mapping aims to embed/inject the rich LLM-based contextual information into the prompted embeddings with the help of an L2 loss.
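
As a minimal sketch of that objective (assuming `prompted_features` are text embeddings produced with the learnable prompts and `llm_features` are embeddings of the LLM-generated descriptions, both of shape `[B, D]`; this is an illustration of the idea, not the exact code at the linked line):

```python
# Illustrative contextual mapping loss: an L2 (mean-squared-error) term that
# pulls the prompted text embeddings toward the LLM-based contextual embeddings.
import torch
import torch.nn.functional as F


def contextual_mapping_loss(prompted_features: torch.Tensor,
                            llm_features: torch.Tensor) -> torch.Tensor:
    return F.mse_loss(prompted_features, llm_features)
```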

I hope that is clear now.

Feel free to ask if you have any further questions.

Thank you and kind regards!

Sincerely, Muhammad Uzair

TsingpekTao commented 4 months ago

Thank you very much for your response. I recently heard in the news that the weather in Pakistan has been very hot. Please take care of yourself, and I wish you and your family good health. Once again, thank you for your answer!

muzairkhattak commented 4 months ago

Hi @TsingpekTao,

Thank you very much for your kind words! I wish you all the best in your research!