muzairkhattak / multimodal-prompt-learning

[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
https://muzairkhattak.github.io/multimodal-prompt-learning/
MIT License

MaPLe on a custom dataset, acc low #44

Closed · goncayilmaz closed this issue 9 months ago

goncayilmaz commented 10 months ago

Hello, I am trying to train MaPLe for classification on a custom dataset. The code runs fine, and the number of classes and instances appears to be correct. However, at the end of training, the cross-entropy loss is still high (~3.0) and the accuracy is very low (~2.0%). Do you know why that could be?

muzairkhattak commented 10 months ago

Hi @goncayilmaz,

Thank you for showing interest in MaPLe!

Regarding your query, it is likely that the class names in your text embeddings are being mistakenly replaced by the learnable prompts, in which case no class information is left in the text embeddings.

This can happen if the number of learnable prompt tokens is set very high.

Can you check whether you are using the default number of prompt tokens (which is 2)?
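
For reference, this is controlled by `TRAINER.MAPLE.N_CTX` in the trainer config. A minimal excerpt, assuming the standard layout of the configs shipped with this repo (field names follow the CoOp/Dassl convention):

```yaml
# configs/trainers/MaPLe/vit_b16_c2_ep5_batch4_2ctx.yaml (excerpt)
TRAINER:
  MAPLE:
    N_CTX: 2                  # number of learnable context tokens (default: 2)
    CTX_INIT: "a photo of a"  # initialization phrase for the context tokens
    PROMPT_DEPTH: 9           # depth up to which prompts are inserted
```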

Thank you and kind regards!

muzairkhattak commented 9 months ago

I'm closing this issue. Feel free to reopen it in case your issue is still not resolved.

Thank you!

goncayilmaz commented 9 months ago

Hello, thanks for your reply. I checked again and n_ctx is set to 2, as by default. I also verified that the class names are mapped to the right folders, so I have no idea why the accuracy could be so low. Without any adaptation, the accuracy should be around 10% with OpenAI CLIP ViT-L/14.
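
A minimal zero-shot sanity check along these lines could confirm that number before any prompt tuning (a sketch assuming the OpenAI `clip` package and an ImageFolder-style dataset; the dataset path and prompt template are placeholders to adjust):

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)

# Hypothetical path; class names are read from the folder names.
dataset = ImageFolder("path/to/custom_dataset", transform=preprocess)
loader = DataLoader(dataset, batch_size=64, num_workers=4)

# Build the zero-shot text classifier from the class names.
prompts = clip.tokenize([f"a photo of a {c}" for c in dataset.classes]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(prompts)
    text_feat /= text_feat.norm(dim=-1, keepdim=True)

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        img_feat = model.encode_image(images.to(device))
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        # Nearest text embedding by cosine similarity gives the prediction.
        preds = (img_feat @ text_feat.T).argmax(dim=-1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"zero-shot accuracy: {100 * correct / total:.2f}%")
```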

Do you have any other suggestions?

muzairkhattak commented 7 months ago

Hi @goncayilmaz, I have just seen your message.

Could you let us know if you are still facing this issue? Thank you.

JunbongJang commented 7 months ago

You can try training for longer by increasing the maximum number of epochs. I changed MAX_EPOCH to 100 in configs/trainers/MaPLe/vit_b16_c2_ep5_batch4_2ctx.yaml to improve accuracy, as in the excerpt below.
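
For example (an excerpt, assuming the Dassl-style `OPTIM` section used by this repo's configs; the remaining fields stay as shipped):

```yaml
# configs/trainers/MaPLe/vit_b16_c2_ep5_batch4_2ctx.yaml (excerpt)
OPTIM:
  MAX_EPOCH: 100   # was 5 in the released config
```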