salesforce / ALBEF

Code for ALBEF: a new vision-language pre-training method
BSD 3-Clause "New" or "Revised" License
1.57k stars, 199 forks

Could you add a self-attention module to improve performance? #109

Open klodlee opened 2 years ago

klodlee commented 2 years ago

Hello, and thank you very much for your work. Could you add a self-attention module between the multimodal encoder and the contrastive loss to improve performance?