salesforce / ALBEF

Code for ALBEF: a new vision-language pre-training method
BSD 3-Clause "New" or "Revised" License

Could you add a self-attention module to improve performance? #109

Open klodlee opened 1 year ago

klodlee commented 1 year ago

Hello, thank you very much for your efforts. Could you add a self-attention module between the multimodal encoder and the contrastive loss to improve performance?