PhoebusSi / SAR

Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment"

Question about SSL #3

Closed shonnon-zxs closed 3 years ago

shonnon-zxs commented 3 years ago

The original SSL trains UpDn for the first 12 epochs and only then adds the self-supervised task. Why does train.py use the self-supervised task from the beginning?

PhoebusSi commented 3 years ago

In the first stage, if you choose SSL as the CAS, the implementation is identical to the original SSL. In the second stage, where SSL is combined with LXMERT, there is no need to "pre-train" for 12 epochs, because LXMERT is a pre-trained transformer-based model that already has strong VQA ability. The original SSL (based on UpDn, not LXMERT) starts with 12 epochs of pre-training without the self-supervised task because UpDn needs them to reach good VQA ability first; see the SSL paper for details.
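
For readers comparing the two setups, here is a minimal sketch of the schedule described above. It is not the code in train.py: the function names, the constants, and the `ssl_weight` parameter are hypothetical, and epochs are assumed to be counted from 0.

```python
# Hypothetical sketch of the epoch-gated self-supervised loss; names are
# illustrative and do not come from the SAR or SSL repositories.

UPDN_WARMUP_EPOCHS = 12    # original SSL: UpDn trains on the VQA loss alone first
LXMERT_WARMUP_EPOCHS = 0   # LXMERT is already pre-trained, so no warm-up


def use_self_supervised_loss(epoch: int, warmup_epochs: int) -> bool:
    """Return True once the self-supervised task should be added (0-indexed epochs)."""
    return epoch >= warmup_epochs


def total_loss(vqa_loss: float, ssl_loss: float, epoch: int,
               warmup_epochs: int, ssl_weight: float = 1.0) -> float:
    """Combine the VQA loss with the self-supervised loss after the warm-up.

    ssl_weight is an assumed scaling factor, included only for illustration.
    """
    if use_self_supervised_loss(epoch, warmup_epochs):
        return vqa_loss + ssl_weight * ssl_loss
    return vqa_loss
```

With `warmup_epochs=UPDN_WARMUP_EPOCHS`, the self-supervised term is skipped for epochs 0 through 11 and added from epoch 12 onward. With `warmup_epochs=LXMERT_WARMUP_EPOCHS`, it is active from the first epoch, which matches the behavior asked about in the question.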