mojivalipour opened 3 years ago
It is used by default, so you don't have to specify it. The loss is implemented at https://github.com/intersun/LightningDOT/blob/5f2880f69ba87b8701ab89348d70ebb11432578c/dvl/utils.py#L114.
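In case it helps, the idea is a standard in-batch negative contrastive objective: each image's positive is the caption at the same batch index, and every other caption in the batch serves as a negative. Here is a minimal sketch of that pattern; the function and argument names are illustrative assumptions, not the actual identifiers in utils.py:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
    """image_emb, text_emb: (batch, dim) embeddings from the two encoders.

    Illustrative sketch only: the positive for image i is text i; all
    other texts in the batch act as in-batch negatives.
    """
    scores = image_emb @ text_emb.t()                      # (batch, batch) similarity matrix
    targets = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, targets)                # softmax over in-batch candidates
```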
Now I'm quite confused about how this repository's files are structured. I thought pretrain.py was the file used to pre-train the LightningDOT model, yet the _calc_loss function does not appear to be called anywhere in that code; train_itm.py, on the other hand, uses it several times. Can you please provide specific instructions on how to reproduce the LightningDOT paper's results?
Is it the case that pretrain.py is only provided to pre-train the UNITER model? If not, then what is train_itm.py for?
I totally agree, it is indeed confusing, since we didn't have time to clean up the code. As you may have noticed, we left in lots of other things that are not mentioned in the paper, such as hard negatives, knowledge distillation, etc.
To answer your question (for pre-training only; I assume you have already figured out how the loss is used for fine-tuning): if you trace the definition of the pre-training model (https://github.com/intersun/LightningDOT/blob/5f2880f69ba87b8701ab89348d70ebb11432578c/pretrain.py#L313), you will notice that the relevant forward function is defined at https://github.com/intersun/LightningDOT/blob/5f2880f69ba87b8701ab89348d70ebb11432578c/dvl/models/bi_encoder.py#L484.
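Conceptually, that forward pass just encodes each modality independently and returns the two embeddings, which are later scored by dot product. A minimal sketch of the shape of it, assuming hypothetical encoder names (not the actual classes in bi_encoder.py):

```python
import torch
import torch.nn as nn

class BiEncoderSketch(nn.Module):
    """Illustrative stand-in, not the repository's actual BiEncoder class."""

    def __init__(self, img_encoder: nn.Module, txt_encoder: nn.Module):
        super().__init__()
        self.img_encoder = img_encoder   # e.g. a region-feature transformer
        self.txt_encoder = txt_encoder   # e.g. a BERT-style text encoder

    def forward(self, img_inputs: torch.Tensor, txt_inputs: torch.Tensor):
        # The two modalities never attend to each other; this independence
        # is what makes offline indexing and fast retrieval possible.
        img_emb = self.img_encoder(img_inputs)   # (batch, dim)
        txt_emb = self.txt_encoder(txt_inputs)   # (batch, dim)
        return img_emb, txt_emb                  # scored later by dot product
```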
To answer your second question: pretrain.py is solely for pre-training, and train_itm.py is solely for fine-tuning. I currently don't have time to merge them, and that is definitely confusing...
I see, thank you. So essentially your ITM implementation is different from the original ITM implementation in UNITER, and yours is based on the CMR loss in the paper. In fact, itm_loss1 is the image-retrieval loss and itm_loss2 is the text-retrieval loss from the article. Just one more question: what is that ot_loss?
Correct... since I implemented fine-tuning first, and later found out that converting it to pre-training was not trivial, I ended up implementing pre-training and fine-tuning separately...
OT loss refers to the optimal-transport loss proposed in http://proceedings.mlr.press/v119/chen20e.html. I never tried it, though, so I'm not sure how well it would work.
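Roughly, the idea is to compute a transport cost between word embeddings and region embeddings, so that well-aligned pairs are cheap to transport. A very loose sketch of that idea, with illustrative names; the paper uses an IPOT-style solver if I recall correctly, while this sketch uses plain Sinkhorn iterations for brevity:

```python
import torch
import torch.nn.functional as F

def ot_loss_sketch(txt_emb: torch.Tensor, img_emb: torch.Tensor,
                   eps: float = 0.1, n_iters: int = 50) -> torch.Tensor:
    """txt_emb: (n, dim) word embeddings; img_emb: (m, dim) region embeddings.

    Illustrative sketch only, not the repository's actual ot_loss.
    """
    txt = F.normalize(txt_emb, dim=-1)
    img = F.normalize(img_emb, dim=-1)
    cost = 1.0 - txt @ img.t()                           # (n, m) cosine distance
    n, m = cost.shape
    mu = torch.full((n,), 1.0 / n, device=cost.device)   # uniform word weights
    nu = torch.full((m,), 1.0 / m, device=cost.device)   # uniform region weights
    K = torch.exp(-cost / eps)                           # Gibbs kernel
    u = torch.ones_like(mu)
    for _ in range(n_iters):                             # Sinkhorn fixed-point updates
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    T = u.unsqueeze(1) * K * v.unsqueeze(0)              # transport plan
    return (T * cost).sum()                              # OT distance as the loss
```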
Can you point me to the place in your code where CMR is implemented? According to the paper, you used CMR + VMLM + SMRM for pre-training. However, CMR is not listed among your supported tasks. Am I missing something?