microsoft / FIBER

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
MIT License

Cider optimization #6

Closed mactavish91 closed 1 year ago

mactavish91 commented 1 year ago

Hello, when I run the function compute_caption_cider, the RL loss stays in the 20-80 range and the CIDEr score does not increase during training. Do you know the reason?

zdou0830 commented 1 year ago

Hi, the RL loss may not be a good indicator of model performance. If you keep the hyperparameters unchanged and finetune an MLE-trained model for at least 1 epoch, you should see noticeable improvements in CIDEr.

mactavish91 commented 1 year ago

> Hi, the RL loss may not be a good indicator of model performance. If you keep the hyperparameters unchanged and finetune an MLE-trained model for at least 1 epoch, you should see noticeable improvements in CIDEr.

```python
rl_loss = rl_probs.view(-1) * (
    100.0 - 100.0 * torch.tensor(batch_cider_scores, device=rl_probs.device).view(-1)
)
```

@zdou0830 Thank you very much. Could you tell me why the scaled CIDEr scores are subtracted from 100.0? When I debugged, I found that the CIDEr scores ranged from 0.6 to 1.9.
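For intuition, the loss line above can be mirrored in plain Python. The scores and weights below are made up for illustration (the real values come from the batch and from `rl_probs`); it just shows how CIDEr scores in the observed 0.6-1.9 range map to per-sample loss terms:

```python
# Made-up CIDEr scores in the observed 0.6-1.9 range (illustrative only).
batch_cider_scores = [0.6, 1.1, 1.9]
# Made-up stand-ins for the rl_probs weighting factors.
rl_probs = [0.5, 0.3, 0.2]

# Reward-style term 100 - 100 * CIDEr: a higher CIDEr score yields a
# smaller (possibly negative) loss term, so minimizing the loss pushes
# the model toward captions with higher CIDEr.
rl_loss = [p * (100.0 - 100.0 * c) for p, c in zip(rl_probs, batch_cider_scores)]
```

With scores between 0.6 and 1.9, the `100.0 - 100.0 * cider` factor lands between roughly -90 and +40, which is consistent with a summed loss hovering in the tens.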

zdou0830 commented 1 year ago

You can subtract any reasonable baseline value; it shouldn't have much effect on the final performance.
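This is a standard property of REINFORCE-style objectives: subtracting a constant baseline b changes the loss values but not the expected policy gradient, because the expectation of the score function is zero. A minimal numeric check with a hypothetical 3-action softmax policy (not FIBER code) makes this concrete:

```python
import math

# Hypothetical softmax policy over 3 "captions" with CIDEr-like rewards.
theta = [0.2, -0.5, 1.0]
rewards = [0.6, 1.2, 1.9]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def expected_grad(baseline):
    """E_a[ grad_theta log pi(a) * (R(a) - baseline) ] for a softmax policy."""
    pi = softmax(theta)
    grad = [0.0] * len(theta)
    for a, pa in enumerate(pi):
        for i in range(len(theta)):
            # d/d theta_i of log pi(a) = 1[a == i] - pi(i) for a softmax
            g = (1.0 if a == i else 0.0) - pi[i]
            grad[i] += pa * g * (rewards[a] - baseline)
    return grad

# The expected gradient is identical for baseline 0 and baseline 100,
# since sum_a pi(a) * grad log pi(a) = 0.
g0 = expected_grad(0.0)
g100 = expected_grad(100.0)
assert all(abs(x - y) < 1e-9 for x, y in zip(g0, g100))
```

A good baseline (e.g. the greedy-decoding reward in SCST) mainly reduces gradient variance; any constant offset like the 100.0 here leaves the expected update unchanged.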

Markin-Wang commented 1 year ago

Hi, thanks for your work and code. May I ask how to get or generate the 'coco-train-words.p' file used in CIDEr optimization?

zdou0830 commented 1 year ago

It's here; I got it from the VinVL codebase: https://drive.google.com/file/d/1N_O7tkBjJRCueQj7MHMqrazDTWr4OCDE/view?usp=sharing

Markin-Wang commented 1 year ago

> It's here; I got it from the VinVL codebase: https://drive.google.com/file/d/1N_O7tkBjJRCueQj7MHMqrazDTWr4OCDE/view?usp=sharing

Thanks for your quick reply, it is very helpful.