microsoft / Oscar

Oscar and VinVL
MIT License

what is checkpoint-29-66420 #191

Open Vincent-Ww opened 2 years ago

Vincent-Ww commented 2 years ago

According to the model zoo page (https://github.com/microsoft/Oscar/blob/master/MODEL_ZOO.md), there is only one checkpoint for Oscar on the image captioning task: checkpoint-29-66420. Was this checkpoint trained with cross-entropy only, or was it fine-tuned with cross-entropy followed by CIDEr optimization?

When I ran the inference script with checkpoint-29-66420 on MSCOCO, I got the following results: {'Bleu_1': 0.7559338836672688, 'Bleu_2': 0.6008669959375728, 'Bleu_3': 0.46893216244997465, 'Bleu_4': 0.3658244160093896, 'METEOR': 0.3040389540024913, 'ROUGE_L': 0.5856375658366109, 'CIDEr': 1.2412284516798686, 'SPICE': 0.231772161443854}
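For anyone comparing these numbers with a paper table: captioning papers usually report scores multiplied by 100 and rounded to one decimal. A minimal sketch of that conversion, using the scores quoted above (the helper name `to_percent` is made up for illustration and is not part of the Oscar code):

```python
# Scores copied from the evaluation output quoted above.
results = {
    "Bleu_1": 0.7559338836672688,
    "Bleu_2": 0.6008669959375728,
    "Bleu_3": 0.46893216244997465,
    "Bleu_4": 0.3658244160093896,
    "METEOR": 0.3040389540024913,
    "ROUGE_L": 0.5856375658366109,
    "CIDEr": 1.2412284516798686,
    "SPICE": 0.231772161443854,
}


def to_percent(scores, ndigits=1):
    """Scale 0-1 scores by 100 and round (hypothetical helper)."""
    return {name: round(value * 100, ndigits) for name, value in scores.items()}


# Print one metric per line for side-by-side comparison with a table.
for name, value in to_percent(results).items():
    print(f"{name:8s} {value:6.1f}")
```

The rounded values can then be checked against the cross-entropy and CIDEr-optimized rows of whichever table you are comparing with.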

So I guess checkpoint-29-66420 was trained with cross-entropy only, without CIDEr optimization. Is that correct?

ignasa007 commented 2 years ago

I think it is CE only. I am replicating the results and get the same checkpoint after CE optimisation. Further, the authors specify 30 epochs for CE optimisation, so that should be it.
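The 30-epoch argument can be checked against the checkpoint name itself. The sketch below assumes the name follows the pattern `checkpoint-<epoch>-<global_step>` with a zero-based epoch index; that reading is a guess from the name and is not confirmed in this thread. The function `parse_checkpoint_name` is a hypothetical helper, not part of the Oscar code.

```python
def parse_checkpoint_name(name):
    """Split 'checkpoint-<epoch>-<step>' into two integers.

    Assumption: the first number is a zero-based epoch index and the
    second is the global training step. This is inferred from the name.
    """
    prefix, epoch, step = name.rsplit("-", 2)
    if prefix != "checkpoint":
        raise ValueError(f"unexpected checkpoint name: {name!r}")
    return int(epoch), int(step)


epoch_index, global_step = parse_checkpoint_name("checkpoint-29-66420")

# A zero-based index of 29 would mean 30 completed epochs.
epochs_completed = epoch_index + 1
steps_per_epoch = global_step / epochs_completed

print(f"epochs completed: {epochs_completed}")
print(f"steps per epoch:  {steps_per_epoch}")
```

Under that assumption, 66420 steps divide evenly into 30 epochs of 2214 steps each, which is consistent with a checkpoint saved at the end of the 30th CE epoch.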

Nidadadadada commented 2 years ago

@Vincent-Ww @ignasa007 Hi, I ran the inference script with checkpoint-29-66420 and the results look weird. Did you run into the same problem? I downloaded checkpoint-29-66420 using the same link. Thanks!

[Two screenshots attached: WeChat image 20221113201654, WeChat image 20221113201659]

This is the command I ran:

python oscar/run_captioning_official.py --do_test --do_eval --test_yaml test.yaml --per_gpu_eval_batch_size 64 --num_beam 5 --max_gen_length 20 --eval_model_dir /home/skn/Oscar-master/output_ceshi/checkpoint-29-66420

Looking forward to your reply!