Open Vincent-Ww opened 2 years ago
I think it is CE only. I am replicating the results and get the same checkpoint after CE optimisation. Also, the authors specify 30 epochs for CE optimisation, so that should be it.
@Vincent-Ww @ignasa007 Hi, I ran the inference script with checkpoint-29-66420 and the results look weird. Did you run into the same problem? I downloaded checkpoint-29-66420 from the same link. Thanks! This is the command I ran:
python oscar/run_captioning_official.py --do_test --do_eval --test_yaml test.yaml --per_gpu_eval_batch_size 64 --num_beam 5 --max_gen_length 20 --eval_model_dir /home/skn/Oscar-master/output_ceshi/checkpoint-29-66420
Looking forward to your reply!
According to https://github.com/microsoft/Oscar/blob/master/MODEL_ZOO.md, there is only one checkpoint for Oscar on the image captioning task: checkpoint-29-66420. Was this checkpoint trained with cross-entropy only, or was it fine-tuned with cross-entropy and then CIDEr optimization?
When I run the inference script with checkpoint-29-66420 on MSCOCO, I get the following results:
{'Bleu_1': 0.7559338836672688, 'Bleu_2': 0.6008669959375728, 'Bleu_3': 0.46893216244997465, 'Bleu_4': 0.3658244160093896, 'METEOR': 0.3040389540024913, 'ROUGE_L': 0.5856375658366109, 'CIDEr': 1.2412284516798686, 'SPICE': 0.231772161443854}
So I guess the checkpoint-29-66420 model was trained with cross-entropy only, without CIDEr optimization. Is that correct?
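For anyone who wants to compare these numbers against a published results table, here is a minimal sketch that rescales the scores above to percentages. It assumes the table you compare against reports metrics on a 0-100 scale; `to_percent` is a hypothetical helper written for this post, not part of the Oscar codebase.

```python
# Scores copied verbatim from the evaluation output quoted above.
scores = {
    'Bleu_1': 0.7559338836672688,
    'Bleu_2': 0.6008669959375728,
    'Bleu_3': 0.46893216244997465,
    'Bleu_4': 0.3658244160093896,
    'METEOR': 0.3040389540024913,
    'ROUGE_L': 0.5856375658366109,
    'CIDEr': 1.2412284516798686,
    'SPICE': 0.231772161443854,
}


def to_percent(raw_scores, digits=1):
    """Hypothetical helper: scale raw metric values by 100 and round them."""
    return {name: round(value * 100, digits) for name, value in raw_scores.items()}


if __name__ == "__main__":
    # Print one metric per line so it can be read next to a results table.
    for name, value in to_percent(scores).items():
        print(f"{name}: {value}")
```

The rescaled values can then be checked by eye against the cross-entropy and CIDEr-optimized rows of whichever table you are using.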