Hi, I used your pretrained model on the Flickr30k dataset. However, the performance is much worse than the performance reported in the paper.
Bleu_1: 0.206
Bleu_2: 0.122
Bleu_3: 0.077
Bleu_4: 0.051
computing METEOR score...
METEOR: 0.108
computing Rouge score...
ROUGE_L: 0.278
computing CIDEr score...
CIDEr: 0.377
computing SPICE score...
Parsing reference captions
Parsing test captions
SPICE evaluation took: 3.287 s
SPICE: 0.161
Could you please check the pretrained model?
By the way, after creating the h5 and json files for the Flickr30k dataset, the parameter count did not match the pretrained model, so I had to set word_count_threshold to 4.
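For reference, here is a minimal sketch of how a word-count threshold typically determines vocabulary size, and therefore the embedding and output layer shapes that must match the checkpoint. This `build_vocab` is my own illustration of the general technique, not the repo's actual preprocessing code:

```python
from collections import Counter

def build_vocab(captions, word_count_threshold):
    # Count word occurrences across all captions.
    counts = Counter(w for cap in captions for w in cap.lower().split())
    # Keep only words seen at least `word_count_threshold` times;
    # rarer words would be mapped to a single UNK token.
    vocab = [w for w, c in counts.items() if c >= word_count_threshold]
    return ['UNK'] + sorted(vocab)

captions = [
    "a dog runs on the grass",
    "a dog jumps over a fence",
    "the cat sleeps",
]
# A lower threshold keeps more words, so the embedding and output
# layers grow and the model's parameter count changes accordingly.
print(len(build_vocab(captions, 1)))  # every word kept, plus UNK
print(len(build_vocab(captions, 2)))  # only words seen twice or more, plus UNK
```

This is why a mismatched threshold during preprocessing can make the rebuilt model's parameter count disagree with the released checkpoint.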