jiasenlu / AdaptiveAttention

Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
https://arxiv.org/abs/1612.01887

Performance on Flickr30k Dataset #25

Open atg93 opened 4 years ago

atg93 commented 4 years ago

Hi, I used your pretrained model on the Flickr30k dataset, but the performance is much worse than the numbers reported in the paper:

Bleu_1: 0.206
Bleu_2: 0.122
Bleu_3: 0.077
Bleu_4: 0.051
METEOR: 0.108
ROUGE_L: 0.278
CIDEr: 0.377
SPICE: 0.161

Could you please check the pretrained model? Also, after creating the h5 and json files for Flickr30k, the vocabulary size did not match the parameters of the pretrained model, so I had to set word_count_threshold to 4.
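On the word_count_threshold point: in neuraltalk2-style preprocessing, the threshold determines the vocabulary size, and that in turn fixes the shapes of the word-embedding and output layers, which must match the checkpoint exactly. A minimal sketch of that dependency, under the assumption of the usual rare-word-to-UNK scheme (the function name and the sample captions are hypothetical, not the repo's actual code):

```python
from collections import Counter

def build_vocab(captions, word_count_threshold):
    # Keep words that occur at least `word_count_threshold` times;
    # rarer words collapse into a single UNK token.
    counts = Counter(w for cap in captions for w in cap.lower().split())
    vocab = [w for w, n in counts.items() if n >= word_count_threshold]
    return vocab + ['UNK']

# Hypothetical usage: each threshold yields a different vocab size,
# and the pretrained checkpoint's embedding/softmax layers are sized
# for exactly one of them.
captions = ["a dog runs in the park", "a dog jumps over a log"]
for threshold in (1, 2, 4, 5):
    print(threshold, len(build_vocab(captions, threshold)))
```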
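The metrics above look like output from the standard coco-caption evaluation. A minimal sketch of reproducing them with the pycocoevalcap package (the sample data here is made up; gts and res map image ids to lists of tokenized caption strings, and METEOR additionally requires a local Java runtime):

```python
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

# Ground-truth references and generated captions, keyed by image id.
gts = {0: ["a dog runs in the park", "a dog is running outside"]}
res = {0: ["a dog runs through the grass"]}

scorers = [(Bleu(4), ["Bleu_1", "Bleu_2", "Bleu_3", "Bleu_4"]),
           (Meteor(), "METEOR"),
           (Rouge(), "ROUGE_L"),
           (Cider(), "CIDEr")]

for scorer, name in scorers:
    score, _ = scorer.compute_score(gts, res)
    print(name, score)
```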

Akashtyagi commented 4 years ago

Were you able to improve the performance any further? The results you mentioned look too low to be worth digging into.