DeepRNN / image_captioning

TensorFlow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
MIT License

Questions about training details #37

Open Honlan opened 6 years ago

Honlan commented 6 years ago

Hi, I am trying to reproduce your work.

May I ask what total_loss and accuracy you get after training? I trained the model for 60 epochs on 1/10 of the training data and got a total_loss of about 1.6 and an accuracy of about 65%. However, when generating captions for the test images, the model just repeats the same word over and over, which is quite strange.

Any ideas? Thanks very much.
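
To make the "repeats the same word" symptom measurable, here is a minimal sketch in plain Python that flags captions dominated by a single token. The function names (`repetition_stats`, `is_degenerate`), the 0.8 threshold, and the sample captions are all hypothetical and are not part of this repository; the sketch only assumes the generated captions are available as whitespace-separated strings.

```python
from collections import Counter


def repetition_stats(caption):
    """Return (most common token, fraction of the caption it occupies).

    Hypothetical helper: assumes the caption is a plain string of
    whitespace-separated words.
    """
    tokens = caption.lower().split()
    if not tokens:
        return None, 0.0
    word, count = Counter(tokens).most_common(1)[0]
    return word, count / len(tokens)


def is_degenerate(caption, threshold=0.8):
    """Flag a caption when one token makes up at least `threshold` of it.

    The 0.8 default is an arbitrary illustrative choice.
    """
    _, fraction = repetition_stats(caption)
    return fraction >= threshold


# Made-up captions for illustration, not real model output.
captions = [
    "a a a a a a a a",
    "a man riding a wave on a surfboard",
]
flags = [is_degenerate(c) for c in captions]
```

Running this over the captions generated at a few checkpoints would show whether the repetition is present from the start or only appears later in training.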