232525 / PureT

Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]

About the baseline model #12

Closed ZihuaEvan closed 1 year ago

ZihuaEvan commented 1 year ago

Dear author: Could you provide the training configs for the baseline model? In the paper, the Transformer without pre-fusion reaches 136.4 CIDEr, and the Transformer with pre-fusion reaches 137.5. Thanks!
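For reference while the original files are unavailable, here is a purely illustrative sketch of the kind of settings such a baseline config would need to specify. None of these values are taken from the paper or the repo; they are standard ballpark choices for a vanilla Transformer captioning baseline and are only meant to show the shape of the config being asked for.

```python
# Hypothetical baseline config sketch -- NOT the authors' actual settings.
from dataclasses import dataclass

@dataclass
class BaselineConfig:
    num_layers: int = 3        # encoder/decoder depth (assumed)
    d_model: int = 512         # hidden size (assumed)
    num_heads: int = 8         # attention heads (assumed)
    dropout: float = 0.1
    xe_epochs: int = 20        # stage 1: cross-entropy training (assumed)
    scst_epochs: int = 25      # stage 2: CIDEr-optimized SCST (assumed)
    xe_lr: float = 1e-4        # typical XE learning rate (assumed)
    scst_lr: float = 1e-5      # typical SCST learning rate (assumed)
    batch_size: int = 10
    beam_size: int = 3         # beam width at evaluation (assumed)
```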

232525 commented 1 year ago

I am not sure I can still find the relevant files. The GPU machine used for these experiments failed some time ago, and I cannot check its backup at the moment because I have not been in the lab recently.

ZihuaEvan commented 1 year ago

Thank you all the same.

LeXueTao commented 7 months ago

> Dear author: Could you provide the training configs for the baseline model? In the paper, the Transformer without pre-fusion reaches 136.4 CIDEr, and the Transformer with pre-fusion reaches 137.5. Thanks!

Hello, have you successfully replicated the results of the Transformer model? I trained it and got 0.49 with XE and 134 with SCST. It's so mysterious.
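For anyone comparing XE and SCST numbers, here is a minimal, generic sketch of the two training stages involved (cross-entropy warm-up, then self-critical sequence training). This is not PureT's actual training code: the `model` interface (`decode`, `sample`) and `reward_fn` (e.g. a per-sample CIDEr scorer returning a tensor of scores) are assumed placeholders for illustration only.

```python
# Generic two-stage captioning training sketch -- placeholder interfaces,
# not PureT's implementation.
import torch
import torch.nn.functional as F

def xe_step(model, images, captions, optimizer):
    """Stage 1: standard cross-entropy training with teacher forcing."""
    logits = model(images, captions[:, :-1])          # predict next token (assumed signature)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        captions[:, 1:].reshape(-1),
        ignore_index=0,                               # assume token id 0 is <pad>
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def scst_step(model, images, refs, reward_fn, optimizer):
    """Stage 2: self-critical sequence training (SCST).

    The greedy decode acts as the baseline: the policy gradient uses
    reward(sampled) - reward(greedy), so only samples that beat the
    greedy caption are reinforced.
    """
    model.eval()
    with torch.no_grad():
        greedy_caps = model.decode(images, greedy=True)   # baseline captions (assumed API)
    model.train()
    sampled_caps, log_probs = model.sample(images)        # stochastic samples + per-token log-probs
    reward = reward_fn(sampled_caps, refs) - reward_fn(greedy_caps, refs)
    loss = -(reward.detach() * log_probs.sum(dim=1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that XE-stage and SCST-stage numbers are not directly comparable: XE optimizes token-level likelihood, while SCST directly optimizes the CIDEr reward, which is why the jump between the two stages can look large.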