232525 / PureT

Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]

Question about end-to-end training #11

Open HHHH17 opened 1 year ago

HHHH17 commented 1 year ago

Dear Author, I find that the backbone parameters are frozen in the released code. I want to train the model in an end-to-end manner. So, after 20 epochs of frozen-backbone XE training, I unfreeze the backbone, but then the loss starts to rise and all of the evaluation metrics begin to decline. I later tried end-to-end training from scratch, but the metrics barely changed after the first epoch (CIDEr 103 after 20 epochs of end-to-end XE training, compared to 122 after 20 epochs with the backbone frozen). Do you have any idea what causes this? And does your end-to-end training strategy differ from the released code?
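For reference, the unfreezing step I did looks roughly like this (a minimal self-contained sketch; the module names are placeholders, not the actual attributes in this repo):

```python
import torch.nn as nn

# Tiny stand-in for the captioning model so the sketch runs on its own;
# `backbone` corresponds to the Swin Transformer in this repo (name assumed).
class CaptionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(768, 768)   # placeholder for the Swin backbone
        self.decoder = nn.Linear(768, 512)    # placeholder for the caption head

model = CaptionModel()

# Frozen XE stage: backbone weights are not updated.
for p in model.backbone.parameters():
    p.requires_grad = False

# ... after 20 XE epochs, unfreeze the backbone for end-to-end training.
for p in model.backbone.parameters():
    p.requires_grad = True
# Note: if the optimizer was built with only the trainable parameters
# (e.g. filter(lambda p: p.requires_grad, ...)), it has to be rebuilt here
# so that the backbone parameters are actually optimized.
```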

232525 commented 1 year ago

Sorry, I am not sure why. Loading a pre-trained backbone aims to extract high-quality visual features; if you want to fine-tune the backbone weights, setting a smaller learning rate for the backbone or fine-tuning only the last few layers may be more robust choices.
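For example, something along these lines (a minimal self-contained sketch of both options; the modules here are stand-ins rather than the actual modules in this repo):

```python
import torch
import torch.nn as nn

# Stand-ins for the visual backbone and the caption decoder (assumed names).
backbone = nn.Sequential(nn.Linear(3, 96), nn.GELU(), nn.Linear(96, 768))
decoder = nn.Linear(768, 512)

# Option 1: give the backbone a much smaller learning rate than the decoder.
optimizer = torch.optim.Adam([
    {'params': decoder.parameters(), 'lr': 1e-4},
    {'params': backbone.parameters(), 'lr': 1e-6},  # ~100x smaller for the backbone
])

# Option 2 (alternative): freeze everything in the backbone except its last block.
for p in backbone.parameters():
    p.requires_grad = False
for p in backbone[-1].parameters():  # e.g. only the last stage stays trainable
    p.requires_grad = True
```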

HHHH17 commented 1 year ago

> Sorry, I am not sure why. Loading a pre-trained backbone aims to extract high-quality visual features; if you want to fine-tune the backbone weights, setting a smaller learning rate for the backbone or fine-tuning only the last few layers may be more robust choices.

Dear Author, in the paper you mention that the model can be trained end-to-end, but there is no experiment on the influence of end-to-end training. I would like to know how the reported result (CIDEr 138.2 for a single model) was obtained: with end-to-end training, or with the Swin Transformer backbone frozen? If you trained end-to-end, would you mind telling me the specific experimental setup?

232525 commented 1 year ago

The latter: we freeze the backbone weights in all experiments.

HHHH17 commented 1 year ago

Have you tried training the entire model end-to-end? If so, did it improve the results?