Anjaney1999 / image-captioning-seqgan

An image captioning model inspired by the Show, Attend and Tell paper (https://arxiv.org/abs/1502.03044) and the Sequence Generative Adversarial Network (SeqGAN) paper (https://arxiv.org/abs/1609.05473)

where is adversarial loss for the generator? #3

Open srikanthmalla opened 3 years ago

srikanthmalla commented 3 years ago

Hi @Anjaney1999, I was looking at your code and trying to find the adversarial loss in the generator training scheme: https://github.com/Anjaney1999/image-captioning-seqgan/blob/10e60ad272070dd90f2900483325fccf60e7de3a/train_pg.py#L285

Can you let me know if it is used in your code? If not, it is needed for a GAN, right? Please let me know.

Thank you, Srikanth

Anjaney1999 commented 3 years ago

Hey Srikanth, this GAN outputs discrete tokens, unlike regular GANs, so a standard adversarial loss cannot be backpropagated through the sampling step. Instead, the generator is trained with policy gradients, where the discriminator's score on the sampled caption serves as the reward signal: https://github.com/Anjaney1999/image-captioning-seqgan/blob/10e60ad272070dd90f2900483325fccf60e7de3a/train_pg.py#L294-L296
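To give a rough picture of what that means: the sampled caption is treated as a sequence of actions, and the generator's update pushes up the log-probability of sampled tokens in proportion to the reward the discriminator assigns them (REINFORCE). A minimal sketch of that loss is below; the function and variable names here are illustrative, not the repo's actual API:

```python
import torch

def policy_gradient_loss(log_probs, rewards):
    """REINFORCE-style loss for a discrete-token generator.

    log_probs: (batch, seq_len) log-probabilities of the sampled tokens
    rewards:   (batch, seq_len) per-token rewards from the discriminator
               (e.g. estimated via Monte Carlo rollouts, as in SeqGAN)
    """
    # Negative sign: maximizing expected reward = minimizing this loss.
    return -(log_probs * rewards).sum(dim=1).mean()

# Hypothetical training step (names are illustrative):
# sampled_ids, log_probs = generator.sample(image_features)
# rewards = discriminator(image_features, sampled_ids)
# loss = policy_gradient_loss(log_probs, rewards.detach())
# loss.backward()
```

Note that the reward is detached: gradients flow only through the generator's log-probabilities, never through the discriminator, which is exactly why this works even though the tokens themselves are non-differentiable samples.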

For more information, you can refer to https://arxiv.org/pdf/1609.05473.pdf

Also, if you have any other questions, I will try my best to explain :)

I keep procrastinating on writing a proper README for this repo, but I aim to do that soon. Overall, the model takes ages to train and the performance improvements are not huge; however, it was an excellent learning opportunity for me.