Open mohamedr002 opened 3 years ago
Hi,
I guess the label for `real_logit_disc` is 1, as in

```python
loss_disc = 0.5 * (
    sigmoid_xent(real_logit_disc, torch.ones_like(real_logit_disc, device='cuda')) +
    sigmoid_xent(fake_logit_disc, torch.zeros_like(fake_logit_disc, device='cuda'))
)
```

and the label for `real_logit` is 0, analogously.
So the code is doing min-max optimization in an alternating fashion.
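To make the flipped-labels point concrete, here is a minimal CPU sketch of the two alternating losses. The `sigmoid_xent` helper and the tensor shapes are my assumptions (I drop `device='cuda'` so it runs anywhere); this is only an illustration, not the repo's exact code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def sigmoid_xent(logits, targets):
    # binary cross-entropy on raw logits, as in the snippet above (assumed helper)
    return F.binary_cross_entropy_with_logits(logits, targets)

# hypothetical discriminator outputs on source ("real") and target ("fake") features
real_logit_disc = torch.randn(8, 1)
fake_logit_disc = torch.randn(8, 1)

# discriminator step: push real -> 1, fake -> 0
loss_disc = 0.5 * (
    sigmoid_xent(real_logit_disc, torch.ones_like(real_logit_disc)) +
    sigmoid_xent(fake_logit_disc, torch.zeros_like(fake_logit_disc))
)

# feature-extractor step: labels flipped (real -> 0, fake -> 1),
# so the features are pushed to confuse the discriminator
loss_feat = 0.5 * (
    sigmoid_xent(real_logit_disc, torch.zeros_like(real_logit_disc)) +
    sigmoid_xent(fake_logit_disc, torch.ones_like(fake_logit_disc))
)
```

Minimizing `loss_disc` over the discriminator and `loss_feat` over the feature extractor, in turns, is the alternating form of the min-max objective.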
Thank you for this well-organized code. But I believe there is one issue with how the discriminator loss is updated. In particular, in lines 63–64 you compute `real_logit_disc` and `fake_logit_disc`, with which the discriminator predicts the domain label (zero or one):

https://github.com/ozanciga/dirt-t/blob/f2c5f6984447d63e560b4fb5a04ccf80535c9fb1/vada_train.py#L63-L64

But in lines 83–84 you repeat the same forward pass without having stepped the discriminator optimizer first:

https://github.com/ozanciga/dirt-t/blob/f2c5f6984447d63e560b4fb5a04ccf80535c9fb1/vada_train.py#L83-L84

Hence `real_logit_disc` and `real_logit` will be exactly the same, and I don't think that is what we aim for. The aim is to first update the discriminator with the real logits, and then update the classifier model to confuse the *updated* discriminator using the fake logits.
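To show the order of updates I mean, here is a toy sketch with stand-in networks (`feat`, `disc`, the optimizers, and all shapes are hypothetical, not the repo's actual modules): the discriminator is stepped first, and only then are the logits recomputed for the feature-extractor update, so the two sets of logits differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# toy stand-ins for the feature extractor and the domain discriminator
feat = nn.Linear(4, 4)
disc = nn.Linear(4, 1)
opt_feat = torch.optim.SGD(feat.parameters(), lr=0.1)
opt_disc = torch.optim.SGD(disc.parameters(), lr=0.1)

src, tgt = torch.randn(8, 4), torch.randn(8, 4)  # source / target batches
xent = F.binary_cross_entropy_with_logits

# step 1: update the discriminator (real -> 1, fake -> 0);
# features are detached so only the discriminator's weights move
real_logit_disc = disc(feat(src).detach())
fake_logit_disc = disc(feat(tgt).detach())
loss_disc = 0.5 * (xent(real_logit_disc, torch.ones_like(real_logit_disc)) +
                   xent(fake_logit_disc, torch.zeros_like(fake_logit_disc)))
opt_disc.zero_grad()
loss_disc.backward()
opt_disc.step()

# step 2: recompute the logits with the *updated* discriminator,
# then update the feature extractor with flipped labels
real_logit = disc(feat(src))
fake_logit = disc(feat(tgt))
loss_feat = 0.5 * (xent(real_logit, torch.zeros_like(real_logit)) +
                   xent(fake_logit, torch.ones_like(fake_logit)))
opt_feat.zero_grad()
loss_feat.backward()
opt_feat.step()
```

After step 1 the discriminator's weights have changed, so `real_logit` no longer equals `real_logit_disc`; computing both before any optimizer step would instead make them identical.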