NVlabs / DG-Net

:couple: Joint Discriminative and Generative Learning for Person Re-identification. CVPR'19 (Oral) :couple:
https://www.zdzheng.xyz/publication/Joint-di2019

Reconstruction id embedding loss after detach #49

Closed juliendenize closed 3 years ago

juliendenize commented 4 years ago

Hello, in your ft_netAB ID encoder defined in reIDmodel.py, you detach the ID embedding:

f = self.model.partpool(x)
f = f.view(f.size(0), f.size(1)*self.part)
f = f.detach() # no gradient 

In the gen_update loss function in trainer.py, you compute the ID embedding reconstruction loss (as specified in your paper), but you use the detached embeddings:

self.loss_gen_recon_f_a = self.recon_criterion(f_a_recon, f_a) if hyperparameters['recon_f_w'] > 0 else 0
self.loss_gen_recon_f_b = self.recon_criterion(f_b_recon, f_b) if hyperparameters['recon_f_w'] > 0 else 0 

Because the embeddings are detached, these losses do not constrain the model. Did I miss something?
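To make the concern concrete, here is a minimal standalone sketch (not the DG-Net code itself; the encoder and tensors are stand-ins) showing that an L1 loss between two detached tensors carries no gradient, so it cannot constrain any parameters:

import torch
import torch.nn as nn

encoder = nn.Linear(8, 4)           # stand-in for the ID encoder
x = torch.randn(2, 8)

f_a = encoder(x).detach()           # detached, as in ft_netAB
f_a_recon = encoder(x).detach()     # stand-in for the re-encoded embedding, also detached

recon_criterion = nn.L1Loss()
loss = recon_criterion(f_a_recon, f_a)
print(loss.requires_grad)           # False: no gradient can flow from this loss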

shuxjweb commented 3 years ago

In the file configs/latest.yaml, recon_s_w and recon_f_w are set to 0, which is strange. Are the two reconstruction losses not used during training?

juliendenize commented 3 years ago

@shuxjweb If you check the trainer, you can see that after several thousand iterations the weights of these losses are increased at each iteration up to a certain value (see the sketch below).
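A rough sketch of such a warm-up schedule; the key names (warm_iter, warm_scale, max_w) are illustrative assumptions, not necessarily the ones used in the DG-Net config:

def warm_up(config, iterations):
    # Gradually ramp the reconstruction-loss weights up to a cap after a warm-up period.
    if iterations > config['warm_iter']:
        config['recon_f_w'] = min(config['recon_f_w'] + config['warm_scale'], config['max_w'])
        config['recon_s_w'] = min(config['recon_s_w'] + config['warm_scale'], config['max_w'])
    return config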

layumi commented 3 years ago

Hi @juliendenize , Sorry for the late response.

Yes. We detached the ID embedding to prevent the gradient from flowing back to E_appearance, which may compromise the re-ID performance. The gradient of loss_gen_recon_f_a mainly works for optimising the decoder.

layumi commented 3 years ago

Hi @shuxjweb

Sorry for the late reply. Actually, we use these losses in a warm-up manner, as @juliendenize mentioned.

juliendenize commented 3 years ago

@layumi Thank you for your answer. Indeed, I tried removing the detach call and the GAN collapsed, so I see why you did that, and it is truly inspiring. However, I don't see how the gradient of loss_gen_recon_f_a would optimize the decoder, because both f_a_recon and f_a are detached.

layumi commented 3 years ago

Thanks, @juliendenize. Yes, you are right: the gradient of loss_gen_recon_f_a would not optimize the decoder, due to the detach. I think I just left the API in for an ablation study two years ago.

juliendenize commented 3 years ago

Thank you for your answers and your work.