pmorerio / admd

Tensorflow code for the paper 'Learning with privileged information via adversarial discriminative modality distillation', TPAMI 2019
MIT License

train_hallucination vs train_hallucination_p2 for NYUD #8

Closed: Scienceseb closed this issue 4 years ago

Scienceseb commented 4 years ago

I'm trying to reimplement your code according to your paper, but I have some trouble understanding the difference between `train_hallucination` and `train_hallucination_p2`. In your paper it is written: "The discriminator also features an additional classification task, i.e. not only it is trained to discriminate between hallucinated and depth features, but also to assign samples to the correct class." But in your code, the discriminator loss of `train_hallucination` is just `tf.reduce_mean(tf.square(self.logits_fake - tf.zeros_like(self.logits_fake)))`, which only discriminates between hallucinated and depth features, while in `train_hallucination_p2` you do `tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=self.logits_real, labels=tf.one_hot(self.labels, self.no_classes + 1)))`, which assigns samples to the correct class. I don't understand why you don't have a single loss merging those two parts.

Would it be possible for you to explain in more detail how the losses in `train_hallucination` are calculated, for both the generator and the discriminator? Thank you.

pmorerio commented 4 years ago

Hi,

Results for NYUD are listed in the last two lines of Table 4.

Scienceseb commented 4 years ago

Thanks. Also, it seems there is an error in your paper regarding the discriminator. It is written: "For the task of action recognition, the structure is quite shallow, consisting in D1=[fc(2048), fc(1024), fc(C+1)]. For the task of object classification the structure is instead more complex D2=[fc(1024), fc(1024), fc(1024), fc(2048), fc(3072), fc(C+1)], with skip connections in the lower layers." But Figure 4 presents the opposite...

Scienceseb commented 4 years ago

Hi,

  • `train_hallucination` is naive adversarial training with a square loss.
  • `train_hallucination_p2` implements equations (1) and (2) in the paper. Please check them carefully. Note that the one-hot vector in the code has length `self.no_classes + 1`, meaning that the discriminator has to distinguish between hallucinated and depth features, but in the case of depth features it also has to select the correct class. The loss merges the adversarial and classification problems by using `self.no_classes + 1` classes, where the +1 accounts for the 'hallucination' class.
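To make the C+1 scheme concrete, here is a minimal NumPy sketch of that merged discriminator objective (the helper names and the NumPy re-implementation of `tf.nn.softmax_cross_entropy_with_logits` are mine, not from the repo; `logits_real`, `logits_fake`, `no_classes` mirror the code's variable names):

```python
import numpy as np

def softmax_xent(logits, labels_onehot):
    # numerically stable softmax cross-entropy per sample,
    # mirroring tf.nn.softmax_cross_entropy_with_logits
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(labels_onehot * log_probs).sum(axis=1)

def discriminator_loss(logits_real, logits_fake, labels, no_classes):
    # real (depth) features keep their ground-truth class in 0..C-1
    onehot_real = np.eye(no_classes + 1)[labels]
    # hallucinated features are all assigned the extra class C
    onehot_fake = np.eye(no_classes + 1)[np.full(len(logits_fake), no_classes)]
    return (softmax_xent(logits_real, onehot_real).mean()
            + softmax_xent(logits_fake, onehot_fake).mean())
```

A single cross-entropy term then covers both parts: hallucinated features are pushed toward the extra class C (the adversarial part), while depth features must land on their true class (the classification part).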

Results for NYUD are listed in the last two lines of Table 4.

Ok, so `train_hallucination` (naive adversarial with square loss) is from your older paper, right?

pmorerio commented 4 years ago

Thanks. Also, it seems there is an error in your paper regarding the discriminator. It is written: "For the task of action recognition, the structure is quite shallow, consisting in D1=[fc(2048), fc(1024), fc(C+1)]. For the task of object classification the structure is instead more complex D2=[fc(1024), fc(1024), fc(1024), fc(2048), fc(3072), fc(C+1)], with skip connections in the lower layers." But Figure 4 presents the opposite...

Yes, you are right: the caption is inverted.
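For reference, the two discriminator stacks can be sketched as simple shape checks, assuming the assignment given in the paper text (D1 for action recognition, D2 for object classification). Everything here is a placeholder: random weights, ReLU activations, an assumed 2048-d input, and D2's skip connections are omitted.

```python
import numpy as np

def fc_stack(widths, x):
    # run a batch through a stack of fully-connected layers
    # (random placeholder weights; this only verifies the shapes)
    rng = np.random.default_rng(0)
    for w_out in widths:
        W = rng.standard_normal((x.shape[1], w_out)) * 0.01
        x = np.maximum(x @ W, 0.0)  # ReLU is an assumption, not from the paper
    return x

C = 19                           # illustrative number of classes
feat = np.ones((4, 2048))        # a batch of features (2048-d is assumed)
# D1 (action recognition, shallow): fc(2048) - fc(1024) - fc(C+1)
d1_out = fc_stack([2048, 1024, C + 1], feat)
# D2 (object classification, deeper): fc(1024) x3 - fc(2048) - fc(3072) - fc(C+1)
d2_out = fc_stack([1024, 1024, 1024, 2048, 3072, C + 1], feat)
```

Both stacks end in C+1 logits, matching the hallucination-class scheme discussed above.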

pmorerio commented 4 years ago

Ok, so `train_hallucination` (naive adversarial with square loss) is from your older paper, right?

If you are referring to the ECCV paper, reference [11], the answer is no. That one is the line above in Table 4.

Scienceseb commented 4 years ago

Thank you for your very quick answers!

pmorerio commented 4 years ago

Glad to help :)

Scienceseb commented 4 years ago

Oh, I have one more question: in Fig. 4, the last fc layer of the discriminator on the right is listed as 3072, while in your code it's 3076. Which should I use?

pmorerio commented 4 years ago

My mistake. 3072 should be the correct one, but four more neurons would certainly make no difference.