Closed · Scienceseb closed this issue 4 years ago
Hi,

`train_hallucination` is the naive adversarial variant with a square loss. `train_hallucination_p2` implements equations (1) and (2) in the paper; please check them carefully. Note that the one-hot vector in the code has length `self.no_classes + 1`, meaning that the discriminator not only has to distinguish between hallucinated and depth features but, in the case of depth features, also has to select the correct class. The loss merges the adversarial and classification problems by having `self.no_classes + 1` classes, where the `+1` accounts for the 'hallucination' class.
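For illustration, the merged loss described above could be sketched as follows. This is a minimal NumPy sketch under my reading of the comment, not the actual implementation (which uses `tf.nn.softmax_cross_entropy_with_logits`); the helper names are hypothetical:

```python
import numpy as np

def softmax_cross_entropy(logits, onehot):
    # Numerically stable softmax cross-entropy, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(onehot * log_probs).sum(axis=1).mean()

def discriminator_loss(logits_real, logits_fake, labels, no_classes):
    # Depth features: one-hot of the true class among the first C entries.
    real_targets = np.eye(no_classes + 1)[labels]
    # Hallucinated features: one-hot on the extra (C+1)-th entry,
    # i.e. the 'hallucination' class.
    fake_targets = np.eye(no_classes + 1)[np.full(len(logits_fake), no_classes)]
    return (softmax_cross_entropy(logits_real, real_targets)
            + softmax_cross_entropy(logits_fake, fake_targets))
```

With this single cross-entropy over `C + 1` classes, one loss term simultaneously penalizes wrong classification of depth features and failure to flag hallucinated features.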
Results for NYUD are listed in the last two lines of Table 4.
Thanks. Also, it seems you have an error in your paper regarding the discriminator. It's written: "For the task of action recognition, the structure is quite shallow, consisting in D1=[fc(2048), fc(1024), fc(C+1)]. For the task of object classification the structure is instead more complex D2=[fc(1024), fc(1024), fc(1024), fc(2048), fc(3072), fc(C+1)], with skip connections in the lower layers." But Figure 4 presents the opposite...
Ok, so `train_hallucination` (naive adversarial with square loss) is from your older paper, right?
Yes, you are right: the caption is inverted.
If you are referring to the ECCV paper, reference [11], the answer is no; that is the line above in Table 4.
Thank you for your very quick answers!
Glad to help :)
Oh, I have one more question: in Fig. 4, the last fc layer of the discriminator on the right is fc(3072), while in your code it's 3076. Which should I use?
My mistake: 3072 should be the correct one, but 4 more neurons would certainly make no difference.
I'm trying to rewrite your code according to your paper, but I have some trouble understanding the difference between `train_hallucination` and `train_hallucination_p2`. In your paper it is written: "The discriminator also features an additional classification task, i.e. not only it is trained to discriminate between hallucinated and depth features, but also to assign samples to the correct class." But in your code, the discriminator loss of `train_hallucination` is just `tf.reduce_mean(tf.square(self.logits_fake - tf.zeros_like(self.logits_fake)))`, which only discriminates between hallucinated and depth features, while in `train_hallucination_p2` you use `tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=self.logits_real, labels=tf.one_hot(self.labels, self.no_classes + 1)))`, which assigns samples to the correct class. I don't understand why you don't have a single loss merging those two parts.
Would it be possible for you to explain in more detail how the losses for `train_hallucination` are calculated, for both the generator and the discriminator? Thank you.
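For what it's worth, a least-squares adversarial setup like the one the `train_hallucination` snippet above suggests could be sketched as follows. This is a NumPy sketch with hypothetical function names; only the fake-logits-to-zero term appears verbatim in the thread, and the real-to-one and generator-to-one targets are my assumption about the remaining terms:

```python
import numpy as np

def d_loss(logits_real, logits_fake):
    # Discriminator: push real (depth) logits toward 1
    # and hallucinated logits toward 0 (least-squares targets).
    return (np.mean((logits_real - 1.0) ** 2)
            + np.mean((logits_fake - 0.0) ** 2))

def g_loss(logits_fake):
    # Hallucination stream (generator): push its logits toward 1,
    # i.e. fool the discriminator into labeling them as depth features.
    return np.mean((logits_fake - 1.0) ** 2)
```

Under this assumption, the discriminator and hallucination network are optimized alternately, each against its own loss.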