about output loss definition

dhruvramani / C2AE-Multilabel-Classification

TensorFlow implementation of the paper 'Learning Deep Latent Spaces for Multi-Label Classification' (AAAI 2017)

about output loss definition #8

Closed: hinanmu closed this issue 4 years ago

hinanmu commented 5 years ago

Hi, I have a question about the loss definition in your code. In the C2AE paper, the prediction used in the output loss is defined as F_d(F_e(x_i)), but in your code it is defined as F_d(F_x(x_i)).

Is this a mistake in the paper? I don't see how F_e can accept x_i, since F_e takes a label vector as input and x_i has a different dimension. Thank you!

greeness commented 5 years ago

I think there is a mistake in the paper.

At training time, to get the loss between Fe and Fd, we need to calculate Fd(Fe(y_i)), not Fd(Fe(x_i)): as you pointed out, Fe takes y_i as input and cannot take x_i.

Probably because of this mistake, @dhruvramani derived the loss in a different way. That probably works too, but likely not as well as the loss proposed in the original paper.

At prediction time, to predict labels, we would compute

Fd(Fx(x_i)), which is consistent between the formula in the paper and the implementation here.
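To make the dimension argument concrete, here is a minimal NumPy sketch. It is not the repository's code: the dimensions `D`, `L`, `K` and the single-matrix linear maps standing in for Fx, Fe, and Fd are assumptions chosen only to show which compositions are shape-compatible. In the real model these would be deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature dim, label dim, latent dim.
D, L, K = 6, 4, 3

# Stand-in linear maps (assumed for illustration only).
W_x = rng.normal(size=(D, K))  # Fx: features -> latent
W_e = rng.normal(size=(L, K))  # Fe: labels   -> latent
W_d = rng.normal(size=(K, L))  # Fd: latent   -> label scores


def Fx(x):
    return x @ W_x


def Fe(y):
    return y @ W_e


def Fd(z):
    return z @ W_d


x_i = rng.normal(size=(D,))            # one feature vector
y_i = np.array([1.0, 0.0, 1.0, 0.0])   # one multi-hot label vector

# Training-time path for the output loss: decode the label embedding.
train_out = Fd(Fe(y_i))

# Prediction-time path: decode the feature embedding.
pred_out = Fd(Fx(x_i))

# Fe(x_i) is not shape-compatible when D != L, so matmul raises ValueError.
try:
    Fe(x_i)
    fe_accepts_x = True
except ValueError:
    fe_accepts_x = False
```

Both valid paths end in a vector of length `L` (one score per label), while `Fe(x_i)` fails whenever the feature and label dimensions differ.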

greeness commented 5 years ago

I found that the mistake is already fixed in the latest version of the paper hosted on arXiv. Equation (3) now contains Fd(Fe(y_i)) instead of Fd(Fe(x_i)).

See https://arxiv.org/pdf/1707.00418.pdf

hinanmu commented 5 years ago

Thanks, I got it.

chihkuanyeh commented 5 years ago

@greeness thanks for the clarification.