Closed hcn323 closed 2 years ago
The loss in the paper is written with a sum over classes, $\sum_{c=1}^{C}$, but in the code you just take out the probability of the item corresponding to the label and use `-tf.math.log()` to get the prediction loss. I don't see the $\sum_{c=1}^{C}$ show up anywhere in the code.
For every other class $c$, the one-hot label $y_c$ is 0, so those terms of the sum vanish. Please refer to the standard implementation of cross entropy.
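A minimal sketch of why the two forms agree (toy numbers, NumPy standing in for the repo's `tf.math.log`): with a one-hot label, the full sum $-\sum_{c=1}^{C} y_c \log p_c$ collapses to $-\log p_{\text{label}}$.

```python
import numpy as np

# Hypothetical predicted probabilities for C = 4 classes
probs = np.array([0.1, 0.2, 0.6, 0.1])
label = 2                    # ground-truth class index
y = np.eye(4)[label]         # one-hot vector: y_c = 1 only for c = label

# Paper-style loss: full sum over classes, -sum_{c=1}^{C} y_c * log(p_c)
full_sum = -np.sum(y * np.log(probs))

# Code-style shortcut: gather the label's probability and take -log
shortcut = -np.log(probs[label])

# Identical, because every other y_c is 0 and contributes nothing
assert np.isclose(full_sum, shortcut)
```

So the summation is still there mathematically; the code just skips the zero terms.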