wyf0912 / LDDG

[NIPS 2020] The code release for the paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization'

Proof for Theorem 2 #3

Closed cnyanhao closed 3 years ago

cnyanhao commented 3 years ago

Hi, thank you for your great paper. I'm sorry, but I can't understand parts of the proof of Theorem 2.

(1) Here you assume L() denotes the cross-entropy loss with the softmax operation. Does that mean y is the input to the softmax function? Only in that case does the linear relation between the target y and the source y's hold.

(2) I'm also confused about how you get from the first line to the second line (in particular, how you move the sum over j outside of the loss function and how you change p* to pj).

(3) What's more, you said you used the upper bound of the log-sum-exp function to derive the second line, and {a1, a2, ..., an} are the outputs of the softmax function. In that case I'm not sure where the exponential comes from, since the log-sum-exp function takes the exponential of a.

Thank you for your kind reply.

[screenshot of the proof of Theorem 2 from the paper]

wyf0912 commented 3 years ago

Hi, thanks for your interest.

1) Yes, y is the input of the softmax, i.e., y has not undergone the softmax operation.

2) The derivation from the first line to the second uses the convexity of the loss function (Jensen's inequality).

3) The exponential comes from the softmax operation. You can refer to https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html?highlight=cross%20entrop#torch.nn.CrossEntropyLoss
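For point 2, here is a hedged sketch of the convexity (Jensen) step, assuming the target logit vector is a convex combination of the source logit vectors with nonnegative weights w_j that sum to one (the exact weights and notation may differ from the paper):

```latex
% Sketch: cross-entropy with softmax, written in the logits z as
% L(z, t) = \log\sum_k \exp(z_k) - z_t, is convex in z, so by Jensen's
% inequality, for y^* = \sum_j w_j y_j with w_j \ge 0 and \sum_j w_j = 1:
\[
L\Big(\sum_j w_j\, y_j,\; t\Big) \;\le\; \sum_j w_j\, L(y_j,\; t).
\]
```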
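For points 1 and 3, the following minimal PyTorch sketch (not taken from this repository) illustrates that the cross-entropy loss operates on raw logits and equals the log-sum-exp of the logits minus the target logit, which is where the exponential comes from:

```python
# Minimal sketch: cross_entropy expects raw logits (y before the softmax)
# and applies log-softmax internally, i.e. a log-sum-exp over the logits.
import torch
import torch.nn.functional as F

logits = torch.randn(1, 5)      # y before the softmax operation
target = torch.tensor([2])      # ground-truth class index

# Built-in cross-entropy on the logits.
ce = F.cross_entropy(logits, target)

# Same value written out explicitly:
# -log softmax(logits)[target] = logsumexp(logits) - logits[target]
manual = torch.logsumexp(logits, dim=1) - logits[0, target]

print(ce.item(), manual.item())  # the two values agree
```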