Dear Tony,
We used (and only slightly adapted) the ICNN implementation of the original work by Makkuva et al. (2020), where a similar operation (`z = z * z`) can be found; see https://github.com/AmirTag/OT-ICNN/blob/master/2_dim_experiments/W2-minimax-tf.py#L307.

Regarding your second concern: we want to learn a map $\mathbb{R}^d \rightarrow \mathbb{R}^d$ without restricting the output values to the range of the activation function. That is why, in such cases, no non-negative activation function is commonly added to the last layer.
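For intuition, here is a minimal sketch of such an ICNN forward pass (hypothetical PyTorch code; the class and layer names are made up for illustration and are not our actual implementation), showing where the squaring and the unconstrained last layer appear:

```python
import torch
import torch.nn as nn

class ToyICNN(nn.Module):
    """Minimal input-convex network sketch (illustrative only).

    The scalar output is convex in x because: the first layer squares an
    affine function of x (convex); later layers combine convex features
    with non-negative weights (kept >= 0 by clamping after each optimizer
    step, not shown here) and apply convex, non-decreasing activations.
    """

    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.fc0_x = nn.Linear(dim, hidden)                 # affine in the input x
        self.fc1_z = nn.Linear(hidden, hidden, bias=False)  # weights clamped >= 0
        self.fc1_x = nn.Linear(dim, hidden)
        self.out_z = nn.Linear(hidden, 1, bias=False)       # weights clamped >= 0
        self.out_x = nn.Linear(dim, 1)

    def forward(self, x):
        z = self.fc0_x(x)
        # squaring activation in the first layer: convex, since it is
        # applied to an affine function of x
        z = z * z
        z = torch.relu(self.fc1_z(z) + self.fc1_x(x))
        # last layer: plain affine output with no non-negative activation,
        # so the scalar potential can take any real value
        return self.out_z(z) + self.out_x(x)
```

In Makkuva et al. (2020), the map $\mathbb{R}^d \rightarrow \mathbb{R}^d$ is then obtained as the gradient of such a scalar potential, so squashing the last layer through a bounded or non-negative activation would unnecessarily restrict it.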
Hope this answers your questions!
I got it. Thanks for your reply!
In the code, I see that the forward function of the ICNN is defined with `def forward(self, x):`. I think there are two places that are inconsistent with the formulas in the article: (i) Why do we set `z = z * z` in the first layer? (ii) Why is no non-negative activation function added to the last layer?
Thank you!