bunnech / cellot

Learning Single-Cell Perturbation Responses using Neural Optimal Transport

The architecture of ICNN #6

Closed. Tony4709 closed this issue 9 months ago.

Tony4709 commented 9 months ago

In the code, I see that the forward function of ICNN is defined like this:

```python
def forward(self, x):
    z = self.sigma(0.2)(self.A[0](x))
    z = z * z

    for W, A in zip(self.W[:-1], self.A[1:-1]):
        z = self.sigma(0.2)(W(z) + A(x))

    y = self.W[-1](z) + self.A[-1](x)

    return y
```

I think there are two places that are inconsistent with the formula in the article. (i) Why do we compute z = z * z in the first layer? (ii) Why is no non-negative activation function applied to the last layer?
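
For context, the ICNN recursion I am comparing against (my paraphrase of the standard formulation in Amos et al. (2017), which Makkuva et al. (2020) build on, not a quote from the article) is

$$z_{l+1} = \sigma_l\left(W_l z_l + A_l x + b_l\right), \quad l = 0, \dots, L-1, \qquad f(x) = z_L,$$

with $W_0 \equiv 0$, the weights of every other $W_l$ non-negative, and every $\sigma_l$ convex and non-decreasing, so that $f$ is convex in $x$. Under this recursion I would have expected no squaring in the first layer and an activation on the last layer as well.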

Thank you!

bunnech commented 9 months ago

Dear Tony,

We used (and only slightly adapted) the ICNN implementation from the original work by Makkuva et al. (2020), where a similar operation (z = z * z) can be found; see https://github.com/AmirTag/OT-ICNN/blob/master/2_dim_experiments/W2-minimax-tf.py#L307.
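
As a side note (my own sanity check, not from the original code): squaring the leaky-ReLU output keeps the first layer convex in the input, which is the property the ICNN construction needs, since $g(t) = \mathrm{LeakyReLU}(t)^2$ is convex and a convex function composed with the affine map $A_0 x$ stays convex. A quick numerical check of the midpoint inequality, assuming PyTorch and a negative slope of 0.2:

```python
import torch

# Numerically check that g(t) = LeakyReLU(t; 0.2)^2 is convex:
# for a convex g, g((a + b) / 2) <= (g(a) + g(b)) / 2 for all a, b.
g = lambda t: torch.nn.functional.leaky_relu(t, negative_slope=0.2) ** 2

a, b = torch.randn(100_000), torch.randn(100_000)
midpoint = g((a + b) / 2)
chord = (g(a) + g(b)) / 2
print(bool((midpoint <= chord + 1e-6).all()))  # expected: True
```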

Regarding your second concern: We want to learn a map $\mathbb{R}^d \rightarrow \mathbb{R}^d$ without restricting the output values to the range of the activation function. That is why, in such cases, no non-negative activation function is commonly applied to the last layer.
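
For reference, here is a minimal sketch of such an ICNN module (a hypothetical PyTorch re-implementation mirroring the forward pass above; `input_dim`, `hidden_dims`, and the weight clamping are my assumptions for illustration, not the exact cellot code):

```python
import torch
import torch.nn as nn


class ICNNSketch(nn.Module):
    """Input-convex sketch: scalar f(x), convex in x once the W weights are non-negative."""

    def __init__(self, input_dim, hidden_dims):
        super().__init__()
        # A: affine "skip" layers from the input x to every level (unconstrained weights).
        self.A = nn.ModuleList(
            [nn.Linear(input_dim, d) for d in hidden_dims]
            + [nn.Linear(input_dim, 1)]
        )
        # W: layers acting on the previous activation z; their weights must stay
        # non-negative for the output to remain convex in x.
        self.W = nn.ModuleList(
            [nn.Linear(d_in, d_out, bias=False)
             for d_in, d_out in zip(hidden_dims[:-1], hidden_dims[1:])]
            + [nn.Linear(hidden_dims[-1], 1, bias=False)]
        )
        self.sigma = nn.LeakyReLU

    def clamp_w(self):
        # Project the W weights onto the non-negative orthant, e.g. after each optimizer step.
        for layer in self.W:
            layer.weight.data.clamp_(min=0)

    def forward(self, x):
        # First layer: squared leaky ReLU of an affine map, convex in x.
        z = self.sigma(0.2)(self.A[0](x))
        z = z * z
        # Hidden layers: convex, non-decreasing activation of W(z) + A(x).
        for W, A in zip(self.W[:-1], self.A[1:-1]):
            z = self.sigma(0.2)(W(z) + A(x))
        # Last layer: purely affine, so the potential is not range-restricted.
        return self.W[-1](z) + self.A[-1](x)
```

In Makkuva et al.-style neural OT, the map $\mathbb{R}^d \rightarrow \mathbb{R}^d$ is then obtained as the gradient of such a scalar potential with respect to the input, which is again not restricted by any activation range.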

Hope this answers your questions!

Tony4709 commented 9 months ago

I got it. Thanks for your reply!