Closed orashi closed 7 years ago
For image regression tasks, people don't usually add a nonlinear activation layer at the end of the model. In image deblurring, the input and target pair have similar values and share structural information. However, if you apply a tanh activation at the end of the model, your model has to estimate a target that is nonlinearly stretched by the arctanh transformation (the inverse of tanh). This pushes the effective target further away from the input and makes the problem even more difficult. For example, your model would have to output infinity (just before the tanh) to produce a pure white pixel. Restricting the output range with tanh is not worth making the problem harder. Also note that in our implementation, the output of the last convolutional layer is expected (though not guaranteed) to lie in the range [-0.5, 0.5], not (-1, 1).
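The saturation argument is easy to check numerically: the closer a target pixel is to pure white (1.0 in a [-1, 1] scale), the larger the pre-tanh value the last layer must produce, diverging to infinity at exactly 1.0. A quick sketch with NumPy, using arctanh as the inverse of tanh:

```python
import numpy as np

# tanh saturates: tanh(x) never reaches 1.0 for finite x, so a pure
# white pixel (target 1.0) would require an infinite pre-activation.
# arctanh(target) is the value the last conv layer must output.
targets = np.array([0.9, 0.99, 0.999, 0.999999])
pre_activations = np.arctanh(targets)
for t, p in zip(targets, pre_activations):
    print(f"target {t:.6f} -> required pre-activation {p:.3f}")
```

The required pre-activation grows without bound as the target approaches 1.0, which is exactly why near-white pixels become disproportionately hard to fit through a final tanh.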
Thanks for the reply, and for correcting a misunderstanding I'd held for a long while... Now things make much more sense.
Normally we add tanh after the last convolution to get an output in the range -1 to 1, so I was just wondering if there is any particular reason for not using it anywhere in this architecture.