Why not use the tanh activation at the last layer?

orashi commented 7 years ago

Normally we add tanh after the last convolution to get an output in the range (-1, 1). I just wonder whether there is any particular reason for not using it anywhere in this architecture.

SeungjunNah commented 7 years ago

For image regression, people don't usually add a nonlinear activation layer at the end of the model. In image deblurring, the input and the target have similar values and share structural information. If you apply a tanh activation at the end of the model, however, the layer before it has to estimate the arctanh of the target, which is a nonlinearly stretched version of it. This moves the target far away from the input and makes the problem more difficult. For example, your model would have to output infinity (just before the tanh) to produce a white pixel. Restricting the output range with tanh is not worth making the problem harder. Also note that in our implementation, the output of the last convolutional layer is expected (though not guaranteed) to lie in the range [-0.5, 0.5], not (-1, 1).
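To illustrate the stretching described above, here is a minimal NumPy sketch (not taken from the DeepDeblur code). It assumes hypothetical target intensities scaled into (-1, 1), as they would be for a tanh output, and computes the value the last convolution would need to produce before the tanh. The names `targets`, `pre_activation` and `white_pre_activation` are made up for this example.

```python
import numpy as np

# Hypothetical target intensities, assumed to be scaled into (-1, 1)
# so that a final tanh could reproduce them.
targets = np.array([0.0, 0.5, 0.9, 0.99, 0.999999])

# Value the last convolution would have to output so that
# tanh(pre_activation) equals the target.
pre_activation = np.arctanh(targets)

for t, z in zip(targets, pre_activation):
    print(f"target {t:.6f} -> required pre-tanh value {z:.3f}")

# A pure white pixel (target exactly 1.0) needs an infinite pre-tanh value.
# NumPy reports this as a divide-by-zero warning, silenced here.
with np.errstate(divide="ignore"):
    white_pre_activation = np.arctanh(1.0)

print("white pixel ->", white_pre_activation)
```

The required pre-tanh value grows faster and faster as the target approaches the end of the range, so bright and dark pixels end up far from the corresponding input values, even when input and target are nearly identical in image space.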

orashi commented 7 years ago

Thanks for the reply, and for correcting a misunderstanding I have had for a long while... Now things make more sense.

SeungjunNah / DeepDeblur_release

Why not use the tanh activation at the last layer? #4