moetayuko opened this issue 5 years ago (status: Open)
I don't think it's crucial whether there is a ReLU after the first conv.
The addition acts like a non-linearity already. It's a design choice; you can see the same in CycleGAN and pix2pixHD. It also allows the block to output negative values.
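For reference, here is a minimal sketch of the kind of residual block being discussed (illustrative PyTorch, not the repository's exact code): the block ends with a normalization layer and a skip addition, with no activation applied to the sum, so the output is not constrained to be non-negative.

```python
import torch
import torch.nn as nn

class ResnetBlockSketch(nn.Module):
    """Illustrative CycleGAN-style residual block: Conv-Norm-ReLU-Conv-Norm + skip."""

    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3, bias=False),
            nn.InstanceNorm2d(dim),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3, bias=False),
            nn.InstanceNorm2d(dim),
        )

    def forward(self, x):
        # Skip connection only: nothing re-activates the sum, so the block's
        # output (and therefore the input to the next block) can be negative.
        return x + self.body(x)


block = ResnetBlockSketch(64)
y = block(torch.randn(1, 64, 32, 32))
print(y.min())  # typically negative: no ReLU clamps the output
```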
In the residual blocks, there is neither a ReLU before the first conv (https://github.com/znxlwm/UGATIT-pytorch/blob/f7a4b596a32349c54acc55007dce4ae4400bf125/networks.py#L118) nor one after the addition (https://github.com/znxlwm/UGATIT-pytorch/blob/f7a4b596a32349c54acc55007dce4ae4400bf125/networks.py#L129), i.e., there are no activations between neighboring residual blocks (see the sketch below for the contrast with a post-activation block).
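To make the difference concrete, here is a hedged sketch of the two conventions (illustrative only; `body` stands for a Conv-Norm-ReLU-Conv-Norm stack, not the repository's code):

```python
import torch.nn.functional as F

# Structure described above: the sum is returned as-is, so two stacked blocks
# meet as ...Conv-Norm -> (+) -> Conv-Norm-ReLU-... with no non-linearity
# at the block boundary.
def residual_no_boundary_activation(x, body):
    return x + body(x)

# Classic post-activation ResNet block (He et al. 2015), for comparison:
# ReLU is applied to the sum, so there is a non-linearity between blocks.
def residual_post_activation(x, body):
    return F.relu(x + body(x))
```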
I couldn't find the design details of the residual blocks in the paper, so I suspect this is a bug; can you confirm? Thanks.
PS: I just checked the official TensorFlow code; it has the same issue.