Closed Merenguelkl closed 2 years ago
A few observations from my experiments (these may not always hold):
I think there is a trade-off here between a network's ability to extract semantics and to preserve details: GELU seems better at extracting semantics, while ReLU is better at filtering out details, so different networks behave differently.
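The difference between the two activations can be illustrated with a minimal sketch (plain Python, no framework; the `gelu` below uses the common tanh approximation, which is an assumption on my part, not necessarily the variant used in the paper):

```python
import math

def relu(x):
    # ReLU hard-thresholds: negative responses are zeroed outright,
    # which acts like a strict filter on weak/negative activations.
    return max(0.0, x)

def gelu(x):
    # GELU (tanh approximation): a smooth gate that lets small negative
    # values leak through, weighted by how plausible they are as signal
    # under a Gaussian assumption.
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# The two differ most near zero: ReLU discards x = -0.5 entirely,
# while GELU keeps a small negative response.
print(relu(-0.5))          # 0.0
print(round(gelu(-0.5), 4))
```

For strongly positive inputs both behave almost identically, so the trade-off above mainly concerns how small and negative activations are treated.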
Thanks for your reply! I will try it in my experiments :)
Hi, thanks for your amazing work.
I'm very interested in your study of 'Nonlinear activation functions'.
The paper says that changing GELU to ReLU is effective for dehazing. Does it also perform well on other low-level tasks, e.g., denoising and deblurring?