421zuoduan closed this issue 1 year ago
Hi there,
Thank you for your work. I have a question about MSFN. Most models that use an MLP as the FFN apply only one activation function inside it. MSFN, however, applies two (setting aside the multi-scale branches). Is this due to the characteristics of convolution? Is there any prior experience or literature I could refer to on this?
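To make the contrast concrete, here is a minimal PyTorch sketch of the two patterns being compared. It is only an illustration under assumptions, not the authors' implementation: the class names `MlpFFN` and `TwoStageMultiScaleFFN`, the choice of depth-wise 3x3 and 5x5 branches, and the channel sizes are all hypothetical.

```python
import torch
import torch.nn as nn


class MlpFFN(nn.Module):
    """Typical Transformer FFN: Linear -> activation -> Linear (one activation)."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        return self.fc2(self.act(self.fc1(x)))


class TwoStageMultiScaleFFN(nn.Module):
    """Hypothetical two-stage multi-scale conv FFN (assumed layout, for illustration).

    Each stage runs two parallel depth-wise branches (3x3 and 5x5) and applies
    an activation, so the block contains two activation steps in sequence.
    """

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        h = hidden_dim
        self.project_in = nn.Conv2d(dim, h, kernel_size=1)
        # Stage 1: parallel depth-wise branches at two scales.
        self.dw3_a = nn.Conv2d(h, h, 3, padding=1, groups=h)
        self.dw5_a = nn.Conv2d(h, h, 5, padding=2, groups=h)
        # Stage 2: the same two scales applied to the concatenated stage-1 output.
        self.dw3_b = nn.Conv2d(2 * h, 2 * h, 3, padding=1, groups=2 * h)
        self.dw5_b = nn.Conv2d(2 * h, 2 * h, 5, padding=2, groups=2 * h)
        self.act = nn.ReLU()
        self.project_out = nn.Conv2d(4 * h, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim, height, width)
        x = self.project_in(x)
        # First activation: applied to both stage-1 branches, then fused.
        s1 = torch.cat([self.act(self.dw3_a(x)), self.act(self.dw5_a(x))], dim=1)
        # Second activation: applied to both stage-2 branches, then fused.
        s2 = torch.cat([self.act(self.dw3_b(s1)), self.act(self.dw5_b(s1))], dim=1)
        return self.project_out(s2)
```

Both modules preserve the input shape, so either could sit in the FFN slot of a block; the difference in question is that the second applies a nonlinearity after each of its two convolution stages, while the first applies one between its two linear layers.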
Thanks for your question. The design of MSFN was inspired by prior CNN work, specifically 'Multi-scale Residual Network for Image Super-Resolution'. We have not explored the impact of the activation function. Thank you for the suggestion; we will explore this in the future.
Thanks for your reply.