QiZhao-NJU / Neural-Representation-for-Video-via-Differential-Input-and-Pyramidal-Architecture

Neural Representation for Video via Differential Input and Pyramidal Architecture
MIT License

The number of parameters of KFc for PNeRV #3

Closed: th359 closed this issue 1 month ago

th359 commented 2 months ago

Thank you for releasing the code for PNeRV. I have a question about the parameter-count advantage of KFc in PNeRV over PixelShuffle.

In your paper, you compare upsampling from [16x2x4] to [16x320x640] with PixelShuffle and with KFc, and report PixelShuffle: 6.96M and KFc: 0.05M, where kernel size = 1.

However, in HNeRV, for example, upsampling is performed by multiple PixelShuffle layers with smaller upscaling factors. When the upscale_factor values of the PixelShuffle layers are set to [5, 4, 2, 2, 2], the number of parameters is reduced to 0.01M (11356 parameters).
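For reference, here is a rough sketch of how these PixelShuffle counts can be estimated. It assumes each stage is a convolution with bias that expands the channels by r*r, followed by PixelShuffle(r), and that the width stays at 16 channels throughout. The function name `pixelshuffle_stage_params` is made up for illustration and is not from the PNeRV or HNeRV code. Under these assumptions the single-stage count matches the 6.96M in the paper, while the multi-stage count comes out at the same order of magnitude as 11356 but not exactly equal, so the actual per-stage channel widths are probably different from what is assumed here.

```python
def pixelshuffle_stage_params(in_ch, out_ch, r, k=1, bias=True):
    """Parameters of a conv (in_ch -> out_ch * r * r, kernel k x k)
    followed by PixelShuffle(r). PixelShuffle itself has no parameters."""
    conv_out = out_ch * r * r
    weights = in_ch * conv_out * k * k
    return weights + (conv_out if bias else 0)


# Single stage: [16x2x4] -> [16x320x640] needs an upscale factor of 160.
single_stage = pixelshuffle_stage_params(16, 16, 160, k=1)

# Multi-stage: factors multiply to 160 (5 * 4 * 2 * 2 * 2).
# Assumption: every stage keeps 16 channels; real models may shrink them.
factors = [5, 4, 2, 2, 2]
multi_stage = sum(pixelshuffle_stage_params(16, 16, r, k=1) for r in factors)

print(single_stage)  # 6963200, i.e. about 6.96M
print(multi_stage)   # 14416, i.e. about 0.01M
```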

Can you please provide more information on the advantages of using KFc?

Thank you.

QiZhao-NJU commented 1 month ago

The advantage of KFc lies in its highly efficient transformation from very low-resolution feature maps to high-resolution feature maps. In addition, the computation in KFc models the global information of the feature maps, which is difficult for PixelShuffle to achieve. PixelShuffle only attends to a local neighborhood around each target location, and the larger that neighborhood, the higher the computational cost.
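A minimal sketch of the "larger range, higher cost" point, assuming the local range corresponds to the kernel size k of a stride-1, same-padded convolution placed before a single PixelShuffle(160) on the [16x2x4] input from this issue. The helper `conv_pixelshuffle_cost` is hypothetical and only counts parameters and multiply-accumulates; it is not taken from the PNeRV code.

```python
def conv_pixelshuffle_cost(in_ch, out_ch, r, k, in_h, in_w):
    """Return (params, macs) for a k x k conv (with bias) that feeds PixelShuffle(r).

    Assumes stride 1 and 'same' padding, so the conv output keeps in_h x in_w.
    """
    conv_out = out_ch * r * r
    weights = in_ch * conv_out * k * k
    params = weights + conv_out      # weights plus one bias per output channel
    macs = weights * in_h * in_w     # one kernel application per input position
    return params, macs


# Each output pixel sees a k x k input neighborhood; cost grows with k * k.
costs = {k: conv_pixelshuffle_cost(16, 16, 160, k, in_h=2, in_w=4) for k in (1, 3, 5)}

for k, (params, macs) in costs.items():
    print(f"k={k}: {params / 1e6:.2f}M params, {macs / 1e6:.1f}M MACs")
```

With k=1 each output pixel depends on a single input position, and widening the neighborhood to 3x3 multiplies the weight count by 9.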

th359 commented 1 month ago

Thanks for the reply. I understand now that PixelShuffle would need 6.96M parameters to cover a range as wide as KFc does.