> 2. Use GAP to perform the "downsample" operation before applying the weight $\mathbf{w}$, so that the actual number of parameters can be reduced to 1xC? Thanks!
You are right.
Actually, GAP can be replaced with other types of downsampling (including image resizing) without losing accuracy. We employ GAP here to save memory, computation time, and parameters.
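A minimal sketch of what I take "GAP as the downsample before $\mathbf{w}$" to mean, and how another downsampler could be dropped in. The shapes and the 4x4 target size are illustrative assumptions, not values from the repository:

```python
import tensorflow as tf

x = tf.random.normal([8, 32, 32, 64])  # (N, H, W, C) feature map

# GAP as the downsample: the global term collapses to 1x1xC, so the
# weight w applied afterwards needs a spatial extent of only 1.
x_gap = tf.reduce_mean(x, axis=[1, 2], keepdims=True)  # (8, 1, 1, 64)

# A hypothetical alternative downsampler (image resizing), per the reply
# above; w would then act on a 4x4xC map, trading parameters for detail.
x_resized = tf.image.resize(x, (4, 4))  # (8, 4, 4, 64)
```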
@wanggrun Thanks for your reply!
Hi. Thanks for your excellent work! I have some questions about the pixel-aware implementation:
https://github.com/wanggrun/Adaptively-Connected-Neural-Networks/blob/c39928837ca5441820f64d82cc20d4dff410776a/cnn/pixel-aware/resnet_model.py#L89-L113
The paper says "For the pixel-aware connection, we let α = 0, β = 1 and only learn γ to save parameters and memory", so:

1. l_3x3 is the second term in formula (1), whose parameter β is fixed to 1, and l_concat produces the "γ" of the third term, applying 1x1 convolutions to perform the two linear transformations in formula (3). Am I right?
2. My main question is about l_gap. Do you mean that you use GAP to perform the "downsample" operation before applying the weight $\mathbf{w}$, so that the actual number of parameters can be reduced to 1xC?
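For other readers, here is how I understand the three branches; a minimal TensorFlow/Keras sketch assuming α = 0, β = 1, and a sigmoid-normalized γ. Every layer choice here is my assumption; the actual wiring is in the resnet_model.py snippet linked above:

```python
import tensorflow as tf
from tensorflow.keras import layers

def pixel_aware_sketch(x, channels):
    # l_3x3: the beta term with beta fixed to 1 -- a plain 3x3 convolution.
    l_3x3 = layers.Conv2D(channels, 3, padding='same')(x)

    # l_gap: GAP "downsamples" the map to 1x1xC before w is applied, so the
    # 1x1 conv standing in for w has a spatial extent of 1 (the "1xC" above).
    gap = tf.reduce_mean(x, axis=[1, 2], keepdims=True)
    l_gap = layers.Conv2D(channels, 1)(gap)

    # l_concat: two 1x1 convolutions (the two linear transformations of
    # formula (3)) predicting a per-pixel gamma; sigmoid is an assumption.
    hidden = layers.Conv2D(channels, 1, activation='relu')(x)
    gamma = layers.Conv2D(channels, 1, activation='sigmoid')(hidden)

    # alpha = 0, beta = 1: combine the terms; l_gap with shape (N, 1, 1, C)
    # broadcasts across all spatial positions.
    return l_3x3 + gamma * l_gap
```

Called on an (N, H, W, C) tensor, e.g. `pixel_aware_sketch(x, 64)`, this keeps the learned part of the global term down to a single 1x1 convolution on the pooled vector, which is where the parameter saving would come from.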