Open Ellohiye opened 8 months ago
Thanks for pointing this problem out. To clarify the expression, we replace scale-invariant features' with
aspect ratio invariant features'. As you know, each learned filter of Convolutional neural networks (CNNs), including upsampling and downsampling, is sensitive to a given set of features only within a narrow range of scale. During the process, all images are resized, but the aspect ratio, which is the ratio of their width to height, is preserved.
"In SSFF, The P3, P4, and P5 feature maps are normalized to the same size, upsampled, and then stacked together as input to a 3D convolution to combine multiscale features." See Page 6.