Open YingqianWang opened 3 years ago
I think that if we use bilinear interpolation, we run into the problem that it smooths the features, because the result is always a linear combination weighted by distance. This is one of the main drawbacks of using interpolation in semantic segmentation.
With the local ensemble, the features are simply stacked, and the following imnet (an MLP) can combine them with non-linear weights.
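To make the difference concrete, here is a minimal sketch in plain Python. It is not the repository's code: `bilinear_weights`, `toy_decoder`, `interpolate_then_decode`, and `decode_then_blend` are hypothetical names, the features are scalars rather than vectors, and the toy decoder ignores coordinates for brevity. It only illustrates that blending features before a non-linear decoder generally gives a different result from decoding each neighbor first and blending afterwards, while the two coincide when the decoder is linear.

```python
import math


def bilinear_weights(dx, dy):
    """Weights for the four corners of a unit cell, in the order 00, 01, 10, 11.

    dx and dy in [0, 1] are the query offsets from corner 00.
    Each weight is the area of the rectangle diagonally opposite its corner.
    """
    return [(1 - dx) * (1 - dy), (1 - dx) * dy, dx * (1 - dy), dx * dy]


def toy_decoder(z):
    """Hypothetical stand-in for an MLP: any non-linear map would do."""
    return math.tanh(3.0 * z)


def interpolate_then_decode(feats, dx, dy, decoder):
    """Blend the four neighbor features linearly, then decode once."""
    w = bilinear_weights(dx, dy)
    blended = sum(wi * zi for wi, zi in zip(w, feats))
    return decoder(blended)


def decode_then_blend(feats, dx, dy, decoder):
    """Decode each neighbor feature separately, then blend the predictions."""
    w = bilinear_weights(dx, dy)
    return sum(wi * decoder(zi) for wi, zi in zip(w, feats))
```

For example, with `feats = [0.0, 1.0, 0.0, 1.0]` and a query at the cell center (`dx = dy = 0.5`), the two functions return different values under `toy_decoder`, but identical values under a linear decoder such as `lambda z: 2.0 * z + 1.0`.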
In Fig. 2, the authors propose a local ensemble approach that predicts the RGB value at the target position from its four nearest neighbors. However, the calculation in the local ensemble is very similar to bilinear interpolation. So I have a question: why not use bilinear interpolation directly, via the F.grid_sample function? Looking forward to your early reply.
Regards