Object detection with multi-level representations generated from deep high-resolution representation learning (HRNetV2h). This is an official implementation for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
Hi, thanks for the great work. I have a question about the upsampling method. In the paper, the multi-scale feature fusion uses "bilinear" upsampling followed by a 1x1 conv, but in the implementation I saw that the Upsample kwarg is "nearest". Am I missing or misunderstanding something here? By the way, will this influence the feature representation much (based on your intuition or experiments)?
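For concreteness, here is a minimal NumPy sketch of the numerical difference the question is about: nearest-neighbor upsampling just repeats pixels, while bilinear upsampling blends neighboring values (shown here in the align_corners=False convention PyTorch defaults to). The function names are illustrative only and are not taken from the HRNet codebase.

```python
import numpy as np

def upsample_nearest(x, scale=2):
    """Nearest-neighbour upsampling: repeat each pixel (mode="nearest")."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

def upsample_bilinear(x, scale=2):
    """Bilinear upsampling of a 2-D map (align_corners=False convention)."""
    h, w = x.shape
    # Map output pixel centers back into input coordinates.
    ys = (np.arange(h * scale) + 0.5) / scale - 0.5
    xs = (np.arange(w * scale) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = np.clip(ys - y0, 0.0, 1.0)[:, None]   # vertical blend weights
    wx = np.clip(xs - x0, 0.0, 1.0)[None, :]   # horizontal blend weights
    top = x[y0][:, x0] * (1 - wx) + x[y0][:, x1] * wx
    bot = x[y1][:, x0] * (1 - wx) + x[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(upsample_nearest(x))   # blocky: each value repeated 2x2
print(upsample_bilinear(x))  # smooth: interpolated values in between
```

Note that in the fusion modules the upsample is paired with a 1x1 conv (and BN), so the learned conv may partially absorb the difference between the two modes; whether that matters in practice is exactly what the question asks the authors about.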