Object detection with multi-level representations generated from deep high-resolution representation learning (HRNetV2h). This is an official implementation for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
Apache License 2.0
How to solve the fuse dimension mismatch when a feature map is produced from an odd side length by stride-2 convs? #48
@leoxiaobin @sunke123 @bearcatt @wondervictor @bowenc0221 Hi, thanks for your work. I ran into a problem while implementing the HRNet backbone for Faster R-CNN in Detectron2. It concerns "_make_fuse_layers(self)". When a feature map with an odd side length is downsampled by stride-2 convs and the result is then upsampled with "nn.Upsample(scale_factor=2**(j-i), mode='nearest')" for fusion, the upsampled map no longer has the same size as the higher-resolution branch, which causes a dimension mismatch. This can easily happen in object detection because input resolutions vary so widely. Did you solve this by preprocessing the input, which may be detrimental to performance, or in some other way? Do you have any better solutions? Thank you very much; I am looking forward to your reply.
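To make the mismatch concrete, here is a minimal pure-Python sketch of the size arithmetic. It assumes the downsampling convs use kernel 3, stride 2, padding 1 (an assumption about the fuse layers, not something stated above), and the helper names `conv_out_size`, `downsample_then_upsample`, and `pad_to_multiple` are hypothetical, introduced only for illustration. It also shows one possible workaround, padding the input side up to a multiple of 32, where 32 is assumed to be the largest stride in the backbone.

```python
def conv_out_size(n, kernel=3, stride=2, padding=1):
    """Output side length of a conv layer (standard floor formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def downsample_then_upsample(n, levels):
    """Side length after `levels` stride-2 convs followed by a 2**levels upsample."""
    size = n
    for _ in range(levels):
        size = conv_out_size(size)
    return size * 2 ** levels


def pad_to_multiple(n, divisor=32):
    """Smallest multiple of `divisor` that is >= n (assumed workaround)."""
    return (n + divisor - 1) // divisor * divisor


for side in (64, 75):
    restored = downsample_then_upsample(side, levels=1)
    # An even side is restored exactly; an odd side comes back one pixel larger.
    print(side, restored, restored == side)

padded = pad_to_multiple(75)
# After padding, every branch can be upsampled back to the same size.
print(padded, downsample_then_upsample(padded, levels=3) == padded)
```

Padding is only one option and is not necessarily what the authors do. Another is to resize the lower-resolution map to the exact spatial size of the target branch instead of using a fixed scale factor, for example with `torch.nn.functional.interpolate(x, size=...)` in PyTorch.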