HRNet / HRNet-Object-Detection

Object detection with multi-level representations generated from deep high-resolution representation learning (HRNetV2h). This is an official implementation for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
Apache License 2.0

How to solve the fuse-layer dimension mismatch when a feature map with an odd side length is downsampled by stride-2 convs? #48

Open GA-17a opened 3 years ago

GA-17a commented 3 years ago

@leoxiaobin @sunke123 @bearcatt @wondervictor @bowenc0221 Hi, thanks for your work. I ran into a problem while implementing the HRNet backbone for Faster R-CNN with Detectron2. It concerns "_make_fuse_layers(self)". When a feature map with an odd side length is downsampled by a stride-2 conv, the output side length is rounded up. When we then upsample that output with "nn.Upsample(scale_factor=2**(j-i), mode='nearest')" for fusion, the result is larger than the original feature map, which causes a dimension mismatch. This can easily happen in object detection because input resolutions vary widely. Do you solve this by preprocessing the input (which may hurt performance), or in some other way? Do you have any better solutions? Thank you very much, I look forward to your reply.
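The mismatch described above can be reproduced with plain size arithmetic. The sketch below is only an illustration, not the authors' answer: it assumes 3x3 stride-2 convs with padding 1 between branches and four branches, and all helper names (`conv_out_size`, `branch_sizes`, `fuse_mismatches`, `pad_to_multiple`) are hypothetical. It also shows one common workaround, padding the input so its side length is divisible by 2**(num_branches-1). Another option, not shown here, would be to resize to the target branch's exact size (for example with `torch.nn.functional.interpolate(x, size=...)`) instead of using a fixed scale factor.

```python
def conv_out_size(n, kernel=3, stride=2, padding=1):
    """Output side length of a conv layer (assumed 3x3, stride 2, padding 1)."""
    return (n + 2 * padding - kernel) // stride + 1


def branch_sizes(n, num_branches=4):
    """Side length of each branch, where branch k+1 is a stride-2 conv of branch k."""
    sizes = [n]
    for _ in range(num_branches - 1):
        sizes.append(conv_out_size(sizes[-1]))
    return sizes


def fuse_mismatches(n, num_branches=4):
    """List (i, j, upsampled, expected) where upsampling branch j by a
    fixed scale factor 2**(j-i) does not reproduce the size of branch i."""
    sizes = branch_sizes(n, num_branches)
    bad = []
    for i in range(num_branches):
        for j in range(i + 1, num_branches):
            upsampled = sizes[j] * 2 ** (j - i)  # integer scale factor multiplies the size
            if upsampled != sizes[i]:
                bad.append((i, j, upsampled, sizes[i]))
    return bad


def pad_to_multiple(n, divisor):
    """Smallest multiple of `divisor` that is >= n (hypothetical padding helper)."""
    return (n + divisor - 1) // divisor * divisor


if __name__ == "__main__":
    print(branch_sizes(25))          # odd side length: sizes round up at each stage
    print(fuse_mismatches(25))       # every fuse pair disagrees
    padded = pad_to_multiple(25, 2 ** 3)
    print(padded, fuse_mismatches(padded))  # padded input: no mismatches
```

Note that the divisor here is relative to the highest-resolution branch. If the backbone's stem already downsamples the image, the divisor for the input image would have to be multiplied by the stem's stride.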