zhaoweicai / mscnn

Caffe implementation of our multi-scale object detection framework

How to modify the feature extraction network? #70

Open hkzhang95 opened 7 years ago

hkzhang95 commented 7 years ago

Hi, I have been using your great method for several days, and I have two questions:

  1. Your method is built for the KITTI dataset, and you resize the images to 2560x768. I tried your ped-8s-768 method on my 2048x1536 images and it works fine, but I think it resizes my images to an awkward aspect ratio. I tried changing resize_width and resize_height to 2048 and 1536, but this gave worse results. So if I change the image size, which parameters should I modify?

  2. The base network in your method is VGG16, which I find slow. Could you tell me how to change the base network, for example to PVANet?

Thank you very much!

zhaoweicai commented 7 years ago
  1. 2560x768 is the size after upsampling the original image by a factor of 2. There is no need to change the aspect ratio of your images. Just keep the aspect ratio and upsample the image to the size you want. One small suggestion is to keep both dimensions multiples of 32. You may also need to change the anchor sizes to match your input upsampling.
  2. I don't think anything prevents extending the idea to other backbone networks. However, different networks have different properties, so I can't guarantee it will work well with other networks.
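
To make the advice in point 1 concrete, here is a minimal Python sketch of the arithmetic: scale an image size while keeping its aspect ratio, snap each dimension to the nearest multiple of 32, and scale anchor sizes by the same factor. It is not code from the mscnn repository. The helper names `fit_size` and `scale_anchors` and the example anchor values are hypothetical, and the resulting numbers would still have to be entered by hand into resize_width, resize_height, and the anchor settings of your own configuration.

```python
def fit_size(width, height, scale, multiple=32):
    """Scale (width, height) by `scale`, then snap each side to the
    nearest multiple of `multiple` (ties round up).

    Hypothetical helper for illustration; not part of mscnn.
    """
    def snap(value):
        # int(x + 0.5) rounds half up for positive x, and max(1, ...)
        # keeps the result from collapsing to zero.
        return max(1, int(value / multiple + 0.5)) * multiple

    return snap(width * scale), snap(height * scale)


def scale_anchors(anchors, scale):
    """Scale a list of (anchor_w, anchor_h) pairs by the same factor
    that was applied to the input image. Hypothetical helper."""
    return [(w * scale, h * scale) for (w, h) in anchors]


if __name__ == "__main__":
    # The 2048x1536 images from the question are already multiples of 32.
    print(fit_size(2048, 1536, 1.0))
    # Halving them keeps the 4:3 aspect ratio.
    print(fit_size(2048, 1536, 0.5))
    # Made-up anchor sizes, scaled for a 2x upsampled input.
    print(scale_anchors([(40, 80), (56, 112)], 2.0))
```

Because the snapping step can change each side by up to 16 pixels, the aspect ratio is only approximately preserved when the scaled size is not already a multiple of 32.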