cvlab-stonybrook / LearningToCountEverything

MIT License

The multi-scale feature extraction module consists of the first four blocks from a pre-trained ResNet-50 backbone #29

Closed: ONEISALL-h closed this issue 2 years ago

ONEISALL-h commented 2 years ago

The proposed model uses a ResNet-50 backbone rather than a deeper network. Is there a specific reason for choosing ResNet-50?

Viresh-R commented 2 years ago

Hey, we use ResNet-50 because it is widely used and has a proven track record across a wide range of computer vision tasks. But we expect FamNet to work with other ImageNet-pretrained backbones as well, particularly those that work well for other dense prediction tasks such as semantic segmentation.

ONEISALL-h commented 2 years ago

Thanks for your timely response! You are right, ResNet-50 appears in many works and gives good results. ResNet also comes in other depths, such as 18, 34, 101, and 152. Is there a particular reason for choosing ResNet-50, or is it simply because many works have adopted it?

Viresh-R commented 2 years ago

ResNet-50 works better than ResNet-18 and ResNet-34 in our case. We didn't use ResNet-101 or ResNet-152 because we wanted a fair comparison with competing approaches that use ResNet-50.
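
For readers comparing the depths mentioned above, here is a minimal sketch of how the standard ResNet variants differ in the feature maps their four residual stages produce. It relies only on the published ResNet layout (blocks per stage, basic blocks for 18/34, bottleneck blocks with 4x channel expansion for 50/101/152). It is not taken from the FamNet code. The function names and the 384x384 input size are assumptions chosen for illustration.

```python
# Standard ResNet stage layout; an illustrative sketch, not FamNet's implementation.
BLOCKS_PER_STAGE = {
    18: (2, 2, 2, 2),
    34: (3, 4, 6, 3),
    50: (3, 4, 6, 3),
    101: (3, 4, 23, 3),
    152: (3, 8, 36, 3),
}
BASE_WIDTHS = (64, 128, 256, 512)   # per-stage base channel widths
STAGE_STRIDES = (4, 8, 16, 32)      # cumulative downsampling at each stage output


def uses_bottleneck(depth):
    """ResNet-18/34 use basic blocks; ResNet-50/101/152 use bottleneck blocks."""
    if depth not in BLOCKS_PER_STAGE:
        raise ValueError(f"unsupported ResNet depth: {depth}")
    return depth >= 50


def stage_shapes(depth, height, width):
    """Return (channels, h, w) of the output of each of the four residual stages."""
    expansion = 4 if uses_bottleneck(depth) else 1
    shapes = []
    for base, stride in zip(BASE_WIDTHS, STAGE_STRIDES):
        # Each stride-2 layer maps a size n to ceil(n / 2), so the combined
        # effect of the stem and stages is ceil(n / stride).
        h = -(-height // stride)
        w = -(-width // stride)
        shapes.append((base * expansion, h, w))
    return shapes


def weighted_layer_count(depth):
    """Count the stem conv, the convs inside residual blocks, and the final FC layer."""
    convs_per_block = 3 if uses_bottleneck(depth) else 2
    return 1 + convs_per_block * sum(BLOCKS_PER_STAGE[depth]) + 1


if __name__ == "__main__":
    # 384x384 is a hypothetical input size used only for this example.
    for depth in sorted(BLOCKS_PER_STAGE):
        print(depth, stage_shapes(depth, 384, 384))
```

ResNet-34 and ResNet-50 have the same number of blocks per stage. The difference is the block type, which gives ResNet-50 four times as many channels at every stage. ResNet-101 and ResNet-152 keep ResNet-50's channel widths and only add blocks, so a multi-scale module built on any of the three sees feature maps of the same shape.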

jaideep11061982 commented 1 year ago

@Viresh-R could you share the pretrained weights for the model?