leeyeehoo / CSRNet-pytorch

CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Why don't you use the full pretrained VGG16 architecture? #68

Open ThanhNhann opened 4 years ago

ThanhNhann commented 4 years ago

I have read your paper and don't understand why you use only the first ten layers of VGG-16, with just three pooling layers, instead of the full pre-trained VGG16 model. Thanks

doubbblek commented 4 years ago

I think the reason is that crowd counting does not need the deep features that carry semantic information. That semantic information might hurt performance, since we mainly need shallower features such as edges.
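
One effect of the truncation that can be checked directly is spatial resolution. The sketch below is only an illustration, not code from this repository: it assumes the standard VGG-16 layout of 13 conv layers and five 2x2 max-pools with stride 2, and the names `VGG16_FULL`, `truncate_after_n_convs`, `downsample_factor`, and `output_size` are hypothetical helpers. It compares how much the first ten conv layers and the full conv stack shrink an input.

```python
# Channel widths of VGG-16's 13 conv layers; "M" marks a 2x2 max-pool (stride 2).
# Assumed from the standard VGG-16 configuration, not copied from this repo.
VGG16_FULL = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]


def truncate_after_n_convs(cfg, n_convs):
    """Keep every layer up to and including the n-th conv layer."""
    kept, seen = [], 0
    for layer in cfg:
        if seen == n_convs:
            break
        kept.append(layer)
        if layer != "M":
            seen += 1
    return kept


def downsample_factor(cfg):
    """Each stride-2 pool halves height and width."""
    return 2 ** sum(1 for layer in cfg if layer == "M")


def output_size(cfg, height, width):
    """Feature-map size after all pools (3x3 convs with padding 1 keep the size)."""
    factor = downsample_factor(cfg)
    return height // factor, width // factor


FRONT_END = truncate_after_n_convs(VGG16_FULL, 10)

if __name__ == "__main__":
    for name, cfg in [("first 10 conv layers", FRONT_END), ("full VGG-16", VGG16_FULL)]:
        print(name, "-> downsample x%d," % downsample_factor(cfg),
              "768x1024 input becomes", output_size(cfg, 768, 1024))
```

With three pooling layers the feature map is 1/8 of the input in each dimension, while all five pools would reduce it to 1/32, which leaves far fewer cells for a per-pixel density map.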

ThanhNhann commented 4 years ago

@doubbblek Is there a relevant paper that mentions this? Thanks for your answer.