dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization
MIT License

HRNet vs. VGG-16 #29

Closed pietz closed 1 year ago

pietz commented 2 years ago

Thanks for your work and for sharing the code. It seems that all your experiments use the HRNet architecture, which is a much more advanced model than the VGG-16 used in most other works. From my perspective, that makes it hard to judge how much of the improvement comes from the loss function you introduce and how much comes from the backbone alone.

What are your thoughts on this? Did you also run experiments with a VGG-16 backbone?

Thanks, Paul

dk-liang commented 1 year ago

Yes, please see the latest version.