cvlab-stonybrook / LearningToCountEverything

MIT License

Using pretrained MobileNet V3 as backbone #19

Closed (ajkailash closed this issue 2 years ago)

ajkailash commented 2 years ago

Hey,

Counting objects of interest in everyday scenes seems like a natural mobile vision application, with the low-compute, high-speed constraints that come with it. Did you consider using MobileNet V3 as the backbone of your network during training instead of ResNet-50? If so, which layers in the MobileNet V3 blocks did you use to generate the feature maps? Also, how well did it perform in terms of MAE during validation?

Thanks for the answers in advance.

Regards, Lakshmi Narayan

Viresh-R commented 2 years ago

Hey, we did not try MobileNet, but we did experiment with a quantized version of ResNet-50 on an Android device. Quantization does give a speedup, but it costs accuracy: the Val MAE worsens to ~38.5 for the quantized version.
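
For anyone comparing backbones against the MAE figure above: MAE here is the mean absolute difference between predicted and ground-truth object counts per image, so lower is better. Below is a minimal sketch of that computation. The helper name `mean_absolute_error` and the sample counts are made up for illustration; this is not the repository's evaluation code.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute difference between predicted and ground-truth counts."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one image is required")
    # Sum the per-image absolute count errors, then average over images.
    total_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total_error / len(predicted)


# Illustrative values only (not results from the paper or this thread).
predicted_counts = [12.0, 48.5, 103.0]  # hypothetical per-image predicted counts
true_counts = [10, 50, 100]             # hypothetical ground-truth counts
val_mae = mean_absolute_error(predicted_counts, true_counts)
print(f"Val MAE: {val_mae:.2f}")
```

Running the same helper over the validation split for each candidate backbone would give numbers that can be compared directly with the quantized ResNet-50 figure.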