Validation accuracy for large model low, mistakes in model

xiaolai-sqlai / mobilenetv3

MobileNetV3 in PyTorch, with pre-trained models provided

MIT License

Validation accuracy for large model low, mistakes in model #8

Open rwightman opened 5 years ago

rwightman commented 5 years ago

As with #5, the validation accuracy for the large model is also well below the stated accuracy. I was curious because the stated result, beating the official one with 1.4M fewer parameters, would be impressive.

I only get: Prec@1 70.788 (29.212) Prec@5 89.410 (10.590)
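For readers unfamiliar with the notation, Prec@k is the percentage of validation samples whose true label is among the k highest-scoring classes. The value in parentheses appears to be the corresponding error (100 minus the precision), which is consistent with the numbers above. A minimal pure-Python sketch of the metric follows; the function name and the toy scores are hypothetical, not taken from the repo's validation script:

```python
def topk_precision(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    scores: one list of per-class scores per sample.
    labels: the true class index for each sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy data (hypothetical): 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.9, 0.05, 0.05],
]
labels = [1, 1, 0, 0]

prec1 = topk_precision(scores, labels, k=1)
prec2 = topk_precision(scores, labels, k=2)
print(f"Prec@1 {prec1:.3f} ({100 - prec1:.3f}) Prec@2 {prec2:.3f} ({100 - prec2:.3f})")
```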

Several things to fix in the model:

- Squeeze-excite (SE) layers should reduce the spatial dims with either a mean across the spatial dims or an avg pool. You have the avg pool in there but aren't using it.
- There should be no BN in the SE module.
- The SE module should be applied between the 3x3 DW conv and the 1x1 PWL, not after the PWL.
- As per the paper, the reduction for the SE layer in MobileNetV3 should be applied to the expanded width.
- There were mistakes in the last block of 5x5 convs in the paper. They have been fixed in a newer version: the location of the last stride 2 changed, and one of the 672 expansions should be 960.
- There should be no batch norm after the linear layer before the classifier layer.
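The SE points in the list above can be sketched in plain Python on nested lists. This is an illustration under assumptions, not the repo's code or the official implementation: the function names and toy sizes are hypothetical, biases are omitted for brevity, and the 1/4 reduction ratio and hard-sigmoid gate follow my reading of the MobileNetV3 paper. In a real block this step would run on the output of the depthwise conv, before the 1x1 pointwise-linear projection.

```python
def hard_sigmoid(x):
    # relu6(x + 3) / 6, a piecewise-linear approximation of the sigmoid.
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Apply squeeze-excite to a C x H x W feature map given as nested lists.

    w_reduce: R x C weights of the reducing layer (biases omitted here).
    w_expand: C x R weights of the expanding layer (biases omitted here).
    """
    # Squeeze: mean across the spatial dims, giving one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Reduce, then ReLU. Note there is no batch norm anywhere in this module.
    reduced = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w_reduce
    ]
    # Expand back to C channels and gate each one into the range [0, 1].
    gates = [
        hard_sigmoid(sum(w * r for w, r in zip(weights, reduced)))
        for weights in w_expand
    ]
    # Excite: rescale every value of a channel by that channel's gate.
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


# Hypothetical toy sizes: the reduction is taken from the EXPANDED width.
expanded_channels = 8
se_channels = expanded_channels // 4

# Channel c is a 2x2 map filled with the value c.
feature_map = [[[float(c)] * 2 for _ in range(2)] for c in range(expanded_channels)]
w_reduce = [[0.0] * expanded_channels for _ in range(se_channels)]
w_expand = [[0.0] * se_channels for _ in range(expanded_channels)]

# With all-zero weights every gate is hard_sigmoid(0) = 0.5, so values are halved.
out = squeeze_excite(feature_map, w_reduce, w_expand)
```

Zero weights are used only to make the result easy to check by hand; a trained module would use learned weights and biases.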

xiaolai-sqlai commented 5 years ago

I added some training tricks; important ones like warmup and a cosine learning rate schedule are really useful. Besides, I use DALI by NVIDIA to load the data.
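For reference, a warmup plus cosine learning rate schedule can be sketched as below. This is a generic illustration, not the schedule used for the released weights: the function name, the linear form of the warmup, and the default values for `base_lr` and `warmup_steps` are all assumptions.

```python
import math


def lr_at(step, total_steps, base_lr=0.1, warmup_steps=500):
    """Learning rate at a given step: linear warmup, then cosine decay to zero.

    All default values are hypothetical placeholders.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay: progress runs from 0 to 1 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few steps of a hypothetical 1000-step run.
for step in (0, 50, 99, 100, 550, 1000):
    print(step, round(lr_at(step, total_steps=1000, base_lr=0.1, warmup_steps=100), 6))
```

The warmup avoids large updates while the weights and normalization statistics are still far from stable, and the cosine phase lowers the rate smoothly instead of in discrete drops.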

xiaolai-sqlai commented 5 years ago

I think the main cause is the data loader. I will reproduce the result using the PyTorch data loader instead of DALI.

JTzhuang commented 4 years ago

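After loading the model, the validation accuracy drops sharply in the first epoch and returns to normal from the second epoch onward. What could be the cause?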

rwightman commented 4 years ago

Revisiting this. Google finally released their official version of MobileNet-V3 a few weeks ago now. It confirmed the known issues mentioned here and several more: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet

I have also validated my own version of MobileNet-V3 in PyTorch. I trained it from scratch back in May and reproduced the paper accuracy with a standard PyTorch data loader and preprocessing configuration. With the official TensorFlow release, I noticed a few small differences and have updated mine to include a TensorFlow-compatible version with weights from the official release. https://github.com/rwightman/gen-efficientnet-pytorch

TKONIY commented 4 years ago

Wow, thanks a lot. I tried to find it in the official repo, but I only saw MobileNetV2 at that time. You and your repo are a really great help.

rebeen commented 4 years ago

I have been looking at some implementations of MobileNetV3, but I have not seen the AutoML part in the code. How does this work with MobileNetV3?

rwightman commented 4 years ago

@rebeen I have not seen a full implementation of the MobileNetV3 AutoML search (platform-aware NAS (MnasNet) + NetAdapt) that would reproduce these networks. The platform-aware NAS is a reinforcement-learning-based method, and those are generally quite expensive to run, even with constraints on the architecture.

However, there are other search algorithms and bits and pieces out there that work with the same building blocks:

rebeen commented 4 years ago

@rwightman Thank you very much for your detailed explanation. I will check the links you have provided.