Hi, I followed your experiment setup and trained VGG-SSD from scratch (no pretraining, no BN in the backbone, no BN in the head), using the following hyperparameters (Caffe):
train batch size: 32, iter: 1, test batch size: 8
base_lr: 0.001
max_iter: 120000
type: SGD
weight_decay: 0.0005
gamma: 0.1
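For reference, the values listed above would fit into a Caffe `solver.prototxt` roughly like the sketch below. Note this is only my reconstruction: the `lr_policy`, `stepvalue`, and `momentum` lines are assumptions I did not list above, so treat them as placeholders rather than my actual settings.

```
# Sketch of a solver.prototxt assembling the hyperparameters listed above.
base_lr: 0.001
max_iter: 120000
type: "SGD"
weight_decay: 0.0005
gamma: 0.1            # LR decay factor
lr_policy: "multistep"  # assumed; gamma is applied at each stepvalue
# stepvalue entries omitted -- I did not list the decay iterations above
momentum: 0.9         # assumed common default, not stated above
```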
At 110k iterations I get 68.8% mAP, which is higher than the 67.6% mAP reported in the paper.