zhreshold / mxnet-ssd

MXNet port of SSD: Single Shot MultiBox Object Detector. Reimplementation of https://github.com/weiliu89/caffe/tree/ssd
MIT License

Ask : Inception v3 #58

Closed edmBernard closed 7 years ago

edmBernard commented 7 years ago

Have you tested SSD with other networks for feature extraction, like Inception v3 or ResNet?

zhreshold commented 7 years ago

Not yet, though I'm planning to. Recently I've been stuck with other stuff. You are welcome to submit a PR if you like.

edmBernard commented 7 years ago

Just a question: do you prefer a PR on this repo or on the official mxnet/example/ssd?

zhreshold commented 7 years ago

Either one is good. I'm rewriting the framework, so later I'll submit a PR to the official one, don't worry about it.

edmBernard commented 7 years ago

OK, I started from mxnet/example/ssd. Until now, I just

I use this command to train:

export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python2 train.py --gpus 0 --batch-size 16 --network inception_v3 --pretrained model/Inception-7 --lr 0.002

I train with a 1070. I'll try to tune the parameters a bit. I accidentally removed the 2 convolution layers between VGG16 and the SSD layers. I need to retest with them. I will update.

My modifications are here

edmBernard commented 7 years ago

I got this result on Pascal VOC after 720 epochs:

class               inception_v3 (after 720 epochs)   vgg16_reduced (your pretrained version)
AP for aeroplane    0.6638                            0.7215
AP for bicycle      0.7589                            0.7969
AP for bird         0.6429                            0.7020
AP for boat         0.5778                            0.6523
AP for bottle       0.3163                            0.4272
AP for bus          0.7721                            0.7791
AP for car          0.7475                            0.8176
AP for cat          0.8105                            0.8579
AP for chair        0.4798                            0.5231
AP for cow          0.6568                            0.7472
AP for diningtable  0.6972                            0.7044
AP for dog          0.7795                            0.8346
AP for horse        0.8008                            0.8026
AP for motorbike    0.7677                            0.7706
AP for person       0.6867                            0.7441
AP for pottedplant  0.4370                            0.4651
AP for sheep        0.6546                            0.7340
AP for sofa         0.7062                            0.7148
AP for train        0.8344                            0.8400
AP for tvmonitor    0.6523                            0.6798
Mean AP             0.6721                            0.7157

Both were trained on trainval data. Evaluation was run with these commands:

export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python3 evaluate.py --network vgg16_reduced --batch-size 16 --epoch 0
export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python3 evaluate.py --network inception_v3 --batch-size 16 --epoch 720
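
As a quick sanity check on the table above, the "Mean AP" row can be reproduced as the unweighted mean of the 20 per-class APs, and the same data shows which classes lose the most against vgg16_reduced. The sketch below is not part of evaluate.py; the names `mean_ap` and `largest_gaps` are hypothetical helpers introduced here, and the numbers are copied by hand from the table.

```python
# Per-class APs copied from the table above (Pascal VOC, 20 classes).
# This is an illustrative check, not code from the repository.
CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]
INCEPTION_V3 = [
    0.6638, 0.7589, 0.6429, 0.5778, 0.3163, 0.7721, 0.7475, 0.8105,
    0.4798, 0.6568, 0.6972, 0.7795, 0.8008, 0.7677, 0.6867,
    0.4370, 0.6546, 0.7062, 0.8344, 0.6523,
]
VGG16_REDUCED = [
    0.7215, 0.7969, 0.7020, 0.6523, 0.4272, 0.7791, 0.8176, 0.8579,
    0.5231, 0.7472, 0.7044, 0.8346, 0.8026, 0.7706, 0.7441,
    0.4651, 0.7340, 0.7148, 0.8400, 0.6798,
]


def mean_ap(aps):
    """Unweighted mean of per-class average precisions."""
    return sum(aps) / len(aps)


def largest_gaps(classes, ours, baseline, top=5):
    """Return the `top` (gap, class) pairs where `baseline` beats `ours` the most."""
    gaps = [(round(b - o, 4), c) for c, o, b in zip(classes, ours, baseline)]
    return sorted(gaps, reverse=True)[:top]


print("inception_v3  mAP:", round(mean_ap(INCEPTION_V3), 4))
print("vgg16_reduced mAP:", round(mean_ap(VGG16_REDUCED), 4))
for gap, name in largest_gaps(CLASSES, INCEPTION_V3, VGG16_REDUCED):
    print(f"{name:12s} -{gap:.4f}")
```

The two means come out at 0.6721 and 0.7157, matching the reported row, and the largest drops are on bottle, cow, sheep, boat and car.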

zhreshold commented 7 years ago

I cannot say whether it's the network structure or the training hyper-parameters that affected the performance. But this is definitely not optimal. I will join you later.

edmBernard commented 7 years ago

I agree with you, it's definitely not optimal. Until now, I have tested 2 structures:

I don't have the resources to test lots of hyper-parameters. So I just looked for convergence; I haven't optimized them and I haven't fine-tuned yet. As far as I remember, your repo converges more easily and faster than mxnet/example/ssd. I'll try with it soon.

edmBernard commented 7 years ago

With the SSD version from your repo I got almost the same result:

class               inception_v3    vgg16_reduced
AP for aeroplane    0.6528          0.7215
AP for bicycle      0.7551          0.7969
AP for bird         0.6382          0.7020
AP for boat         0.5647          0.6523
AP for bottle       0.3018          0.4272
AP for bus          0.7858          0.7791
AP for car          0.7162          0.8176
AP for cat          0.8032          0.8579
AP for chair        0.4547          0.5231
AP for cow          0.6860          0.7472
AP for diningtable  0.6667          0.7044
AP for dog          0.7877          0.8346
AP for horse        0.8004          0.8026
AP for motorbike    0.7471          0.7706
AP for person       0.6670          0.7441
AP for pottedplant  0.4014          0.4651
AP for sheep        0.6334          0.7340
AP for sofa         0.6768          0.7148
AP for train        0.8216          0.8400
AP for tvmonitor    0.6665          0.6798
Mean AP             0.6614          0.7157

This time I tried to fine-tune and adjust some hyperparameters, but I didn't get a better result. Next, I will try to adjust layers in the Inception structure, like the conv layers between SSD and Inception, etc.

zhreshold commented 7 years ago

Okay, thanks for the updates

edmBernard commented 7 years ago

I trained Inception v3 with 2 intermediate layers between the Inception and SSD parts.

class               inception_v3          inception_v3          vgg16_reduced
                    with intermediate     without intermediate
                    conv layer            conv layer
AP for aeroplane    0.6580                0.6638                0.7215
AP for bicycle      0.7630                0.7589                0.7969
AP for bird         0.6454                0.6429                0.7020
AP for boat         0.5489                0.5778                0.6523
AP for bottle       0.2983                0.3163                0.4272
AP for bus          0.7682                0.7721                0.7791
AP for car          0.7437                0.7475                0.8176
AP for cat          0.7973                0.8105                0.8579
AP for chair        0.4805                0.4798                0.5231
AP for cow          0.6664                0.6568                0.7472
AP for diningtable  0.6840                0.6972                0.7044
AP for dog          0.7780                0.7795                0.8346
AP for horse        0.8008                0.8008                0.8026
AP for motorbike    0.7704                0.7677                0.7706
AP for person       0.6708                0.6867                0.7441
AP for pottedplant  0.4142                0.4370                0.4651
AP for sheep        0.6536                0.6546                0.7340
AP for sofa         0.6866                0.7062                0.7148
AP for train        0.8338                0.8344                0.8400
AP for tvmonitor    0.6778                0.6523                0.6798
Mean AP             0.6670                0.6721                0.7157

I tried adjusting hyperparameters and fine-tuning, but I didn't get a better result than vgg16.
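
To see how little the intermediate conv layers change things, one can tally per-class wins, ties and losses between the two inception_v3 columns of the table above. This is only a rough sketch over the reported numbers; `tally` is a hypothetical helper, not something from the repository, and with a single run per variant the differences may well be within training noise.

```python
# Per-class APs copied from the table above, in the same class order
# (aeroplane ... tvmonitor). Illustrative only.
WITH_INTERMEDIATE = [
    0.6580, 0.7630, 0.6454, 0.5489, 0.2983, 0.7682, 0.7437, 0.7973,
    0.4805, 0.6664, 0.6840, 0.7780, 0.8008, 0.7704, 0.6708,
    0.4142, 0.6536, 0.6866, 0.8338, 0.6778,
]
WITHOUT_INTERMEDIATE = [
    0.6638, 0.7589, 0.6429, 0.5778, 0.3163, 0.7721, 0.7475, 0.8105,
    0.4798, 0.6568, 0.6972, 0.7795, 0.8008, 0.7677, 0.6867,
    0.4370, 0.6546, 0.7062, 0.8344, 0.6523,
]


def tally(a, b):
    """Count classes where a is higher than, equal to, or lower than b."""
    wins = sum(x > y for x, y in zip(a, b))
    ties = sum(x == y for x, y in zip(a, b))
    losses = sum(x < y for x, y in zip(a, b))
    return wins, ties, losses


wins, ties, losses = tally(WITH_INTERMEDIATE, WITHOUT_INTERMEDIATE)
print(f"with intermediate layers: {wins} higher, {ties} equal, {losses} lower")

# Largest per-class difference in either direction.
biggest = max(abs(x - y) for x, y in zip(WITH_INTERMEDIATE, WITHOUT_INTERMEDIATE))
print(f"largest per-class difference: {biggest:.4f}")
```

On these numbers the variant with intermediate layers is higher on 6 classes, equal on 1 and lower on 13, with no class differing by more than about 0.03 AP.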

zhreshold commented 7 years ago

I am debugging some very fine details these days, and a new release is on the way after that. Let's see if it helps.

edmBernard commented 7 years ago

Do you think training could be better with a larger batch size? I'm still adjusting some hyperparameters.

zhreshold commented 7 years ago

I guess not. During my earlier tests, I found that a larger batch size was not any better. But I cannot be sure of that.

zhreshold commented 7 years ago

I've updated the master branch, would you like to try it out? It improves the mAP by 5-6% on the vgg model.

edmBernard commented 7 years ago

Cool, I'll test it as soon as possible, maybe tomorrow.

agshift commented 7 years ago

@edmBernard, can you share the GitHub repo where you implemented SSD with inception-v3? I would like to try it as well.

edmBernard commented 7 years ago

My version of SSD with Inception v3 is here: https://github.com/edmBernard/mxnet/tree/master/example/ssd I haven't had time to work on it recently and the results are not so good.

edmBernard commented 7 years ago

@zhreshold I'm closing the issue. You just added Inception and Resnet to your code. Thanks for your help.