Ask : Inception v3 - Githubissues

edmBernard commented 7 years ago

Have you tested SSD with other networks for feature extraction, such as Inception v3 or ResNet?

zhreshold commented 7 years ago

Not yet, though I'm planning to. I've been tied up with other work recently. You are welcome to submit a PR if you like.

edmBernard commented 7 years ago

Just a question: would you prefer a PR on this repo or on the official mxnet/example/ssd?

zhreshold commented 7 years ago

Either one is fine. I'm rewriting the framework, so I'll submit a PR to the official one later. Don't worry about it.

edmBernard commented 7 years ago

OK, I started from mxnet/example/ssd. So far, I have just:

copied/pasted Inception from classification
used the pretrained model from the MXNet gallery

I made a naive connection between Inception and SSD, and it seems to be starting to converge:

INFO:root:Epoch[14] Train-Acc=0.840990
INFO:root:Epoch[14] Train-ObjectAcc=0.464270
INFO:root:Epoch[14] Train-SmoothL1=14.017799
INFO:root:Epoch[14] Time cost=399.309
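
To check convergence over many epochs, it can help to pull these metrics out of the log rather than read them by eye. Below is a minimal sketch that assumes every metric line has the same `Epoch[N] Name=value` shape as the four lines above; the function name `parse_metrics` and the regular expression are my own illustration, not part of the training script.

```python
import re

# Sample lines copied from the training log above.
LOG = """\
INFO:root:Epoch[14] Train-Acc=0.840990
INFO:root:Epoch[14] Train-ObjectAcc=0.464270
INFO:root:Epoch[14] Train-SmoothL1=14.017799
INFO:root:Epoch[14] Time cost=399.309
"""

# Assumed line shape: "Epoch[<int>] <metric name>=<float>".
PATTERN = re.compile(r"Epoch\[(\d+)\]\s+([\w\- ]+?)=([0-9.eE+\-]+)")


def parse_metrics(text):
    """Return {epoch: {metric_name: value}} for every matching log line."""
    metrics = {}
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not carry a metric
        epoch, name, value = match.groups()
        metrics.setdefault(int(epoch), {})[name] = float(value)
    return metrics


metrics = parse_metrics(LOG)
for epoch in sorted(metrics):
    print(epoch, metrics[epoch])
```

Fed the full log file instead of the sample string, the resulting dictionary can be plotted per metric to see whether Train-SmoothL1 keeps falling.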

I used this command to train:

export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python2 train.py --gpus 0 --batch-size 16 --network inception_v3 --pretrained model/Inception-7 --lr 0.002

I'm training on a 1070. I'll try to tune the parameters a bit. I accidentally removed the 2 convolution layers between VGG16 and the SSD layers, so I need to retest with them. I will post an update.

My modifications are here.

edmBernard commented 7 years ago

I got these results on Pascal VOC after 720 epochs:

inception_v3 (after 720 epochs)	vgg16_reduced (your pretrained version)
AP for aeroplane = 0.6638	AP for aeroplane = 0.7215
AP for bicycle = 0.7589	AP for bicycle = 0.7969
AP for bird = 0.6429	AP for bird = 0.7020
AP for boat = 0.5778	AP for boat = 0.6523
AP for bottle = 0.3163	AP for bottle = 0.4272
AP for bus = 0.7721	AP for bus = 0.7791
AP for car = 0.7475	AP for car = 0.8176
AP for cat = 0.8105	AP for cat = 0.8579
AP for chair = 0.4798	AP for chair = 0.5231
AP for cow = 0.6568	AP for cow = 0.7472
AP for diningtable = 0.6972	AP for diningtable = 0.7044
AP for dog = 0.7795	AP for dog = 0.8346
AP for horse = 0.8008	AP for horse = 0.8026
AP for motorbike = 0.7677	AP for motorbike = 0.7706
AP for person = 0.6867	AP for person = 0.7441
AP for pottedplant = 0.4370	AP for pottedplant = 0.4651
AP for sheep = 0.6546	AP for sheep = 0.7340
AP for sofa = 0.7062	AP for sofa = 0.7148
AP for train = 0.8344	AP for train = 0.8400
AP for tvmonitor = 0.6523	AP for tvmonitor = 0.6798
Mean AP = 0.6721	Mean AP = 0.7157

Both were trained on the trainval data. Evaluation was run with these commands:

export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python3 evaluate.py --network vgg16_reduced --batch-size 16 --epoch 0
export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 && python3 evaluate.py --network inception_v3 --batch-size 16 --epoch 720
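
As a sanity check on the table above, the Mean AP row should be the unweighted average of the 20 per-class APs. The sketch below recomputes it from the numbers in the table and also lists the per-class gap between the two networks. The variable and function names are only for illustration; this is not the code used by evaluate.py.

```python
# Per-class AP copied from the table above: (inception_v3, vgg16_reduced).
AP = {
    "aeroplane": (0.6638, 0.7215),
    "bicycle": (0.7589, 0.7969),
    "bird": (0.6429, 0.7020),
    "boat": (0.5778, 0.6523),
    "bottle": (0.3163, 0.4272),
    "bus": (0.7721, 0.7791),
    "car": (0.7475, 0.8176),
    "cat": (0.8105, 0.8579),
    "chair": (0.4798, 0.5231),
    "cow": (0.6568, 0.7472),
    "diningtable": (0.6972, 0.7044),
    "dog": (0.7795, 0.8346),
    "horse": (0.8008, 0.8026),
    "motorbike": (0.7677, 0.7706),
    "person": (0.6867, 0.7441),
    "pottedplant": (0.4370, 0.4651),
    "sheep": (0.6546, 0.7340),
    "sofa": (0.7062, 0.7148),
    "train": (0.8344, 0.8400),
    "tvmonitor": (0.6523, 0.6798),
}


def mean_ap(column):
    """Unweighted mean over classes; column 0 = inception_v3, 1 = vgg16_reduced."""
    return sum(values[column] for values in AP.values()) / len(AP)


inception_map = mean_ap(0)
vgg_map = mean_ap(1)

# Positive gap means vgg16_reduced scores higher on that class.
gaps = {cls: vgg - inc for cls, (inc, vgg) in AP.items()}
worst = max(gaps, key=gaps.get)

print(f"inception_v3 mean AP:  {inception_map:.4f}")
print(f"vgg16_reduced mean AP: {vgg_map:.4f}")
print(f"largest gap: {worst} ({gaps[worst]:.4f})")
```

The recomputed means agree with the reported 0.6721 and 0.7157, and the largest per-class gap is on bottle, which is also the lowest-scoring class for both networks.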

zhreshold commented 7 years ago

I can't tell whether it's the network structure or the training hyper-parameters that affected the performance, but this is definitely not optimal. I will join you later.

edmBernard commented 7 years ago

I agree with you, it's definitely not optimal. So far, I have tested 2 structures:

Inception layers + SSD layers: this network gives the previous results
Inception layers + 2 conv layers + SSD layers: this network doesn't converge

I didn't have the resources to test lots of hyper-parameters, so I only looked for values that converge. I haven't optimized them and I haven't fine-tuned yet. As far as I remember, your repo converges more easily and faster than mxnet/example/ssd. I'll try with it soon.

edmBernard commented 7 years ago

With the SSD version from your repo I got almost the same results:

inception_v3	vgg16_reduced
AP for aeroplane = 0.6528	AP for aeroplane = 0.7215
AP for bicycle = 0.7551	AP for bicycle = 0.7969
AP for bird = 0.6382	AP for bird = 0.7020
AP for boat = 0.5647	AP for boat = 0.6523
AP for bottle = 0.3018	AP for bottle = 0.4272
AP for bus = 0.7858	AP for bus = 0.7791
AP for car = 0.7162	AP for car = 0.8176
AP for cat = 0.8032	AP for cat = 0.8579
AP for chair = 0.4547	AP for chair = 0.5231
AP for cow = 0.6860	AP for cow = 0.7472
AP for diningtable = 0.6667	AP for diningtable = 0.7044
AP for dog = 0.7877	AP for dog = 0.8346
AP for horse = 0.8004	AP for horse = 0.8026
AP for motorbike = 0.7471	AP for motorbike = 0.7706
AP for person = 0.6670	AP for person = 0.7441
AP for pottedplant = 0.4014	AP for pottedplant = 0.4651
AP for sheep = 0.6334	AP for sheep = 0.7340
AP for sofa = 0.6768	AP for sofa = 0.7148
AP for train = 0.8216	AP for train = 0.8400
AP for tvmonitor = 0.6665	AP for tvmonitor = 0.6798
Mean AP = 0.6614	Mean AP = 0.7157

This time I tried fine-tuning and adjusting some hyper-parameters, but I didn't get better results. Next, I will try adjusting the layers in the Inception structure, such as the conv layers between SSD and Inception.

zhreshold commented 7 years ago

Okay, thanks for the updates

edmBernard commented 7 years ago

I trained Inception v3 with 2 intermediate conv layers between the Inception and SSD parts.

class	inception_v3 with intermediate conv layer	inception_v3 without intermediate conv layer	vgg16_reduced
AP for aeroplane	0.6580	0.6638	0.7215
AP for bicycle	0.7630	0.7589	0.7969
AP for bird	0.6454	0.6429	0.7020
AP for boat	0.5489	0.5778	0.6523
AP for bottle	0.2983	0.3163	0.4272
AP for bus	0.7682	0.7721	0.7791
AP for car	0.7437	0.7475	0.8176
AP for cat	0.7973	0.8105	0.8579
AP for chair	0.4805	0.4798	0.5231
AP for cow	0.6664	0.6568	0.7472
AP for diningtable	0.6840	0.6972	0.7044
AP for dog	0.7780	0.7795	0.8346
AP for horse	0.8008	0.8008	0.8026
AP for motorbike	0.7704	0.7677	0.7706
AP for person	0.6708	0.6867	0.7441
AP for pottedplant	0.4142	0.4370	0.4651
AP for sheep	0.6536	0.6546	0.7340
AP for sofa	0.6866	0.7062	0.7148
AP for train	0.8338	0.8344	0.8400
AP for tvmonitor	0.6778	0.6523	0.6798
Mean AP	0.6670	0.6721	0.7157

I tried adjusting hyper-parameters and fine-tuning, but I didn't get better results than vgg16.

zhreshold commented 7 years ago

I am debugging some much finer details these days, and a new release will follow after that. Let's see if it helps.

edmBernard commented 7 years ago

Do you think training could be better with a larger batch size? I'm still adjusting some hyper-parameters.

zhreshold commented 7 years ago

I guess not. In my earlier tests, I found that a larger batch size was not any better. But I can't guarantee that.

zhreshold commented 7 years ago

I've updated the master branch. Would you like to try it out? It improves the mAP by 5-6% on the VGG model.

edmBernard commented 7 years ago

Cool, I'll test it as soon as possible, maybe tomorrow.

agshift commented 7 years ago

@edmBernard, can you share the GitHub repo where you implemented SSD with Inception v3? I would like to try it as well.

edmBernard commented 7 years ago

My version of SSD with Inception v3 is here: https://github.com/edmBernard/mxnet/tree/master/example/ssd. I haven't had time to work on it recently, and the results are not very good.

edmBernard commented 7 years ago

@zhreshold I'm closing the issue, since you just added Inception and ResNet to your code. Thanks for your help.

zhreshold / mxnet-ssd

Ask : Inception v3 #58