weiliu89 / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

python tool to generate ssd networks prototxt #92

Open jay-mahadeokar opened 8 years ago

jay-mahadeokar commented 8 years ago

@weiliu89 Thanks for the wonderful and easy-to-use framework for object detection. I have released the pynetbuilder tool for generating network files for popular Caffe models. It also has an implementation of ResNet + SSD; I could get 70.4% mAP on VOC2007 with ResNet as the base network. If you think this tool could be useful, can we add a pointer to it somewhere in your code's README? Any feedback is welcome!

kristellmarisse commented 8 years ago

Thank you! I was looking to train and test ResNet + SSD.

forever3000 commented 8 years ago

Thanks. However, I always run out of memory with the ResNet-50 model on my PC with 4x GTX 980. What is the minimum GPU memory needed for ResNet training?

kristellmarisse commented 8 years ago

I am using a GTX 960 (4 GB). I was able to start training after reducing the batch size from 8 to 2 in train.prototxt and test.prototxt. I am also fine-tuning on the PASCAL VOC dataset from a ResNet-50 caffemodel trained on ImageNet. The training loss stops decreasing after some time.

forever3000 commented 8 years ago

Hi @kristellmarisse, can you show me the Python script for training and testing with the ResNet model? Thanks.

kristellmarisse commented 8 years ago

I am not using any Python script to start the training. Instead, I downloaded the prototxts from the pynetbuilder page and started the training by running the following command from the SSD Caffe root:

./build/tools/caffe train -solver path-to-solver/solver.prototxt -weights path-to-pretrained-model/ResNet-50-model.caffemodel -gpu 0

I am using only one GPU for training and the pre-trained model is from this page:

https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777

I am not sure if I am doing it right since the training doesn't seem to converge after some point.

forever3000 commented 8 years ago

Hi @kristellmarisse, are you running this with the SSD version of Caffe or with plain Caffe?

kristellmarisse commented 8 years ago

@forever3000: It is definitely SSD, since I don't have plain Caffe installed on my PC.

forever3000 commented 8 years ago

Hi @kristellmarisse ,

Yes, I will test it. Please post an update if you get better results.

Thanks

forever3000 commented 8 years ago

@kristellmarisse, did you try to run detection after training with ResNet? I tried, but it failed. I'm using examples/ssd_detect.ipynb to run detection on each image. It reports: Unknown bottom blob 'label'. Do you have any idea? Thanks.

kristellmarisse commented 8 years ago

Can you share your deploy.prototxt for ResNet? What was your training loss at the end, and did you fine-tune from the ImageNet model or train from scratch?

forever3000 commented 8 years ago

I fine-tuned from the ResNet-50 model at this link: https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777

This is my result after about 40k iterations:

Train net output #0: mbox_loss = 0.613243

deploy.prototxt is attached. I'm not sure everything is set up correctly.

deploy.prototxt.txt

kristellmarisse commented 8 years ago

Thanks for the deploy.prototxt. The last layer name is 'detection_out', not 'label', so change your Python script accordingly to deploy the trained model. Did you use multiple GPUs to train, and what was your batch size?
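For reference, a minimal sketch of reading the 'detection_out' blob mentioned above. It assumes the usual SSD DetectionOutput layout, a blob of shape [1, 1, N, 7] in which each row is [image_id, label, confidence, xmin, ymin, xmax, ymax] with normalized coordinates. The function name `parse_detections` and the 0.6 threshold are illustrative choices, not part of the SSD code.

```python
def parse_detections(detections, conf_threshold=0.6):
    """Return (label, confidence, box) tuples at or above a confidence threshold.

    `detections` is assumed to be the 'detection_out' blob, indexable as
    detections[0][0][i] -> [image_id, label, conf, xmin, ymin, xmax, ymax].
    Nested lists and NumPy arrays both work with this indexing.
    """
    results = []
    for row in detections[0][0]:
        image_id, label, conf, xmin, ymin, xmax, ymax = [float(v) for v in row]
        if conf >= conf_threshold:
            results.append((int(label), conf, (xmin, ymin, xmax, ymax)))
    return results


if __name__ == "__main__":
    # Hypothetical output with two candidate boxes; only the first passes 0.6.
    fake = [[[
        [0, 12, 0.91, 0.10, 0.20, 0.50, 0.80],
        [0, 7, 0.30, 0.40, 0.40, 0.60, 0.70],
    ]]]
    print(parse_detections(fake))
```

With pycaffe, the input would typically come from something like net.forward()['detection_out'], assuming the deploy network ends in a layer with that top name.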

forever3000 commented 8 years ago

Yes, I'm using 4x GTX 980 for training. However, the GTX 980 has only 4 GB of memory, so the batch size is also set to 2 instead of 8.

kristellmarisse commented 8 years ago

I am using the same method to train, except for the number of GPUs. So does the GPU count affect training convergence?

weiliu89 commented 8 years ago

@jay-mahadeokar Thanks for providing the link. I will add it to the README.md later.

@kristellmarisse @forever3000 I have uploaded a script to train with ResNet101. You could change batch_size to a smaller number and keep accum_batch_size at 32 from here. This might lead to inferior results because it will affect the batch_norm statistics.
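The relationship between batch_size and accum_batch_size can be sketched as follows. This is a minimal illustration, assuming the accumulated batch is reached through Caffe's solver field iter_size (gradients are accumulated over that many forward/backward passes before each update). The function name `solver_batch_settings` is hypothetical, and the arithmetic is an approximation of what a training script would do, not a copy of it.

```python
import math


def solver_batch_settings(batch_size, accum_batch_size, num_gpus):
    """Derive the per-device batch size and iter_size for gradient accumulation.

    batch_size:       images processed per forward/backward pass (all devices)
    accum_batch_size: images that should contribute to one weight update
    num_gpus:         number of GPUs used (0 means CPU)
    """
    if num_gpus > 0:
        # Split the batch across devices, rounding up.
        batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
        images_per_pass = batch_size_per_device * num_gpus
    else:
        batch_size_per_device = batch_size
        images_per_pass = batch_size
    # Number of passes to accumulate before one solver update.
    iter_size = int(math.ceil(float(accum_batch_size) / images_per_pass))
    return batch_size_per_device, iter_size


if __name__ == "__main__":
    # One 4 GB GPU, batch size reduced to 2, accumulated batch kept at 32.
    print(solver_batch_settings(batch_size=2, accum_batch_size=32, num_gpus=1))
```

Accumulation only restores the gradient batch size. Batch-norm statistics are still computed over the small per-pass batch, which is the caveat raised above.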

jay-mahadeokar commented 8 years ago

@weiliu89 Thanks! Wow, did you finish training SSD on the ResNet101 provided here? Can you please share what mAP you got on VOC2007? The best I could get with ResNet50 is 70.4, though I used my own pretrained network, which doesn't match the accuracy in the ResNet paper (71.8% vs. 75% top-1 as reported in the paper).

weiliu89 commented 8 years ago

@jay-mahadeokar You can check 0b3406b. It is about 72.9 mAP with ResNet101. I am not sure if my current setting is optimal, though. I personally don't see a big advantage of ResNet over VGGNet, but I do observe that ResNet converges faster.

jay-mahadeokar commented 8 years ago

@weiliu89 Thanks! You're correct, the mAP improvement is marginal. The only advantage I can see is that resnet_50 is much cheaper in terms of FLOPs than VGGNet (refer to this table), though I am still not able to match the VGG mAP with resnet_50. I am not sure my setting is optimal!

forever3000 commented 8 years ago

Hi @jay-mahadeokar, according to this table, does it mean the detection time of resnet_50 is faster than that of the reduced VGGNet (SSD paper)? Thanks.

jay-mahadeokar commented 8 years ago

I believe so, though I haven't benchmarked runtime on CPU. FYI, I am using this code to compute the FLOPs.
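As a rough illustration of what a FLOP count for a convolution layer involves, here is a minimal sketch. It is not the pynetbuilder code linked above. It counts multiply-accumulate operations only (ignoring bias, batch norm, and activations), and the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride, pad, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * pad - effective_kernel) // stride + 1


def conv_macs(in_h, in_w, in_ch, out_ch, kernel, stride=1, pad=0, dilation=1):
    """Multiply-accumulates for one square-kernel convolution layer."""
    out_h = conv_output_size(in_h, kernel, stride, pad, dilation)
    out_w = conv_output_size(in_w, kernel, stride, pad, dilation)
    # Each output value needs kernel * kernel * in_ch multiply-accumulates.
    return kernel * kernel * in_ch * out_ch * out_h * out_w


if __name__ == "__main__":
    # A 7x7, stride-2, pad-3 convolution from 3 to 64 channels on a 300x300 input.
    print(conv_macs(300, 300, 3, 64, kernel=7, stride=2, pad=3))
```

Summing this quantity over all convolution layers gives a first-order compute estimate. Measured detection time also depends on memory traffic and the implementation, so FLOPs alone do not settle the speed question.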

forever3000 commented 8 years ago

@jay-mahadeokar thanks,

I also modified @weiliu89's script to support the resnet_50 model. However, I got this error when I tried to fine-tune from the ResNet-50 pre-trained model at this link: https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777

Error : Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) Incompatible number of blobs for layer conv1
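This check compares, layer by layer, how many parameter blobs the network definition expects with how many the caffemodel stores. In Caffe, a convolution layer with a bias term has two blobs (weights and bias) and one without has a single blob, so "1 vs. 2" suggests that conv1 in the generated network has no bias while conv1 in the pretrained model does. A minimal sketch of finding such mismatches before training follows. It assumes the blob counts have already been collected into plain dictionaries, and the function name and the example counts are illustrative.

```python
def find_blob_mismatches(target_blobs, source_blobs):
    """Return {layer: (target_count, source_count)} for layers that disagree.

    target_blobs: layer name -> blob count in the network being trained
    source_blobs: layer name -> blob count in the pretrained model
    Layers present in only one of the two are ignored, because weights are
    copied by matching layer names.
    """
    mismatches = {}
    for name, target_count in target_blobs.items():
        if name in source_blobs and source_blobs[name] != target_count:
            mismatches[name] = (target_count, source_blobs[name])
    return mismatches


if __name__ == "__main__":
    # Illustrative counts: conv1 without bias in the new net, with bias in the model.
    target = {"conv1": 1, "bn_conv1": 3, "mbox_loc": 2}
    source = {"conv1": 2, "bn_conv1": 3}
    print(find_blob_mismatches(target, source))
```

With pycaffe, one way to build such a dictionary would be {name: len(blobs) for name, blobs in net.params.items()}, assuming each model can be loaded with a prototxt that matches it.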

kyodaisuki commented 8 years ago

If you turn off batch normalization for conv1 in the resnet_50 model, the error goes away. I don't understand what is going on, though.

forever3000 commented 8 years ago

@kyodaisuki Thanks, It worked like a charm :D.

kristellmarisse commented 8 years ago

I have trained on VOC using ResNet101. Since I trained on 8 AWS GPUs, I used a smaller batch size (the accum batch size was still 32). I want to compare the performance with a model trained on VOC with the default batch size settings. Can anyone share the caffemodel so that I can compare?

forever3000 commented 8 years ago

@kristellmarisse: Can you share your results with the smaller batch size? On my side, it looks very hard to converge with 4x GTX 980 and a batch size of 2.

Anfield-Uncle commented 7 years ago

@kristellmarisse I trained resnet50 as you did, and the loss is declining, but when I test the net it cannot detect anything. It is strange. The data is just VOC07+12. I just used the train(test).prototxt, solver.prototxt, and the pretrained model available online. Does anybody have any advice? Thanks.

HolmesShuan commented 5 years ago

@jay-mahadeokar @kristellmarisse @forever3000 @Anfield-Uncle

In python/caffe/model_libs.py:

def ResNet50Body(net, from_layer, use_pool5=True, use_dilation_conv5=False):
    # Name prefixes/postfixes for the conv, batch-norm and scale layers.
    conv_prefix = ''
    conv_postfix = ''
    bn_prefix = 'bn_'
    bn_postfix = ''
    scale_prefix = 'scale_'
    scale_postfix = ''
    # Stem: 7x7 convolution with stride 2, followed by 3x3 max pooling.
    ConvBNLayer(net, from_layer, 'conv1', use_bn=True, use_relu=True,
        num_output=64, kernel_size=7, pad=3, stride=2,
        conv_prefix=conv_prefix, conv_postfix=conv_postfix,
        bn_prefix=bn_prefix, bn_postfix=bn_postfix,
        scale_prefix=scale_prefix, scale_postfix=scale_postfix, bias_term=True)

    net.pool1 = L.Pooling(net.conv1, pool=P.Pooling.MAX, kernel_size=3, stride=2)

    # Stage 2: three blocks.
    ResBody(net, 'pool1', '2a', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=True)
    ResBody(net, 'res2a', '2b', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=False)
    ResBody(net, 'res2b', '2c', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=False)

    # Stage 3: four blocks.
    ResBody(net, 'res2c', '3a', out2a=128, out2b=128, out2c=512, stride=2, use_branch1=True)
    ResBody(net, 'res3a', '3b', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)
    ResBody(net, 'res3b', '3c', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)
    ResBody(net, 'res3c', '3d', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)

    # Stage 4: six blocks.
    ResBody(net, 'res3d', '4a', out2a=256, out2b=256, out2c=1024, stride=2, use_branch1=True)
    ResBody(net, 'res4a', '4b', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4b', '4c', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4c', '4d', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4d', '4e', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4e', '4f', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)

    # Stage 5: either downsample (stride 2) or keep resolution with dilation 2.
    from_layer = 'res4f'
    stride = 2
    dilation = 1
    if use_dilation_conv5:
        stride = 1
        dilation = 2

    ResBody(net, from_layer, '5a', out2a=512, out2b=512, out2c=2048, stride=stride, use_branch1=True, dilation=dilation)
    ResBody(net, 'res5a', '5b', out2a=512, out2b=512, out2c=2048, stride=1, use_branch1=False, dilation=dilation)
    ResBody(net, 'res5b', '5c', out2a=512, out2b=512, out2c=2048, stride=1, use_branch1=False, dilation=dilation)

    if use_pool5:
        net.pool5 = L.Pooling(net.res5c, pool=P.Pooling.AVE, global_pooling=True)

    return net