Open jay-mahadeokar opened 8 years ago
Thank you! I was looking to train and test resnet +SSD.
Thanks. However, I always run out of memory with the ResNet-50 model on my PC with 4x GTX 980. What is the minimum GPU memory needed for ResNet training?
I am using a GTX 960 (4 GB). I was able to start training after reducing the batch size from 8 to 2 in train.prototxt and test.prototxt. I am fine-tuning on PASCAL VOC from the ResNet-50 caffemodel trained on ImageNet. The training loss stops decreasing after some time.
Hi @kristellmarisse, can you show me the Python script for training and testing with the ResNet model? Thanks.
I am not using any Python script to start the training. Instead, I downloaded the prototxts from the pynetbuilder page and started training by running the following command from the SSD Caffe root:
./build/tools/caffe train -solver path-to-solver/solver.prototxt -weights path-to-pretrained-model/ResNet-50-model.caffemodel -gpu 0
I am using only one GPU for training and the pre-trained model is from this page:
I am not sure if I am doing it right since the training doesn't seem to converge after some point.
Hi @kristellmarisse, are you using the SSD fork of Caffe or plain Caffe?
@forever3000: It is definitely SSD, since I don't have plain Caffe installed on my PC.
Hi @kristellmarisse,
Yes, I will test it. Please post an update if you get better results.
Thanks
@kristellmarisse, did you try to run detection after training with ResNet? I tried, but it failed. I'm using examples/ssd_detect.ipynb to run detection on each image. It reports: Unknown bottom blob 'label'. Do you have any idea? Thanks
Can you share your deploy.prototxt for ResNet? What was your training loss at the end, and did you fine-tune from the ImageNet model or train from scratch?
I fine-tuned from the ResNet-50 model at this link: https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777
This is my result after about 40k iterations:
Train net output #0: mbox_loss = 0.613243
deploy.prototxt is attached; I'm not sure everything is set up correctly.
deploy.prototxt.txt
Thanks for the deploy.prototxt. The last layer name is 'detection_out', not 'label', so change your Python script accordingly to deploy the trained model. Did you use multiple GPUs to train, and what was your batch size?
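For anyone adapting their script to read 'detection_out', here is a minimal sketch of turning its rows into boxes. It assumes each row is laid out as [image_id, label, confidence, xmin, ymin, xmax, ymax] with coordinates normalized to [0, 1], which is how I understand the SSD example notebook reads the blob; please check against your own deploy.prototxt. The function name parse_detections and its parameters are hypothetical, and the rows are plain Python lists so the sketch runs without Caffe.

```python
def parse_detections(rows, conf_threshold=0.6, image_width=300, image_height=300):
    """Convert detection rows into pixel-space boxes.

    Assumed row layout (not confirmed in this thread):
    [image_id, label, confidence, xmin, ymin, xmax, ymax], coordinates in [0, 1].
    """
    results = []
    for image_id, label, conf, xmin, ymin, xmax, ymax in rows:
        if conf < conf_threshold:
            continue  # drop low-confidence detections
        box = (
            int(round(xmin * image_width)),
            int(round(ymin * image_height)),
            int(round(xmax * image_width)),
            int(round(ymax * image_height)),
        )
        results.append({"label": int(label), "confidence": conf, "box": box})
    return results


# Hand-written rows standing in for the network output.
example_rows = [
    [0, 12, 0.9, 0.1, 0.2, 0.5, 0.6],
    [0, 7, 0.3, 0.0, 0.0, 1.0, 1.0],
]
detections = parse_detections(example_rows)
```

With pycaffe, the rows would presumably come from the 'detection_out' entry of the dictionary returned by the forward pass, rather than from a blob named 'label'.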
Yes, I'm using 4x GTX 980 for training. However, the GTX 980 has only 4 GB of memory, so the batch size is also set to 2 instead of 8.
I am using the same method to train, except for the number of GPUs. So does the GPU count affect training convergence?
@jay-mahadeokar Thanks for providing the link. I will add it to the README.md later.
@kristellmarisse @forever3000 I have uploaded a script to train with ResNet101. You could reduce batch_size and keep accum_batch_size at 32 from here. This might lead to inferior results because it will affect the batch_norm statistics.
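To make the batch_size / accum_batch_size relationship concrete, here is a small sketch of the arithmetic, assuming gradients are accumulated over iter_size forward/backward passes (the Caffe solver's iter_size field) until accum_batch_size images have been seen. The helper name plan_batches is hypothetical and the rounding is my assumption about how the training script splits the batch across GPUs.

```python
import math


def plan_batches(accum_batch_size, batch_size, num_gpus=1):
    """Return (batch_size_per_device, iter_size) for gradient accumulation.

    Hypothetical helper: the per-device batch is rounded up, and iter_size is
    the number of passes needed to cover accum_batch_size images.
    """
    if accum_batch_size <= 0 or batch_size <= 0 or num_gpus <= 0:
        raise ValueError("all arguments must be positive")
    per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(math.ceil(float(accum_batch_size) / (per_device * num_gpus)))
    return per_device, iter_size


# One 4 GB GPU with batch_size 2, as reported earlier in this thread.
print(plan_batches(32, 2, num_gpus=1))  # (2, 16)
# Four GPUs sharing a total batch of 8.
print(plan_batches(32, 8, num_gpus=4))  # (2, 4)
```

Accumulation only restores the effective batch size for the gradient update. Batch norm statistics are still computed on each small per-device batch, which is why results may be worse.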
@weiliu89 thanks! Wow, did you finish training SSD on the ResNet101 provided here? Can you please share what mAP you got on VOC2007? The best I could get with ResNet50 is 70.4, though I used my own pretrained network, which doesn't match the accuracy in the ResNet paper (71.8% vs. 75% top-1 as reported in the paper).
@jay-mahadeokar You can check 0b3406b. It is about 72.9 mAP with ResNet101. I am not sure my current setting is optimal, though. I personally don't see a big advantage of ResNet over VGGNet, but I do observe that ResNet converges faster.
@weiliu89 thanks! You're correct, the mAP improvement is marginal. The only advantage I can see is that resnet_50 needs far fewer FLOPs than VGGNet (refer to this table), though I am still not able to match the VGG mAP with resnet_50. I'm not sure my setting is optimal!
Hi @jay-mahadeokar, according to this table, does that mean the detection time of resnet_50 is faster than the reduced VGGNet (SSD paper)? Thanks
I believe so, I haven't benchmarked runtime on CPU though. FYI, I am using this code to compute the flops.
@jay-mahadeokar thanks,
I also modified @weiliu89's script to support the resnet_50 model. However, I got this error when I tried to fine-tune from the ResNet-50 pre-trained model at this link: https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777
Error : Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) Incompatible number of blobs for layer conv1
If you disable batch norm for conv1 in the resnet_50 model, the error goes away. I don't understand what is going on, though.
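One possible reading of the error, offered as an assumption rather than something confirmed in this thread: in Caffe a convolution layer with a bias term carries two blobs (weights and bias) and one without a bias carries a single blob, so "(1 vs. 2)" would mean the generated conv1 has no bias while the pretrained conv1 does. The sketch below only illustrates that kind of comparison on plain dictionaries. The function name find_blob_mismatches and the example counts are hypothetical and are not read from any real model.

```python
def find_blob_mismatches(target_blob_counts, source_blob_counts):
    """List layers present in both nets whose blob counts differ.

    Both arguments map layer name -> number of parameter blobs.
    Returns (layer_name, target_count, source_count) tuples.
    """
    mismatches = []
    for name in sorted(target_blob_counts):
        if name not in source_blob_counts:
            continue  # layers missing from the source are simply not copied
        target_count = target_blob_counts[name]
        source_count = source_blob_counts[name]
        if target_count != source_count:
            mismatches.append((name, target_count, source_count))
    return mismatches


# Illustrative counts only: conv1 without bias in the new net, with bias in
# the pretrained one.
new_net = {"conv1": 1, "res2a_branch1": 1}
pretrained = {"conv1": 2, "res2a_branch1": 1}
print(find_blob_mismatches(new_net, pretrained))  # [('conv1', 1, 2)]
```

If that reading is right, making the generated conv1 carry a bias term would be another way to line up the blob counts.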
@kyodaisuki Thanks, It worked like a charm :D.
I have trained on VOC using ResNet101. Since I trained on 8 AWS GPUs, I used a smaller batch size (the accum batch size was still 32). I want to compare the performance with a model trained on VOC with the default batch size settings. Can anyone share the caffemodel so that I can compare?
@kristellmarisse: can you share your results with the smaller batch size? On my side, it looks very hard to converge with 4x GTX 980 and a batch size of 2.
@kristellmarisse I trained ResNet50 the same way you did, and the loss is declining, but when I test the net it cannot detect anything. It is strange. The data is just VOC07+12. I just used the train(test).prototxt, solver.prototxt, and the pretrained model available online. Does anybody have any advice? Thanks.
@jay-mahadeokar @kristellmarisse @forever3000 @Anfield-Uncle
Adding the following distort_param to the AnnotatedData layer will significantly improve the performance of ResNet-SSD (~4 points; our ResNet50-SSD reaches ~75 mAP@0.5).
distort_param {
  brightness_prob: 0.5
  brightness_delta: 32.0
  contrast_prob: 0.5
  contrast_lower: 0.5
  contrast_upper: 1.5
  hue_prob: 0.5
  hue_delta: 18.0
  saturation_prob: 0.5
  saturation_lower: 0.5
  saturation_upper: 1.5
  random_order_prob: 0.0
}
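For readers unfamiliar with photometric distortion, here is a rough sketch of what the brightness and contrast fields above express: with some probability, add a random offset in [-delta, delta], then with some probability scale by a random factor in [lower, upper]. This is not Caffe's implementation. It skips hue and saturation, works on a flat list of pixel values, and the function name distort_pixels is hypothetical.

```python
import random


def distort_pixels(pixels, rng,
                   brightness_prob=0.5, brightness_delta=32.0,
                   contrast_prob=0.5, contrast_lower=0.5, contrast_upper=1.5):
    """Apply random brightness, then random contrast, to 0..255 pixel values."""
    out = [float(p) for p in pixels]
    if rng.random() < brightness_prob:
        # Brightness: shift every pixel by the same random offset.
        delta = rng.uniform(-brightness_delta, brightness_delta)
        out = [p + delta for p in out]
    if rng.random() < contrast_prob:
        # Contrast: scale every pixel by the same random factor.
        alpha = rng.uniform(contrast_lower, contrast_upper)
        out = [p * alpha for p in out]
    # Keep values in the valid 8-bit range.
    return [min(255.0, max(0.0, p)) for p in out]


rng = random.Random(0)  # seeded so runs are repeatable
augmented = distort_pixels([0, 64, 128, 255], rng)
```

Setting both probabilities to 0.0 leaves the pixels unchanged, which is a quick way to check that augmentation is actually switched on in a pipeline.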
We also set mbox_source_layers as follows:
mbox_source_layers = ['res3d_relu', 'res4f_relu', 'res5c_relu/conv1_2', 'res5c_relu/conv2_2', 'res5c_relu/conv3_2', 'res5c_relu/conv4_2']
# in ssd_pascal_resnet.py
ResNet50Body(net, from_layer='data', use_pool5=False, use_dilation_conv5=False)
# ConvBNLayer and ResBody are the helpers from SSD's caffe/model_libs.py;
# L and P are caffe.layers and caffe.params.
def ResNet50Body(net, from_layer, use_pool5=True, use_dilation_conv5=False):
    conv_prefix = ''
    conv_postfix = ''
    bn_prefix = 'bn_'
    bn_postfix = ''
    scale_prefix = 'scale_'
    scale_postfix = ''
    ConvBNLayer(net, from_layer, 'conv1', use_bn=True, use_relu=True,
                num_output=64, kernel_size=7, pad=3, stride=2,
                conv_prefix=conv_prefix, conv_postfix=conv_postfix,
                bn_prefix=bn_prefix, bn_postfix=bn_postfix,
                scale_prefix=scale_prefix, scale_postfix=scale_postfix,
                bias_term=True)

    net.pool1 = L.Pooling(net.conv1, pool=P.Pooling.MAX, kernel_size=3, stride=2)

    # conv2_x: 3 blocks
    ResBody(net, 'pool1', '2a', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=True)
    ResBody(net, 'res2a', '2b', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=False)
    ResBody(net, 'res2b', '2c', out2a=64, out2b=64, out2c=256, stride=1, use_branch1=False)

    # conv3_x: 4 blocks
    ResBody(net, 'res2c', '3a', out2a=128, out2b=128, out2c=512, stride=2, use_branch1=True)
    ResBody(net, 'res3a', '3b', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)
    ResBody(net, 'res3b', '3c', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)
    ResBody(net, 'res3c', '3d', out2a=128, out2b=128, out2c=512, stride=1, use_branch1=False)

    # conv4_x: 6 blocks
    ResBody(net, 'res3d', '4a', out2a=256, out2b=256, out2c=1024, stride=2, use_branch1=True)
    ResBody(net, 'res4a', '4b', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4b', '4c', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4c', '4d', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4d', '4e', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)
    ResBody(net, 'res4e', '4f', out2a=256, out2b=256, out2c=1024, stride=1, use_branch1=False)

    # conv5_x: 3 blocks, optionally dilated instead of strided
    from_layer = 'res4f'
    stride = 2
    dilation = 1
    if use_dilation_conv5:
        stride = 1
        dilation = 2
    ResBody(net, from_layer, '5a', out2a=512, out2b=512, out2c=2048, stride=stride, use_branch1=True, dilation=dilation)
    ResBody(net, 'res5a', '5b', out2a=512, out2b=512, out2c=2048, stride=1, use_branch1=False, dilation=dilation)
    ResBody(net, 'res5b', '5c', out2a=512, out2b=512, out2c=2048, stride=1, use_branch1=False, dilation=dilation)

    if use_pool5:
        net.pool5 = L.Pooling(net.res5c, pool=P.Pooling.AVE, global_pooling=True)

    return net
@weiliu89 Thanks for the wonderful and easy-to-use framework for object detection. I have released the pynetbuilder tool for generating popular Caffe network files. It also has an implementation of ResNet + SSD; I could get 70.4% mAP on VOC2007 with ResNet as the base network. If you think this tool can be useful, could we add a pointer somewhere in your code's README? Any feedback is welcome!