How to train rcnn for GoogLeNet

marutiagarwal commented 9 years ago

I have used the following proto file: https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/train_val.prototxt

and create this train.prototxt for 33 output classes (including 1 for background). But as soon as I run the tools/train_net.py, I get the following error:

insert_splits.cpp:35] Unknown blob input label to layer 1

Can you please have a quick look at the modified train.prototxt provided below:

name: "GoogleNet" layer { name: 'data' type: 'Python' top: 'data' top: 'rois' top: 'labels' top: 'bbox_targets' top: 'bbox_loss_weights' python_param { module: 'roi_data_layer.layer' layer: 'RoIDataLayer' param_str: "'num_classes': 33" } } layer { name: "conv1/7x7_s2" type: "Convolution" bottom: "data" top: "conv1/7x7_s2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 3 kernel_size: 7 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv1/relu_7x7" type: "ReLU" bottom: "conv1/7x7_s2" top: "conv1/7x7_s2" } layer { name: "pool1/3x3_s2" type: "Pooling" bottom: "conv1/7x7_s2" top: "pool1/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "pool1/norm1" type: "LRN" bottom: "pool1/3x3_s2" top: "pool1/norm1" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "conv2/3x3_reduce" type: "Convolution" bottom: "pool1/norm1" top: "conv2/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv2/relu_3x3_reduce" type: "ReLU" bottom: "conv2/3x3_reduce" top: "conv2/3x3_reduce" } layer { name: "conv2/3x3" type: "Convolution" bottom: "conv2/3x3_reduce" top: "conv2/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv2/relu_3x3" type: "ReLU" bottom: "conv2/3x3" top: "conv2/3x3" } layer { name: "conv2/norm2" type: "LRN" bottom: "conv2/3x3" top: "conv2/norm2" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "pool2/3x3_s2" type: "Pooling" bottom: "conv2/norm2" top: "pool2/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "inception_3a/1x1" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_1x1" type: "ReLU" bottom: "inception_3a/1x1" top: "inception_3a/1x1" } layer { name: "inception_3a/3x3_reduce" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_3x3_reduce" type: "ReLU" bottom: "inception_3a/3x3_reduce" top: "inception_3a/3x3_reduce" } layer { name: "inception_3a/3x3" type: "Convolution" bottom: "inception_3a/3x3_reduce" top: "inception_3a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_3x3" type: "ReLU" bottom: "inception_3a/3x3" top: "inception_3a/3x3" } layer { name: "inception_3a/5x5_reduce" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 16 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_5x5_reduce" type: "ReLU" bottom: "inception_3a/5x5_reduce" top: "inception_3a/5x5_reduce" } layer { name: "inception_3a/5x5" type: "Convolution" bottom: "inception_3a/5x5_reduce" top: "inception_3a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_5x5" type: "ReLU" bottom: "inception_3a/5x5" top: "inception_3a/5x5" } layer { name: "inception_3a/pool" type: "Pooling" bottom: "pool2/3x3_s2" top: "inception_3a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_3a/pool_proj" type: "Convolution" bottom: "inception_3a/pool" top: "inception_3a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_pool_proj" type: "ReLU" bottom: "inception_3a/pool_proj" top: "inception_3a/pool_proj" } layer { name: "inception_3a/output" type: "Concat" bottom: "inception_3a/1x1" bottom: "inception_3a/3x3" bottom: "inception_3a/5x5" bottom: "inception_3a/pool_proj" top: "inception_3a/output" } layer { name: "inception_3b/1x1" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_1x1" type: "ReLU" bottom: "inception_3b/1x1" top: "inception_3b/1x1" } layer { name: "inception_3b/3x3_reduce" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_3x3_reduce" type: "ReLU" bottom: "inception_3b/3x3_reduce" top: "inception_3b/3x3_reduce" } layer { name: "inception_3b/3x3" type: "Convolution" bottom: "inception_3b/3x3_reduce" top: "inception_3b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_3x3" type: "ReLU" bottom: "inception_3b/3x3" top: "inception_3b/3x3" } layer { name: "inception_3b/5x5_reduce" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_5x5_reduce" type: "ReLU" bottom: "inception_3b/5x5_reduce" top: "inception_3b/5x5_reduce" } layer { name: "inception_3b/5x5" type: "Convolution" bottom: "inception_3b/5x5_reduce" top: "inception_3b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_5x5" type: "ReLU" bottom: "inception_3b/5x5" top: "inception_3b/5x5" } layer { name: "inception_3b/pool" type: "Pooling" bottom: "inception_3a/output" top: "inception_3b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_3b/pool_proj" type: "Convolution" bottom: "inception_3b/pool" top: "inception_3b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_pool_proj" type: "ReLU" bottom: "inception_3b/pool_proj" top: "inception_3b/pool_proj" } layer { name: "inception_3b/output" type: "Concat" bottom: "inception_3b/1x1" bottom: "inception_3b/3x3" bottom: "inception_3b/5x5" bottom: "inception_3b/pool_proj" top: "inception_3b/output" } layer { name: "pool3/3x3_s2" type: "Pooling" bottom: "inception_3b/output" top: "pool3/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "inception_4a/1x1" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_1x1" type: "ReLU" bottom: "inception_4a/1x1" top: "inception_4a/1x1" } layer { name: "inception_4a/3x3_reduce" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_3x3_reduce" type: "ReLU" bottom: "inception_4a/3x3_reduce" top: "inception_4a/3x3_reduce" } layer { name: "inception_4a/3x3" type: "Convolution" bottom: "inception_4a/3x3_reduce" top: "inception_4a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 208 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_3x3" type: "ReLU" bottom: "inception_4a/3x3" top: "inception_4a/3x3" } layer { name: "inception_4a/5x5_reduce" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 16 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_5x5_reduce" type: "ReLU" bottom: "inception_4a/5x5_reduce" top: "inception_4a/5x5_reduce" } layer { name: "inception_4a/5x5" type: "Convolution" bottom: "inception_4a/5x5_reduce" top: "inception_4a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 48 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_5x5" type: "ReLU" bottom: "inception_4a/5x5" top: "inception_4a/5x5" } layer { name: "inception_4a/pool" type: "Pooling" bottom: "pool3/3x3_s2" top: "inception_4a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4a/pool_proj" type: "Convolution" bottom: "inception_4a/pool" top: "inception_4a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_pool_proj" type: "ReLU" bottom: "inception_4a/pool_proj" top: "inception_4a/pool_proj" } layer { name: "inception_4a/output" type: "Concat" bottom: "inception_4a/1x1" bottom: "inception_4a/3x3" bottom: "inception_4a/5x5" bottom: "inception_4a/pool_proj" top: "inception_4a/output" } layer { name: "loss1/ave_pool" type: "Pooling" bottom: "inception_4a/output" top: "loss1/ave_pool" pooling_param { pool: AVE kernel_size: 5 stride: 3 } } layer { name: "loss1/conv" type: "Convolution" bottom: "loss1/ave_pool" top: "loss1/conv" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss1/relu_conv" type: "ReLU" bottom: "loss1/conv" top: "loss1/conv" } layer { name: "loss1/fc" type: "InnerProduct" bottom: "loss1/conv" top: "loss1/fc" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1024 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss1/relu_fc" type: "ReLU" bottom: "loss1/fc" top: "loss1/fc" } layer { name: "loss1/drop_fc" type: "Dropout" bottom: "loss1/fc" top: "loss1/fc" dropout_param { dropout_ratio: 0.7 } } layer { name: "loss1/classifier" type: "InnerProduct" bottom: "loss1/fc" top: "loss1/classifier" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1000 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "loss1/loss" type: "SoftmaxWithLoss" bottom: "loss1/classifier" bottom: "label" top: "loss1/loss1" loss_weight: 0.3 } layer { name: "loss1/top-1" type: "Accuracy" bottom: "loss1/classifier" bottom: "label" top: "loss1/top-1" include { phase: TEST } } layer { name: "loss1/top-5" type: "Accuracy" bottom: "loss1/classifier" bottom: "label" top: "loss1/top-5" include { phase: TEST } accuracy_param { top_k: 5 } } layer { name: "inception_4b/1x1" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_1x1" type: "ReLU" bottom: "inception_4b/1x1" top: "inception_4b/1x1" } layer { name: "inception_4b/3x3_reduce" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 112 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_3x3_reduce" type: "ReLU" bottom: "inception_4b/3x3_reduce" top: "inception_4b/3x3_reduce" } layer { name: "inception_4b/3x3" type: "Convolution" bottom: "inception_4b/3x3_reduce" top: "inception_4b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 224 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_3x3" type: "ReLU" bottom: "inception_4b/3x3" top: "inception_4b/3x3" } layer { name: "inception_4b/5x5_reduce" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 24 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_5x5_reduce" type: "ReLU" bottom: "inception_4b/5x5_reduce" top: "inception_4b/5x5_reduce" } layer { name: "inception_4b/5x5" type: "Convolution" bottom: "inception_4b/5x5_reduce" top: "inception_4b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_5x5" type: "ReLU" bottom: "inception_4b/5x5" top: "inception_4b/5x5" } layer { name: "inception_4b/pool" type: "Pooling" bottom: "inception_4a/output" top: "inception_4b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4b/pool_proj" type: "Convolution" bottom: "inception_4b/pool" top: "inception_4b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_pool_proj" type: "ReLU" bottom: "inception_4b/pool_proj" top: "inception_4b/pool_proj" } layer { name: "inception_4b/output" type: "Concat" bottom: "inception_4b/1x1" bottom: "inception_4b/3x3" bottom: "inception_4b/5x5" bottom: "inception_4b/pool_proj" top: "inception_4b/output" } layer { name: "inception_4c/1x1" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_1x1" type: "ReLU" bottom: "inception_4c/1x1" top: "inception_4c/1x1" } layer { name: "inception_4c/3x3_reduce" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_3x3_reduce" type: "ReLU" bottom: "inception_4c/3x3_reduce" top: "inception_4c/3x3_reduce" } layer { name: "inception_4c/3x3" type: "Convolution" bottom: "inception_4c/3x3_reduce" top: "inception_4c/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_3x3" type: "ReLU" bottom: "inception_4c/3x3" top: "inception_4c/3x3" } layer { name: "inception_4c/5x5_reduce" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 24 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_5x5_reduce" type: "ReLU" bottom: "inception_4c/5x5_reduce" top: "inception_4c/5x5_reduce" } layer { name: "inception_4c/5x5" type: "Convolution" bottom: "inception_4c/5x5_reduce" top: "inception_4c/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_5x5" type: "ReLU" bottom: "inception_4c/5x5" top: "inception_4c/5x5" } layer { name: "inception_4c/pool" type: "Pooling" bottom: "inception_4b/output" top: "inception_4c/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4c/pool_proj" type: "Convolution" bottom: "inception_4c/pool" top: "inception_4c/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_pool_proj" type: "ReLU" bottom: "inception_4c/pool_proj" top: "inception_4c/pool_proj" } layer { name: "inception_4c/output" type: "Concat" bottom: "inception_4c/1x1" bottom: "inception_4c/3x3" bottom: "inception_4c/5x5" bottom: "inception_4c/pool_proj" top: "inception_4c/output" } layer { name: "inception_4d/1x1" type: "Convolution" bottom: "inception_4c/output" top: "inception_4d/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 112 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_1x1" type: "ReLU" bottom: "inception_4d/1x1" top: "inception_4d/1x1" } layer { name: "inception_4d/3x3_reduce" type: "Convolution" bottom: "inception_4c/output" top: "inception_4d/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 144 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_3x3_reduce" type: "ReLU" bottom: "inception_4d/3x3_reduce" top: "inception_4d/3x3_reduce" } layer { name: "inception_4d/3x3" type: "Convolution" bottom: "inception_4d/3x3_reduce" top: "inception_4d/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 288 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_3x3" type: "ReLU" bottom: "inception_4d/3x3" top: "inception_4d/3x3" } layer { name: "inception_4d/5x5_reduce" type: "Convolution" bottom: "inception_4c/output" top: "inception_4d/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_5x5_reduce" type: "ReLU" bottom: "inception_4d/5x5_reduce" top: "inception_4d/5x5_reduce" } layer { name: "inception_4d/5x5" type: "Convolution" bottom: "inception_4d/5x5_reduce" top: "inception_4d/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_5x5" type: "ReLU" bottom: "inception_4d/5x5" top: "inception_4d/5x5" } layer { name: "inception_4d/pool" type: "Pooling" bottom: "inception_4c/output" top: "inception_4d/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4d/pool_proj" type: "Convolution" bottom: "inception_4d/pool" top: "inception_4d/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_pool_proj" type: "ReLU" bottom: "inception_4d/pool_proj" top: "inception_4d/pool_proj" } layer { name: "inception_4d/output" type: "Concat" bottom: "inception_4d/1x1" bottom: "inception_4d/3x3" bottom: "inception_4d/5x5" bottom: "inception_4d/pool_proj" top: "inception_4d/output" } layer { name: "loss2/ave_pool" type: "Pooling" bottom: "inception_4d/output" top: "loss2/ave_pool" pooling_param { pool: AVE kernel_size: 5 stride: 3 } } layer { name: "loss2/conv" type: "Convolution" bottom: "loss2/ave_pool" top: "loss2/conv" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss2/relu_conv" type: "ReLU" bottom: "loss2/conv" top: "loss2/conv" } layer { name: "loss2/fc" type: "InnerProduct" bottom: "loss2/conv" top: "loss2/fc" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1024 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss2/relu_fc" type: "ReLU" bottom: "loss2/fc" top: "loss2/fc" } layer { name: "loss2/drop_fc" type: "Dropout" bottom: "loss2/fc" top: "loss2/fc" dropout_param { dropout_ratio: 0.7 } } layer { name: "loss2/classifier" type: "InnerProduct" bottom: "loss2/fc" top: "loss2/classifier" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1000 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "loss2/loss" type: "SoftmaxWithLoss" bottom: "loss2/classifier" bottom: "label" top: "loss2/loss1" loss_weight: 0.3 } layer { name: "loss2/top-1" type: "Accuracy" bottom: "loss2/classifier" bottom: "label" top: "loss2/top-1" include { phase: TEST } } layer { name: "loss2/top-5" type: "Accuracy" bottom: "loss2/classifier" bottom: "label" top: "loss2/top-5" include { phase: TEST } accuracy_param { top_k: 5 } } layer { name: "inception_4e/1x1" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_1x1" type: "ReLU" bottom: "inception_4e/1x1" top: "inception_4e/1x1" } layer { name: "inception_4e/3x3_reduce" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_3x3_reduce" type: "ReLU" bottom: "inception_4e/3x3_reduce" top: "inception_4e/3x3_reduce" } layer { name: "inception_4e/3x3" type: "Convolution" bottom: "inception_4e/3x3_reduce" top: "inception_4e/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 320 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_3x3" type: "ReLU" bottom: "inception_4e/3x3" top: "inception_4e/3x3" } layer { name: "inception_4e/5x5_reduce" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_5x5_reduce" type: "ReLU" bottom: "inception_4e/5x5_reduce" top: "inception_4e/5x5_reduce" } layer { name: "inception_4e/5x5" type: "Convolution" bottom: "inception_4e/5x5_reduce" top: "inception_4e/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_5x5" type: "ReLU" bottom: "inception_4e/5x5" top: "inception_4e/5x5" } layer { name: "inception_4e/pool" type: "Pooling" bottom: "inception_4d/output" top: "inception_4e/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4e/pool_proj" type: "Convolution" bottom: "inception_4e/pool" top: "inception_4e/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_pool_proj" type: "ReLU" bottom: "inception_4e/pool_proj" top: "inception_4e/pool_proj" } layer { name: "inception_4e/output" type: "Concat" bottom: "inception_4e/1x1" bottom: "inception_4e/3x3" bottom: "inception_4e/5x5" bottom: "inception_4e/pool_proj" top: "inception_4e/output" } layer { name: "pool4/3x3_s2" type: "Pooling" bottom: "inception_4e/output" top: "pool4/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "inception_5a/1x1" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_1x1" type: "ReLU" bottom: "inception_5a/1x1" top: "inception_5a/1x1" } layer { name: "inception_5a/3x3_reduce" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_3x3_reduce" type: "ReLU" bottom: "inception_5a/3x3_reduce" top: "inception_5a/3x3_reduce" } layer { name: "inception_5a/3x3" type: "Convolution" bottom: "inception_5a/3x3_reduce" top: "inception_5a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 320 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_3x3" type: "ReLU" bottom: "inception_5a/3x3" top: "inception_5a/3x3" } layer { name: "inception_5a/5x5_reduce" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_5x5_reduce" type: "ReLU" bottom: "inception_5a/5x5_reduce" top: "inception_5a/5x5_reduce" } layer { name: "inception_5a/5x5" type: "Convolution" bottom: "inception_5a/5x5_reduce" top: "inception_5a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_5x5" type: "ReLU" bottom: "inception_5a/5x5" top: "inception_5a/5x5" } layer { name: "inception_5a/pool" type: "Pooling" bottom: "pool4/3x3_s2" top: "inception_5a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_5a/pool_proj" type: "Convolution" bottom: "inception_5a/pool" top: "inception_5a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_pool_proj" type: "ReLU" bottom: "inception_5a/pool_proj" top: "inception_5a/pool_proj" } layer { name: "inception_5a/output" type: "Concat" bottom: "inception_5a/1x1" bottom: "inception_5a/3x3" bottom: "inception_5a/5x5" bottom: "inception_5a/pool_proj" top: "inception_5a/output" } layer { name: "inception_5b/1x1" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_1x1" type: "ReLU" bottom: "inception_5b/1x1" top: "inception_5b/1x1" } layer { name: "inception_5b/3x3_reduce" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_3x3_reduce" type: "ReLU" bottom: "inception_5b/3x3_reduce" top: "inception_5b/3x3_reduce" } layer { name: "inception_5b/3x3" type: "Convolution" bottom: "inception_5b/3x3_reduce" top: "inception_5b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_3x3" type: "ReLU" bottom: "inception_5b/3x3" top: "inception_5b/3x3" } layer { name: "inception_5b/5x5_reduce" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 48 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_5x5_reduce" type: "ReLU" bottom: "inception_5b/5x5_reduce" top: "inception_5b/5x5_reduce" } layer { name: "inception_5b/5x5" type: "Convolution" bottom: "inception_5b/5x5_reduce" top: "inception_5b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_5x5" type: "ReLU" bottom: "inception_5b/5x5" top: "inception_5b/5x5" } layer { name: "inception_5b/pool" type: "Pooling" bottom: "inception_5a/output" top: "inception_5b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_5b/pool_proj" type: "Convolution" bottom: "inception_5b/pool" top: "inception_5b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_pool_proj" type: "ReLU" bottom: "inception_5b/pool_proj" top: "inception_5b/pool_proj" } layer { name: "inception_5b/output" type: "Concat" bottom: "inception_5b/1x1" bottom: "inception_5b/3x3" bottom: "inception_5b/5x5" bottom: "inception_5b/pool_proj" top: "inception_5b/output" } layer { name: "pool5/7x7_s1" type: "Pooling" bottom: "inception_5b/output" top: "pool5/7x7_s1" pooling_param { pool: AVE kernel_size: 7 stride: 1 } } layer { name: "pool5/drop_7x7_s1" type: "Dropout" bottom: "pool5/7x7_s1" top: "pool5/7x7_s1" dropout_param { dropout_ratio: 0.4 } } layer { name: "cls_score" type: "InnerProduct" bottom: "pool5/7x7_s1" top: "cls_score" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 33 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "bbox_pred" type: "InnerProduct" bottom: "pool5/7x7_s1" top: "bbox_pred" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 132 weight_filler { type: "gaussian" std: 0.001 } bias_filler { type: "constant" value: 0 } } } layer { name: "loss_cls" type: "SoftmaxWithLoss" bottom: "cls_score" bottom: "labels" top: "loss_cls" loss_weight: 1 } layer { name: "loss_bbox" type: "SmoothL1Loss" bottom: "bbox_pred" bottom: "bbox_targets" bottom: "bbox_loss_weights" top: "loss_bbox" loss_weight: 1 }

grib0ed0v commented 9 years ago

@marutiagarwal actually here you don't use RoI Pooling layer. If you see the .prototxt for VGG or CaffeNet, they must contain this layer.

layer { name: "roi_pool5" type: "ROIPooling" bottom: "conv5_3" bottom: "rois" top: "pool5" roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.0625 # 1/16 } }

hermitman commented 9 years ago

@AlexGruzdev I have a question. Why the top connection is linked to pool5 instead of or_pool5? I don't see there is a pool5 layer in the prototxt file. Thank you very much~

grib0ed0v commented 9 years ago

@hermitman "pool5" - just a name of an output blob. The next layer have as bottom this "pool5" blob: ... layer { name: "inception_5b/output" type: "Concat" bottom: "inception_5b/1x1" bottom: "inception_5b/3x3" bottom: "inception_5b/5x5" bottom: "inception_5b/pool_proj" top: "inception_5b/output" } layer { name: "roi_pool5" type: "ROIPooling" bottom: "inception_5b/output" bottom: "rois" top: "pool5" roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.0625 # 1/16 } } layer { name: "fc6" type: "InnerProduct" bottom: "pool5" top: "fc6" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 4096 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } ...

catsdogone commented 9 years ago

layer { name: "conv1/7x7_s2" type: "Convolution" bottom: "data" top: "conv1/7x7_s2" param { lr_mult: 1 decay_mult: 1 } In /fast-rcnn/models/VGG16(and CaffeNet) /train.prototxt, the parameter lr_mult and decay_mult is 0, so why you set them 1. I still can not really understand the definition of the prototxt. Do you have any references? Thank you very much.

grib0ed0v commented 9 years ago

If you set lr_mult & decay_mult to 0 that actually means that you 'freeze' the layer and weights of this layer will not be updated during training.

jond55 commented 9 years ago

I'm trying to use this model also, but running into a problem with the input dimensions and the loss1/fc and loss2/fc InnerProduct layers. I'm not fully understanding how Fast-RCNN is able to use the model when the image input dimensions are variable, as they are. I think the input images are resized (not cropped) in lib/utils/blob.py so that the input dimensions are not 224x224 consistently. They may be 150x224. This doesn't seem to affect the CaffeNet/VGG models, but it is affecting GoogleNet, since it causes varying input sizes to the loss InnerProduct layers.

What is the best way to handle this? Should input images be resized to 224x224 to match the model, and lose the aspect ratio? Or should loss layers be modified? I tried changing loss1/fc to convolutional but it just pushed the problem a little further down. Or should these intermediate loss layers be removed?

Not a Caffe expert by any means, so apologize if these questions seem basic.

marutiagarwal commented 8 years ago

I got the googLeNet working for fast-rcnn. Though, performance isn't as good as that of VGG16. I am wondering (significantly lower true positive for ~100 classes), why would that be?

googlenet_test_caffe.prototxt.txt googlenet_train_caffe.prototxt.txt

catsdogone commented 8 years ago

@jond55 I think the input image doesn't need to be resized, the roi_pooling layer is where amazing happens. The input image can be any ratio but I think the minimum size should larger than 224.

catsdogone commented 8 years ago

@marutiagarwal roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.0625 # 1/16 } Is your spatial_scale compatible with the original model?

marutiagarwal commented 8 years ago

@catsdogone - No. I wasn't sure about its value. So used it as it is from VGG16. Rest everything looks ok. I haven't resized input images. Using their original size only. Although: input_shape { dim: 1 dim: 3 dim: 224 dim: 224 } is present in test.prototxt, to take a crop from input?

catsdogone commented 8 years ago

@marutiagarwal I think this value is determined by the stride of layers.

marutiagarwal commented 8 years ago

@catsdogone - it's value is 0.0625 for VGG16 and VGG_CNN_M. I don't think this should affect the performance by such a large margin. I am still missing on what is causing such poor performance for googlenet-fastrcnn.

marutiagarwal commented 8 years ago

@catsdogone - any thoughts on that?

201power commented 8 years ago

I got googlenet working on faster rcnn as well, on VOC2007 I got~65% mAP, not as good as VGG.

marutiagarwal commented 8 years ago

@201power - can you please upload your train, test, and solver proto files. May be we can help improving each others' system. Thanks !

bultina0 commented 8 years ago

@201power - Can you share your train prototxt files? Thanks~

RalphMao commented 8 years ago

@201power I am also wondering about it

marutiagarwal commented 8 years ago

@rbgirshick could you comment on this? Will highly appreciate that. Thanks.

catsdogone commented 8 years ago

@marutiagarwal I think the parameter spatial_scale: 0.0625 # 1/16 is very important, since it is the scaling parameter for coordinate projection. For VGG16, if the input images is (224,224), the conv5_3 is (14, 14), so the spatial_scale is 0.0625 #1/16. (224/14=16) I think your layer {name: "roi_pool3" ...} 's roi_pooling_param should be: roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.03125 # 1/32 } since the dim of inception_5b/output is (7,7) when the input image is (224,224)

But, I get a worse accuracy use this setting. May be I was wrong.

catsdogone commented 8 years ago

@201power Could you please share your parameters? Will highly appreciate that. Thanks.

jainanshul commented 8 years ago

@marutiagarwal or @201power would you guys be able to share your parameters?

marutiagarwal commented 8 years ago

@jainanshul - already shared my protofile, just few posts above.

XiongweiWu commented 8 years ago

I have modify the prototxt and get 63.8% mAP in pascal voc 2007. Mainly about the spatial ratio issue and the last pooling issue

catsdogone commented 8 years ago

@rbgirshick @marutiagarwal @XiongweiWu Ignore loss1 and loss2 layers, I just add rpn-layers after layer 《name: "inception_5b/output"》 and set "'feat_stride': 32" of layer 《name: 'rpn-data_faster3'>》and layer 《name: 'proposal_faster3'》,about the pooling layer, the parameters are(since there are five 2*2 max-pooling layers): layer { name: "roi_pool5_faster3" type: "ROIPooling" bottom: "inception_5b/output" bottom: "rois_faster3" top: "pool5_faster3" roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.03125 # 1/32 } } But my precision was still lower than 10%. Is there something wrong? Thank you for your help.

This is my prototxt: test.txt train.txt

kamadforge commented 8 years ago

A note to anybody who would like to test the prototxt files from @catsdogone. The files assume 8 classes and they need to be edited accordingly if there is a different number of classes (e.g. 21 in Pascal VOC).

catsdogone commented 8 years ago

have you successfully trained a Googlenet/Inception network definition that works in conjunction with Fast R-CNN/faster RCNN? @kamadforge @marutiagarwal @jainanshul

marutiagarwal commented 8 years ago

yes ! Just that I got high fp. I think the problem might be coming from the part of having three fc classifiers instead of 1 as in most of the other networks. Since backprop is starting from all of these 3 fc classifier layers. ᐧ

Regards, Maruti Agarwal

On Tue, Jul 12, 2016 at 12:16 PM, catsdogone notifications@github.com wrote:

have you successfully trained a Googlenet/Inception network definition that works in conjunction with Fast R-CNN/faster RCNN? @kamadforge https://github.com/kamadforge @marutiagarwal https://github.com/marutiagarwal @jainanshul https://github.com/jainanshul

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rbgirshick/fast-rcnn/issues/36#issuecomment-231951659, or mute the thread https://github.com/notifications/unsubscribe/ACPuP6D6hHAJWA9ILr2D36VfQkXmGRSNks5qUzgvgaJpZM4FXUMo .

catsdogone commented 8 years ago

@marutiagarwal Could you share your parameter setting or experience in your training ？I have try many times with 1 or 3 fc classifiers while all failed, the ap value always just about 2%. Except the "spatial-scale" my net definition is similar with yours. Thank you very much.

Soccer-Video-Analysis commented 8 years ago

Why is the 1st blob

input: "data" input_shape { dim: 1 dim: 3 dim: 224 dim: 224 }

Why 224 x 224 ? I thought Fast RCNN take big input and propose region later ?

nw89 commented 7 years ago

I believe that it's rescaled for every image.

joyivan commented 7 years ago

@marutiagarwal can U share Ur prototxt files I need that for 1 class detecting

haoyul commented 7 years ago

@catsdogone Hi, I'm trying to train GoogleNet with rcnn and faced your same low AP problem by using your test/train prototxt. May I know how did you solve this? Many thanks!

haoyul commented 7 years ago

Hi @marutiagarwal, I modified the original googlenet prototxt and it's more similar to @catsdogone's version. Why there's no RPN and RoI proposal layers in yours? Anyone here trained successfully with @catsdogone's prototxt? My training also only get about 0.1 mean AP on pascal 2007. Thanks for any advise!

jkjung-avt commented 6 years ago

I might be late to the party. But anyway, I've trained a bvlc_googlenet based Faster RCNN model with Pascal VOC 2007 dataset. Its mAP is very close to the original VGG16 based Faster RCNN, and it's about twice faster.

I've shared my experience in my blog post. Have a look if you are interested: https://jkjung-avt.github.io/making-frcn-faster/

isalirezag commented 6 years ago

can any one explain what is spatial_scale please?

jkjung-avt commented 6 years ago

The "spatial_scale" in the ROIPooling layer specifies "how much the input/bottom blob has been scaled down from the original input image". For example, the VGG16 feature extractor would produce a 512x30x40 blob (conv5_3) from a 3x480x640 input image. In such a case we would set "spatial scale" to 0.0625 (or 1/16) since the height and width of ROIPooling's bottom blob is 1/16 of the input image.

Reference: roi_pooling_layer.cpp

rbgirshick / fast-rcnn

How to train rcnn for GoogLeNet #36