torrvision / crfasrnn

This repository contains the source code for the semantic image segmentation method described in the ICCV 2015 paper: Conditional Random Fields as Recurrent Neural Networks. http://crfasrnn.torr.vision/

network saving only default parameter values #35

Closed AdrianLsk closed 7 years ago

AdrianLsk commented 8 years ago

Hi,

Thank you for sharing the code; the idea of CRF-as-RNN is great. I have a question about training the model from scratch on a different type of image data.

I ran into a problem when saving parameters to a .caffemodel file. The training log shows that the network is learning, judging by the decreasing softmax loss, but when I load the snapshot, the parameters are always at their default values.

Have you or anyone else experienced anything similar? I tried training the basic LeNet example on MNIST and there was no problem there.

Thanks in advance.

Adrian Lisko

bittnt commented 8 years ago

@AdrianLsk

Thanks.

If you are discussing the problem in general: make sure that you use the correct prototxt to load the snapshot model, which means every layer name in your testing prototxt should match the corresponding name in your training prototxt. Otherwise, Caffe silently loads the default values for any unmatched layer.
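To make that check concrete, here is a minimal sketch that does not need pycaffe at all: a crude regex pass over the prototxt text (not a full protobuf parser) that lists layer names present in the training prototxt but missing from the testing one, i.e. exactly the layers whose learned weights would fall back to defaults. The two prototxt snippets are made-up stand-ins for your real files.

```python
import re

def layer_names(prototxt_text):
    # Crude regex extraction of layer names; good enough for a sanity
    # check, not a full protobuf text-format parser.
    return re.findall(r"name:\s*['\"]([^'\"]+)['\"]", prototxt_text)

# Hypothetical stand-ins for your train/test prototxt contents.
train_txt = "layers { name: 'conv1_1' } layers { name: 'inference1' }"
test_txt = "layers { name: 'conv1_1' } layers { name: 'inference_1' }"

# Layers whose learned weights would be silently replaced by defaults:
missing = set(layer_names(train_txt)) - set(layer_names(test_txt))
print(sorted(missing))  # ['inference1']
```

In real use you would read the two .prototxt files from disk and compare; any name in `missing` is a layer whose snapshot weights will never be loaded.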

Also, did you try making the learning rate higher for the CRF layer? That should give you noticeably different parameter values.

For example:

```
layers {
  name: "inference1"
  type: MULTI_STAGE_MEANFIELD
  bottom: "unary"
  bottom: "Q0"
  bottom: "data"
  top: "pred"
  blobs_lr: 10000
  blobs_lr: 10000
  blobs_lr: 1000  # new parameter
  multi_stage_meanfield_param {
    num_iterations: 10
    compatibility_mode: POTTS
    threshold: 2
    theta_alpha: 160
    theta_beta: 3
    theta_gamma: 3
    spatial_filter_weight: 3
    bilateral_filter_weight: 5
  }
}
```
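For context on why multipliers like 10000 matter: Caffe multiplies the solver's `base_lr` by each `blobs_lr` entry to get a per-blob step size. A tiny illustration with a made-up `base_lr` (the blob labels are informal names for the three CRF parameter blobs, not real Caffe identifiers):

```python
base_lr = 1e-13  # hypothetical solver base_lr
blobs_lr = {"spatial_kernel": 10000, "bilateral_kernel": 10000, "compatibility": 1000}

# Effective per-blob learning rate = base_lr * blobs_lr multiplier.
effective = {name: base_lr * mult for name, mult in blobs_lr.items()}
print(effective["spatial_kernel"])  # 1e-09
```

With a very small `base_lr`, a multiplier of 1 would leave the CRF parameters essentially frozen, which can look like "nothing was learned" when you inspect the snapshot.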

AdrianLsk commented 8 years ago

@bittnt I am training the net with your prototxt file, in which I only modified the data layers to use my own data. It looks like this:

```
name: 'TVG_CRF_RNN_SEG'
force_backward: true

layers { top: "data" top: "label_argmax" name: "train_data" type: HDF5_DATA include { phase: TRAIN } hdf5_data_param { source: "caffe/model/train_data.txt" batch_size: 1 } }
layers { top: "data" top: "label_argmax" name: "val_data" type: HDF5_DATA include { phase: TEST } hdf5_data_param { source: "caffe/model/val_data.txt" batch_size: 1 } }

layers { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 64 pad: 100 kernel_size: 3 } }
layers { bottom: 'conv1_1' top: 'conv1_1' name: 'relu1_1' type: RELU }
layers { bottom: 'conv1_1' top: 'conv1_2' name: 'conv1_2' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 64 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv1_2' top: 'conv1_2' name: 'relu1_2' type: RELU }
layers { name: 'pool1' bottom: 'conv1_2' top: 'pool1' type: POOLING pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layers { name: 'conv2_1' bottom: 'pool1' top: 'conv2_1' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv2_1' top: 'conv2_1' name: 'relu2_1' type: RELU }
layers { bottom: 'conv2_1' top: 'conv2_2' name: 'conv2_2' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv2_2' top: 'conv2_2' name: 'relu2_2' type: RELU }
layers { bottom: 'conv2_2' top: 'pool2' name: 'pool2' type: POOLING pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layers { bottom: 'pool2' top: 'conv3_1' name: 'conv3_1' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv3_1' top: 'conv3_1' name: 'relu3_1' type: RELU }
layers { bottom: 'conv3_1' top: 'conv3_2' name: 'conv3_2' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv3_2' top: 'conv3_2' name: 'relu3_2' type: RELU }
layers { bottom: 'conv3_2' top: 'conv3_3' name: 'conv3_3' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv3_3' top: 'conv3_3' name: 'relu3_3' type: RELU }
layers { bottom: 'conv3_3' top: 'pool3' name: 'pool3' type: POOLING pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layers { bottom: 'pool3' top: 'conv4_1' name: 'conv4_1' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv4_1' top: 'conv4_1' name: 'relu4_1' type: RELU }
layers { bottom: 'conv4_1' top: 'conv4_2' name: 'conv4_2' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv4_2' top: 'conv4_2' name: 'relu4_2' type: RELU }
layers { bottom: 'conv4_2' top: 'conv4_3' name: 'conv4_3' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv4_3' top: 'conv4_3' name: 'relu4_3' type: RELU }
layers { bottom: 'conv4_3' top: 'pool4' name: 'pool4' type: POOLING pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layers { bottom: 'pool4' top: 'conv5_1' name: 'conv5_1' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv5_1' top: 'conv5_1' name: 'relu5_1' type: RELU }
layers { bottom: 'conv5_1' top: 'conv5_2' name: 'conv5_2' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv5_2' top: 'conv5_2' name: 'relu5_2' type: RELU }
layers { bottom: 'conv5_2' top: 'conv5_3' name: 'conv5_3' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layers { bottom: 'conv5_3' top: 'conv5_3' name: 'relu5_3' type: RELU }
layers { bottom: 'conv5_3' top: 'pool5' name: 'pool5' type: POOLING pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layers { bottom: 'pool5' top: 'fc6' name: 'fc6' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE kernel_size: 7 num_output: 4096 } }
layers { bottom: 'fc6' top: 'fc6' name: 'relu6' type: RELU }
layers { bottom: 'fc6' top: 'fc6' name: 'drop6' type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { bottom: 'fc6' top: 'fc7' name: 'fc7' type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE kernel_size: 1 num_output: 4096 } }
layers { bottom: 'fc7' top: 'fc7' name: 'relu7' type: RELU }
layers { bottom: 'fc7' top: 'fc7' name: 'drop7' type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { name: 'score-fr' type: CONVOLUTION bottom: 'fc7' top: 'score' blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 3 kernel_size: 1 } }

layers { type: DECONVOLUTION name: 'score2' bottom: 'score' top: 'score2' blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { kernel_size: 4 stride: 2 num_output: 3 } }
layers { name: 'score-pool4' type: CONVOLUTION bottom: 'pool4' top: 'score-pool4' blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 3 kernel_size: 1 } }
layers { type: CROP name: 'crop' bottom: 'score-pool4' bottom: 'score2' top: 'score-pool4c' }
layers { type: ELTWISE name: 'fuse' bottom: 'score2' bottom: 'score-pool4c' top: 'score-fused' eltwise_param { operation: SUM } }
layers { type: DECONVOLUTION name: 'score4' bottom: 'score-fused' top: 'score4' blobs_lr: 1 weight_decay: 1 convolution_param { bias_term: false kernel_size: 4 stride: 2 num_output: 3 } }
layers { name: 'score-pool3' type: CONVOLUTION bottom: 'pool3' top: 'score-pool3' blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { engine: CAFFE num_output: 3 kernel_size: 1 } }
layers { type: CROP name: 'crop' bottom: 'score-pool3' bottom: 'score4' top: 'score-pool3c' }
layers { type: ELTWISE name: 'fuse' bottom: 'score4' bottom: 'score-pool3c' top: 'score-final' eltwise_param { operation: SUM } }
layers { type: DECONVOLUTION name: 'upsample' bottom: 'score-final' top: 'bigscore' blobs_lr: 0 convolution_param { bias_term: false num_output: 3 kernel_size: 16 stride: 8 } }
layers { type: CROP name: 'crop' bottom: 'bigscore' bottom: 'data' top: 'coarse' }
layers { type: SPLIT name: 'splitting' bottom: 'coarse' top: 'unary' top: 'Q0' }

layers { name: "inference1" type: MULTI_STAGE_MEANFIELD bottom: "unary" bottom: "Q0" bottom: "data" top: "pred" blobs_lr: 10 blobs_lr: 10 blobs_lr: 10 # new parameter
  multi_stage_meanfield_param { num_iterations: 10 compatibility_mode: POTTS threshold: 2 theta_alpha: 160 theta_beta: 3 theta_gamma: 3 spatial_filter_weight: 3 bilateral_filter_weight: 5 } }
layers { bottom: "pred" bottom: "label_argmax" top: "loss" name: "loss" type: SOFTMAX_LOSS }
```

After a snapshot is saved during training, I load this prototxt together with the .caffemodel file in Python to check the parameters. The weights still have their initial zero values.
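As a sketch of that inspection, written framework-free so it runs without Caffe: given a mapping from layer name to lists of flat weight blobs (with pycaffe you would build such a mapping from `net.params`; the layer names and values below are made up), flag layers whose weights are still entirely zero:

```python
def all_zero_layers(params, tol=0.0):
    # Flag layers whose every blob is entirely (near-)zero, i.e. layers
    # that look like they never received a trained value from the snapshot.
    return [name for name, blobs in params.items()
            if all(abs(v) <= tol for blob in blobs for v in blob)]

params = {
    "conv1_1": [[0.0, 0.0, 0.0], [0.0]],  # suspicious: still at initialization
    "score-fr": [[0.01, -0.2], [0.0]],    # has non-zero learned weights
}
print(all_zero_layers(params))  # ['conv1_1']
```

If every trainable layer shows up in the flagged list after loading a snapshot, the snapshot's weights almost certainly never matched the prototxt's layer names.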

I've tried training the net with both smaller and larger values for the CRF's learning rate, and also without the CRF itself. The outcome is always the same as described above.

bittnt commented 8 years ago

I am not sure whether your script is correct, but could you make sure that you have the correct data and labels in both the training and testing phases? Ideally, for a semantic image segmentation problem, the input image and its corresponding label map should have the same resolution.

Check out Jon Long and Evan Shelhamer's scripts:

https://gist.github.com/shelhamer/80667189b218ad570e82

```
name: "FCN"
layer {
  name: "data"
  type: "Data"
  top: "data"
  include { phase: TRAIN }
  transform_param {
    mean_value: 104.00699
    mean_value: 116.66877
    mean_value: 122.67892
  }
  data_param {
    source: "../../data/pascal-context/pascal-context-train-lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "label"
  type: "Data"
  top: "label"
  include { phase: TRAIN }
  data_param {
    source: "../../data/pascal-context/pascal-context-train-gt59-lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  include { phase: TEST }
  transform_param {
    mean_value: 104.00699
    mean_value: 116.66877
    mean_value: 122.67892
  }
  data_param {
    source: "../../data/pascal-context/pascal-context-val-lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "label"
  type: "Data"
  top: "label"
  include { phase: TEST }
  data_param {
    source: "../../data/pascal-context/pascal-context-val-gt59-lmdb"
    batch_size: 1
    backend: LMDB
  }
}
```
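One quick way to apply that advice, again sketched without any framework dependency: verify that each data/label pair shares batch size and spatial resolution, while channel counts may differ. Shapes are in Caffe's N x C x H x W order, and the example shapes below are made up.

```python
def shapes_match(data_shape, label_shape):
    # Batch size and trailing H, W must agree; channels may differ
    # (e.g. a 3-channel image vs. a 1-channel label map).
    return (data_shape[0] == label_shape[0]
            and data_shape[-2:] == label_shape[-2:])

print(shapes_match((1, 3, 500, 500), (1, 1, 500, 500)))  # True
print(shapes_match((1, 3, 500, 500), (1, 1, 250, 250)))  # False
```

Running a check like this over every HDF5 batch before training catches resolution mismatches that would otherwise surface as confusing loss or snapshot behaviour.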


sadeepj commented 7 years ago

Closing old issues with no recent activity.