BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Loss diverges in a few iterations #1282

Closed: BestSonny closed this issue 9 years ago

BestSonny commented 9 years ago

Can anybody explain this? Am I doing something wrong with my net design, or is it something else? Later in training, the loss just goes to NaN. I think the divergence described in the title is what causes this result.

I1015 10:46:48.905999 11365 caffe.cpp:99] Use GPU with device ID 0
I1015 10:46:51.345105 11365 caffe.cpp:107] Starting Optimization
I1015 10:46:51.345240 11365 solver.cpp:32] Initializing solver from parameters: test_iter: 1000 test_interval: 1000 base_lr: 0.01 display: 200 max_iter: 450000 lr_policy: "step" gamma: 0.1 momentum: 0.9 weight_decay: 0.0005 stepsize: 10000 snapshot: 10000 snapshot_prefix: "icdar_train" solver_mode: GPU net: "train_val.prototxt"
I1015 10:46:51.345363 11365 solver.cpp:67] Creating training net from net file: train_val.prototxt
I1015 10:46:51.345728 11365 net.cpp:275] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I1015 10:46:51.345762 11365 net.cpp:275] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I1015 10:46:51.345878 11365 net.cpp:39] Initializing net from parameters:
name: "Net"
layers { top: "data" top: "label" name: "data" type: DATA data_param { source: "levelDB/icdar_train_lmdb" batch_size: 256 backend: LMDB } include { phase: TRAIN } }
layers { bottom: "data" top: "conv1" name: "conv1" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 48 kernel_size: 9 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "conv1" top: "conv2" name: "conv2" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 64 kernel_size: 9 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "conv2" top: "conv2" name: "drop2" type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { bottom: "conv2" top: "convCaseInsensive" name: "convCaseInsensive" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 128 kernel_size: 8 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "convCaseInsensive" top: "convCaseInsensive" name: "drop3" type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { bottom: "convCaseInsensive" top: "convCaseInsensiveSecond" name: "convCaseInsensiveSecond" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 36 kernel_size: 1 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "convCaseInsensiveSecond" bottom: "label" top: "loss" name: "loss" type: SOFTMAX_LOSS }
state { phase: TRAIN }
I1015 10:46:51.346560 11365 net.cpp:67] Creating Layer data
I1015 10:46:51.346585 11365 net.cpp:356] data -> data
I1015 10:46:51.346613 11365 net.cpp:356] data -> label
I1015 10:46:51.346633 11365 net.cpp:96] Setting up data
I1015 10:46:51.346757 11365 data_layer.cpp:68] Opening lmdb levelDB/icdar_train_lmdb
I1015 10:46:51.346807 11365 data_layer.cpp:128] output data size: 256,3,24,24
I1015 10:46:51.347970 11365 net.cpp:103] Top shape: 256 3 24 24 (442368)
I1015 10:46:51.347993 11365 net.cpp:103] Top shape: 256 1 1 1 (256)
I1015 10:46:51.348012 11365 net.cpp:67] Creating Layer conv1
I1015 10:46:51.348026 11365 net.cpp:394] conv1 <- data
I1015 10:46:51.348058 11365 net.cpp:356] conv1 -> conv1
I1015 10:46:51.348088 11365 net.cpp:96] Setting up conv1
I1015 10:46:51.349117 11365 net.cpp:103] Top shape: 256 48 16 16 (3145728)
I1015 10:46:51.349169 11365 net.cpp:67] Creating Layer conv2
I1015 10:46:51.349187 11365 net.cpp:394] conv2 <- conv1
I1015 10:46:51.349207 11365 net.cpp:356] conv2 -> conv2
I1015 10:46:51.349225 11365 net.cpp:96] Setting up conv2
I1015 10:46:51.360640 11365 net.cpp:103] Top shape: 256 64 8 8 (1048576)
I1015 10:46:51.360690 11365 net.cpp:67] Creating Layer drop2
I1015 10:46:51.360738 11365 net.cpp:394] drop2 <- conv2
I1015 10:46:51.360757 11365 net.cpp:345] drop2 -> conv2 (in-place)
I1015 10:46:51.360774 11365 net.cpp:96] Setting up drop2
I1015 10:46:51.360792 11365 net.cpp:103] Top shape: 256 64 8 8 (1048576)
I1015 10:46:51.360810 11365 net.cpp:67] Creating Layer convCaseInsensive
I1015 10:46:51.360823 11365 net.cpp:394] convCaseInsensive <- conv2
I1015 10:46:51.360875 11365 net.cpp:356] convCaseInsensive -> convCaseInsensive
I1015 10:46:51.360894 11365 net.cpp:96] Setting up convCaseInsensive
I1015 10:46:51.385164 11365 net.cpp:103] Top shape: 256 128 1 1 (32768)
I1015 10:46:51.385223 11365 net.cpp:67] Creating Layer drop3
I1015 10:46:51.385239 11365 net.cpp:394] drop3 <- convCaseInsensive
I1015 10:46:51.385260 11365 net.cpp:345] drop3 -> convCaseInsensive (in-place)
I1015 10:46:51.385279 11365 net.cpp:96] Setting up drop3
I1015 10:46:51.385294 11365 net.cpp:103] Top shape: 256 128 1 1 (32768)
I1015 10:46:51.385313 11365 net.cpp:67] Creating Layer convCaseInsensiveSecond
I1015 10:46:51.385325 11365 net.cpp:394] convCaseInsensiveSecond <- convCaseInsensive
I1015 10:46:51.385347 11365 net.cpp:356] convCaseInsensiveSecond -> convCaseInsensiveSecond
I1015 10:46:51.385365 11365 net.cpp:96] Setting up convCaseInsensiveSecond
I1015 10:46:51.385601 11365 net.cpp:103] Top shape: 256 36 1 1 (9216)
I1015 10:46:51.385629 11365 net.cpp:67] Creating Layer loss
I1015 10:46:51.385644 11365 net.cpp:394] loss <- convCaseInsensiveSecond
I1015 10:46:51.385658 11365 net.cpp:394] loss <- label
I1015 10:46:51.385673 11365 net.cpp:356] loss -> loss
I1015 10:46:51.385689 11365 net.cpp:96] Setting up loss
I1015 10:46:51.385710 11365 net.cpp:103] Top shape: 1 1 1 1 (1)
I1015 10:46:51.385725 11365 net.cpp:109] with loss weight 1
I1015 10:46:51.385774 11365 net.cpp:170] loss needs backward computation.
I1015 10:46:51.385792 11365 net.cpp:170] convCaseInsensiveSecond needs backward computation.
I1015 10:46:51.385807 11365 net.cpp:170] drop3 needs backward computation.
I1015 10:46:51.385819 11365 net.cpp:170] convCaseInsensive needs backward computation.
I1015 10:46:51.385838 11365 net.cpp:170] drop2 needs backward computation.
I1015 10:46:51.385851 11365 net.cpp:170] conv2 needs backward computation.
I1015 10:46:51.385866 11365 net.cpp:170] conv1 needs backward computation.
I1015 10:46:51.385884 11365 net.cpp:172] data does not need backward computation.
I1015 10:46:51.385900 11365 net.cpp:208] This network produces output loss
I1015 10:46:51.385922 11365 net.cpp:467] Collecting Learning Rate and Weight Decay.
I1015 10:46:51.385941 11365 net.cpp:219] Network initialization done.
I1015 10:46:51.385956 11365 net.cpp:220] Memory required for data: 23041028
I1015 10:46:51.386291 11365 solver.cpp:151] Creating test net (#0) specified by net file: train_val.prototxt
I1015 10:46:51.386332 11365 net.cpp:275] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I1015 10:46:51.386464 11365 net.cpp:39] Initializing net from parameters:
name: "Net"
layers { top: "data" top: "label" name: "data" type: DATA data_param { source: "levelDB/icdar_test_lmdb" batch_size: 50 backend: LMDB } include { phase: TEST } }
layers { bottom: "data" top: "conv1" name: "conv1" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 48 kernel_size: 9 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "conv1" top: "conv2" name: "conv2" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 64 kernel_size: 9 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "conv2" top: "conv2" name: "drop2" type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { bottom: "conv2" top: "convCaseInsensive" name: "convCaseInsensive" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 128 kernel_size: 8 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "convCaseInsensive" top: "convCaseInsensive" name: "drop3" type: DROPOUT dropout_param { dropout_ratio: 0.5 } }
layers { bottom: "convCaseInsensive" top: "convCaseInsensiveSecond" name: "convCaseInsensiveSecond" type: CONVOLUTION blobs_lr: 1 blobs_lr: 2 weight_decay: 1 weight_decay: 0 convolution_param { num_output: 36 kernel_size: 1 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } }
layers { bottom: "convCaseInsensiveSecond" bottom: "label" top: "accuracy" name: "accuracy" type: ACCURACY include { phase: TEST } }
layers { bottom: "convCaseInsensiveSecond" bottom: "label" top: "loss" name: "loss" type: SOFTMAX_LOSS }
state { phase: TEST }
I1015 10:46:51.387393 11365 net.cpp:67] Creating Layer data
I1015 10:46:51.387423 11365 net.cpp:356] data -> data
I1015 10:46:51.387452 11365 net.cpp:356] data -> label
I1015 10:46:51.387480 11365 net.cpp:96] Setting up data
I1015 10:46:51.387570 11365 data_layer.cpp:68] Opening lmdb levelDB/icdar_test_lmdb
I1015 10:46:51.387614 11365 data_layer.cpp:128] output data size: 50,3,24,24
I1015 10:46:51.387886 11365 net.cpp:103] Top shape: 50 3 24 24 (86400)
I1015 10:46:51.387913 11365 net.cpp:103] Top shape: 50 1 1 1 (50)
I1015 10:46:51.387943 11365 net.cpp:67] Creating Layer label_data_1_split
I1015 10:46:51.387969 11365 net.cpp:394] label_data_1_split <- label
I1015 10:46:51.387991 11365 net.cpp:356] label_data_1_split -> label_data_1_split_0
I1015 10:46:51.388023 11365 net.cpp:356] label_data_1_split -> label_data_1_split_1
I1015 10:46:51.388047 11365 net.cpp:96] Setting up label_data_1_split
I1015 10:46:51.388072 11365 net.cpp:103] Top shape: 50 1 1 1 (50)
I1015 10:46:51.388092 11365 net.cpp:103] Top shape: 50 1 1 1 (50)
I1015 10:46:51.388118 11365 net.cpp:67] Creating Layer conv1
I1015 10:46:51.388139 11365 net.cpp:394] conv1 <- data
I1015 10:46:51.388165 11365 net.cpp:356] conv1 -> conv1
I1015 10:46:51.388192 11365 net.cpp:96] Setting up conv1
I1015 10:46:51.389050 11365 net.cpp:103] Top shape: 50 48 16 16 (614400)
I1015 10:46:51.389094 11365 net.cpp:67] Creating Layer conv2
I1015 10:46:51.389120 11365 net.cpp:394] conv2 <- conv1
I1015 10:46:51.389147 11365 net.cpp:356] conv2 -> conv2
I1015 10:46:51.389174 11365 net.cpp:96] Setting up conv2
I1015 10:46:51.401437 11365 net.cpp:103] Top shape: 50 64 8 8 (204800)
I1015 10:46:51.401482 11365 net.cpp:67] Creating Layer drop2
I1015 10:46:51.401499 11365 net.cpp:394] drop2 <- conv2
I1015 10:46:51.401515 11365 net.cpp:345] drop2 -> conv2 (in-place)
I1015 10:46:51.401531 11365 net.cpp:96] Setting up drop2
I1015 10:46:51.401546 11365 net.cpp:103] Top shape: 50 64 8 8 (204800)
I1015 10:46:51.401568 11365 net.cpp:67] Creating Layer convCaseInsensive
I1015 10:46:51.401582 11365 net.cpp:394] convCaseInsensive <- conv2
I1015 10:46:51.401598 11365 net.cpp:356] convCaseInsensive -> convCaseInsensive
I1015 10:46:51.401615 11365 net.cpp:96] Setting up convCaseInsensive
I1015 10:46:51.425907 11365 net.cpp:103] Top shape: 50 128 1 1 (6400)
I1015 10:46:51.425969 11365 net.cpp:67] Creating Layer drop3
I1015 10:46:51.425987 11365 net.cpp:394] drop3 <- convCaseInsensive
I1015 10:46:51.426007 11365 net.cpp:345] drop3 -> convCaseInsensive (in-place)
I1015 10:46:51.426025 11365 net.cpp:96] Setting up drop3
I1015 10:46:51.426041 11365 net.cpp:103] Top shape: 50 128 1 1 (6400)
I1015 10:46:51.426059 11365 net.cpp:67] Creating Layer convCaseInsensiveSecond
I1015 10:46:51.426090 11365 net.cpp:394] convCaseInsensiveSecond <- convCaseInsensive
I1015 10:46:51.426108 11365 net.cpp:356] convCaseInsensiveSecond -> convCaseInsensiveSecond
I1015 10:46:51.426139 11365 net.cpp:96] Setting up convCaseInsensiveSecond
I1015 10:46:51.426408 11365 net.cpp:103] Top shape: 50 36 1 1 (1800)
I1015 10:46:51.426435 11365 net.cpp:67] Creating Layer convCaseInsensiveSecond_convCaseInsensiveSecond_0_split
I1015 10:46:51.426450 11365 net.cpp:394] convCaseInsensiveSecond_convCaseInsensiveSecond_0_split <- convCaseInsensiveSecond
I1015 10:46:51.426468 11365 net.cpp:356] convCaseInsensiveSecond_convCaseInsensiveSecond_0_split -> convCaseInsensiveSecond_convCaseInsensiveSecond_0_split_0
I1015 10:46:51.426491 11365 net.cpp:356] convCaseInsensiveSecond_convCaseInsensiveSecond_0_split -> convCaseInsensiveSecond_convCaseInsensiveSecond_0_split_1
I1015 10:46:51.426508 11365 net.cpp:96] Setting up convCaseInsensiveSecond_convCaseInsensiveSecond_0_split
I1015 10:46:51.426527 11365 net.cpp:103] Top shape: 50 36 1 1 (1800)
I1015 10:46:51.426540 11365 net.cpp:103] Top shape: 50 36 1 1 (1800)
I1015 10:46:51.426558 11365 net.cpp:67] Creating Layer accuracy
I1015 10:46:51.426573 11365 net.cpp:394] accuracy <- convCaseInsensiveSecond_convCaseInsensiveSecond_0_split_0
I1015 10:46:51.426591 11365 net.cpp:394] accuracy <- label_data_1_split_0
I1015 10:46:51.426607 11365 net.cpp:356] accuracy -> accuracy
I1015 10:46:51.426623 11365 net.cpp:96] Setting up accuracy
I1015 10:46:51.426638 11365 net.cpp:103] Top shape: 1 1 1 1 (1)
I1015 10:46:51.426659 11365 net.cpp:67] Creating Layer loss
I1015 10:46:51.426676 11365 net.cpp:394] loss <- convCaseInsensiveSecond_convCaseInsensiveSecond_0_split_1
I1015 10:46:51.426692 11365 net.cpp:394] loss <- label_data_1_split_1
I1015 10:46:51.426710 11365 net.cpp:356] loss -> loss
I1015 10:46:51.426726 11365 net.cpp:96] Setting up loss
I1015 10:46:51.426743 11365 net.cpp:103] Top shape: 1 1 1 1 (1)
I1015 10:46:51.426756 11365 net.cpp:109] with loss weight 1
I1015 10:46:51.426777 11365 net.cpp:170] loss needs backward computation.
I1015 10:46:51.426790 11365 net.cpp:172] accuracy does not need backward computation.
I1015 10:46:51.426802 11365 net.cpp:170] convCaseInsensiveSecond_convCaseInsensiveSecond_0_split needs backward computation.
I1015 10:46:51.426815 11365 net.cpp:170] convCaseInsensiveSecond needs backward computation.
I1015 10:46:51.426827 11365 net.cpp:170] drop3 needs backward computation.
I1015 10:46:51.426839 11365 net.cpp:170] convCaseInsensive needs backward computation.
I1015 10:46:51.426854 11365 net.cpp:170] drop2 needs backward computation.
I1015 10:46:51.426867 11365 net.cpp:170] conv2 needs backward computation.
I1015 10:46:51.426882 11365 net.cpp:170] conv1 needs backward computation.
I1015 10:46:51.426934 11365 net.cpp:172] label_data_1_split does not need backward computation.
I1015 10:46:51.426946 11365 net.cpp:172] data does not need backward computation.
I1015 10:46:51.426962 11365 net.cpp:208] This network produces output accuracy
I1015 10:46:51.426980 11365 net.cpp:208] This network produces output loss
I1015 10:46:51.427003 11365 net.cpp:467] Collecting Learning Rate and Weight Decay.
I1015 10:46:51.427023 11365 net.cpp:219] Network initialization done.
I1015 10:46:51.427041 11365 net.cpp:220] Memory required for data: 4515008
I1015 10:46:51.427086 11365 solver.cpp:41] Solver scaffolding done.
I1015 10:46:51.427126 11365 solver.cpp:160] Solving Net
I1015 10:46:51.427165 11365 solver.cpp:247] Iteration 0, Testing net (#0)
I1015 10:47:19.943495 11365 solver.cpp:298] Test net output #0: accuracy = 0.0273202
I1015 10:47:19.943835 11365 solver.cpp:298] Test net output #1: loss = 3.9109 (* 1 = 3.9109 loss)
I1015 10:47:20.212116 11365 solver.cpp:191] Iteration 0, loss = 5.27274
I1015 10:47:20.212206 11365 solver.cpp:206] Train net output #0: loss = 5.27274 (* 1 = 5.27274 loss)
I1015 10:47:20.212280 11365 solver.cpp:403] Iteration 0, lr = 0.01
I1015 10:48:19.794435 11365 solver.cpp:191] Iteration 200, loss = nan
I1015 10:48:19.796959 11365 solver.cpp:206] Train net output #0: loss = nan (* 1 = nan loss)
I1015 10:48:19.796989 11365 solver.cpp:403] Iteration 200, lr = 0.01

sguada commented 9 years ago

Just try a smaller base_lr
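For example, you could keep your solver exactly as logged above and only lower base_lr by 10x. This is just a sketch: 0.001 is a starting point, not a tuned value, and you may need to adjust it further either way. With base_lr 0.01 and momentum 0.9, the effective step can easily be large enough to blow up the weights and drive the loss to nan.

    # solver.prototxt -- identical to the logged solver except base_lr
    test_iter: 1000
    test_interval: 1000
    base_lr: 0.001        # was 0.01; untuned suggestion, lower further if it still diverges
    display: 200
    max_iter: 450000
    lr_policy: "step"
    gamma: 0.1
    momentum: 0.9
    weight_decay: 0.0005
    stepsize: 10000
    snapshot: 10000
    snapshot_prefix: "icdar_train"
    solver_mode: GPU
    net: "train_val.prototxt"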

Sergio

BestSonny commented 9 years ago

It works. Thanks a lot.

Also, I wonder whether there is any existing implementation of "Maxout", proposed by Ian J. Goodfellow, which is combined with dropout to improve classification performance.

If not, I will have to put together a solution myself; a rough idea is sketched below.
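One workaround I am considering, assuming I understand the SLICE and ELTWISE layers correctly: give a convolution 2*k outputs, slice its output into two k-channel pieces along the channel axis, and take their elementwise max, which is maxout with two pieces. An untested sketch in the same old-style prototxt syntax as my net (the layer and blob names here are made up):

    # conv_m is assumed to be a CONVOLUTION layer with num_output: 2*k
    layers {
      name: "slice_m"
      type: SLICE
      bottom: "conv_m"
      top: "piece1"
      top: "piece2"
      slice_param { slice_dim: 1 }     # split the 2*k channels into two k-channel blobs
    }
    layers {
      name: "maxout_m"
      type: ELTWISE
      bottom: "piece1"
      bottom: "piece2"
      top: "maxout_m"
      eltwise_param { operation: MAX } # elementwise max of the two pieces
    }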

I would also appreciate any pointers to documentation and reference material about Caffe.

shelhamer commented 9 years ago

Check any of the documentation on the project home page http://caffe.berkeleyvision.org/ and ask questions on the caffe-users mailing list.