wy1iu / sphereface

Implementation for <SphereFace: Deep Hypersphere Embedding for Face Recognition> in CVPR'17.
MIT License
1.58k stars 543 forks

loss=87.3365 #103

Open · LaviLiu opened this issue 6 years ago

LaviLiu commented 6 years ago

When I train sphereface-20 on casia_clean, the loss stays at 87.3365. After I reduce the learning rate, the loss slowly climbs from a smaller value back up to 87.3365. What is going on here? Can anyone help me? Here is my log:

` I0717 22:54:56.291251 23375 caffe.cpp:218] Using GPUs 0, 1 I0717 22:54:56.306454 23375 caffe.cpp:223] GPU 0: GeForce GTX TITAN X I0717 22:54:56.307116 23375 caffe.cpp:223] GPU 1: GeForce GTX TITAN X I0717 22:54:56.756727 23375 solver.cpp:44] Initializing solver from parameters: base_lr: 0.001 display: 100 max_iter: 28000 lr_policy: "multistep" gamma: 0.1 momentum: 0.9 weight_decay: 0.0005 snapshot_prefix: "result/sphereface_model" solver_mode: GPU device_id: 0 net: "code/sphereface_model.prototxt" train_state { level: 0 stage: "" } stepvalue: 16000 stepvalue: 24000 stepvalue: 28000 I0717 22:54:56.756784 23375 solver.cpp:87] Creating training net from net file: code/sphereface_model.prototxt I0717 22:54:56.758580 23375 net.cpp:51] Initializing net from parameters: name: "SpherefaceNet-20" state { phase: TRAIN level: 0 stage: "" } layer { name: "data" type: "ImageData" top: "data" top: "label" transform_param { scale: 0.0078125 mirror: true mean_value: 127.5 mean_value: 127.5 mean_value: 127.5 } image_data_param { source: "data/CASIA-WebFace-112X96.txt" batch_size: 256 shuffle: true } } layer { name: "conv1_1" type: "Convolution" bottom: "data" top: "conv1_1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 1 kernel_size: 3 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "relu1_1" type: "PReLU" bottom: "conv1_1" top: "conv1_1" } layer { name: "conv1_2" type: "Convolution" bottom: "conv1_1" top: "conv1_2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 64 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu1_2" type: "PReLU" bottom: "conv1_2" top: "conv1_2" } layer { name: "conv1_3" type: "Convolution" bottom: "conv1_2" top: "conv1_3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 64 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu1_3" type: "PReLU" bottom: "conv1_3" top: "conv1_3" } layer { name: "res1_3" type: "Eltwise" bottom: "conv1_1" bottom: "conv1_3" top: "res1_3" eltwise_param { operation: SUM } } layer { name: "conv2_1" type: "Convolution" bottom: "res1_3" top: "conv2_1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "relu2_1" type: "PReLU" bottom: "conv2_1" top: "conv2_1" } layer { name: "conv2_2" type: "Convolution" bottom: "conv2_1" top: "conv2_2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu2_2" type: "PReLU" bottom: "conv2_2" top: "conv2_2" } layer { name: "conv2_3" type: "Convolution" bottom: "conv2_2" top: "conv2_3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu2_3" type: "PReLU" bottom:
"conv2_3" top: "conv2_3" } layer { name: "res2_3" type: "Eltwise" bottom: "conv2_1" bottom: "conv2_3" top: "res2_3" eltwise_param { operation: SUM } } layer { name: "conv2_4" type: "Convolution" bottom: "res2_3" top: "conv2_4" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu2_4" type: "PReLU" bottom: "conv2_4" top: "conv2_4" } layer { name: "conv2_5" type: "Convolution" bottom: "conv2_4" top: "conv2_5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu2_5" type: "PReLU" bottom: "conv2_5" top: "conv2_5" } layer { name: "res2_5" type: "Eltwise" bottom: "res2_3" bottom: "conv2_5" top: "res2_5" eltwise_param { operation: SUM } } layer { name: "conv3_1" type: "Convolution" bottom: "res2_5" top: "conv3_1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_1" type: "PReLU" bottom: "conv3_1" top: "conv3_1" } layer { name: "conv3_2" type: "Convolution" bottom: "conv3_1" top: "conv3_2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_2" type: "PReLU" bottom: "conv3_2" top: "conv3_2" } layer { name: "conv3_3" type: "Convolution" bottom: "conv3_2" top: "conv3_3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_3" type: "PReLU" bottom: "conv3_3" top: "conv3_3" } layer { name: "res3_3" type: "Eltwise" bottom: "conv3_1" bottom: "conv3_3" top: "res3_3" eltwise_param { operation: SUM } } layer { name: "conv3_4" type: "Convolution" bottom: "res3_3" top: "conv3_4" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_4" type: "PReLU" bottom: "conv3_4" top: "conv3_4" } layer { name: "conv3_5" type: "Convolution" bottom: "conv3_4" top: "conv3_5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_5" type: "PReLU" bottom: "conv3_5" top: "conv3_5" } layer { name: "res3_5" type: "Eltwise" bottom: "res3_3" bottom: "conv3_5" top: "res3_5" eltwise_param { operation: SUM } } layer { name: "conv3_6" type: "Convolution" bottom: "res3_5" top: "conv3_6" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_6" type: "PReLU" bottom: "conv3_6" top: "conv3_6" } layer { name: 
"conv3_7" type: "Convolution" bottom: "conv3_6" top: "conv3_7" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_7" type: "PReLU" bottom: "conv3_7" top: "conv3_7" } layer { name: "res3_7" type: "Eltwise" bottom: "res3_5" bottom: "conv3_7" top: "res3_7" eltwise_param { operation: SUM } } layer { name: "conv3_8" type: "Convolution" bottom: "res3_7" top: "conv3_8" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_8" type: "PReLU" bottom: "conv3_8" top: "conv3_8" } layer { name: "conv3_9" type: "Convolution" bottom: "conv3_8" top: "conv3_9" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3_9" type: "PReLU" bottom: "conv3_9" top: "conv3_9" } layer { name: "res3_9" type: "Eltwise" bottom: "res3_7" bottom: "conv3_9" top: "res3_9" eltwise_param { operation: SUM } } layer { name: "conv4_1" type: "Convolution" bottom: "res3_9" top: "conv4_1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 512 pad: 1 kernel_size: 3 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "relu4_1" type: "PReLU" bottom: "conv4_1" top: "conv4_1" } layer { name: "conv4_2" type: "Convolution" bottom: "conv4_1" top: "conv4_2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 512 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu4_2" type: "PReLU" bottom: "conv4_2" top: "conv4_2" } layer { name: "conv4_3" type: "Convolution" bottom: "conv4_2" top: "conv4_3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0 } convolution_param { num_output: 512 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu4_3" type: "PReLU" bottom: "conv4_3" top: "conv4_3" } layer { name: "res4_3" type: "Eltwise" bottom: "conv4_1" bottom: "conv4_3" top: "res4_3" eltwise_param { operation: SUM } } layer { name: "fc5" type: "InnerProduct" bottom: "res4_3" top: "fc5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 512 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "fc6" type: "MarginInnerProduct" bottom: "fc5" bottom: "label" top: "fc6" top: "lambda" param { lr_mult: 1 decay_mult: 1 } margin_inner_product_param { num_output: 10572 type: QUADRUPLE weight_filler { type: "xavier" } base: 1000 gamma: 0.12 power: 1 iteration: 0 lambda_min: 5 } } layer { name: "softmax_loss" type: "SoftmaxWithLoss" bottom: "fc6" bottom: "label" top: "softmax_loss" } I0717 22:54:56.758893 23375 layer_factory.hpp:77] Creating layer data I0717 22:54:56.758935 23375 net.cpp:84] Creating Layer data I0717 22:54:56.758946 23375 net.cpp:380] data -> data I0717 22:54:56.759002 23375 net.cpp:380] data -> label I0717 22:54:56.759483 23375 
image_data_layer.cpp:38] Opening file data/CASIA-WebFace-112X96.txt I0717 22:54:56.768184 23375 image_data_layer.cpp:53] Shuffling data I0717 22:54:56.769104 23375 image_data_layer.cpp:63] A total of 29485 images. I0717 22:54:56.770115 23375 image_data_layer.cpp:90] output data size: 256,3,112,96 I0717 22:54:56.844602 23375 net.cpp:122] Setting up data I0717 22:54:56.844641 23375 net.cpp:129] Top shape: 256 3 112 96 (8257536) I0717 22:54:56.844651 23375 net.cpp:129] Top shape: 256 (256) I0717 22:54:56.844660 23375 net.cpp:137] Memory required for data: 33031168 I0717 22:54:56.844672 23375 layer_factory.hpp:77] Creating layer label_data_1_split I0717 22:54:56.844698 23375 net.cpp:84] Creating Layer label_data_1_split I0717 22:54:56.844709 23375 net.cpp:406] label_data_1_split <- label I0717 22:54:56.844729 23375 net.cpp:380] label_data_1_split -> label_data_1_split_0 I0717 22:54:56.844749 23375 net.cpp:380] label_data_1_split -> label_data_1_split_1 I0717 22:54:56.844832 23375 net.cpp:122] Setting up label_data_1_split I0717 22:54:56.844846 23375 net.cpp:129] Top shape: 256 (256) I0717 22:54:56.844853 23375 net.cpp:129] Top shape: 256 (256) I0717 22:54:56.844859 23375 net.cpp:137] Memory required for data: 33033216 I0717 22:54:56.844866 23375 layer_factory.hpp:77] Creating layer conv1_1 I0717 22:54:56.844892 23375 net.cpp:84] Creating Layer conv1_1 I0717 22:54:56.844900 23375 net.cpp:406] conv1_1 <- data I0717 22:54:56.844916 23375 net.cpp:380] conv1_1 -> conv1_1 I0717 22:54:57.131815 23375 net.cpp:122] Setting up conv1_1 I0717 22:54:57.131852 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.131860 23375 net.cpp:137] Memory required for data: 209193984 I0717 22:54:57.131904 23375 layer_factory.hpp:77] Creating layer relu1_1 I0717 22:54:57.131924 23375 net.cpp:84] Creating Layer relu1_1 I0717 22:54:57.131940 23375 net.cpp:406] relu1_1 <- conv1_1 I0717 22:54:57.131953 23375 net.cpp:367] relu1_1 -> conv1_1 (in-place) I0717 22:54:57.132736 23375 net.cpp:122] Setting up relu1_1 I0717 22:54:57.132753 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.132761 23375 net.cpp:137] Memory required for data: 385354752 I0717 22:54:57.132786 23375 layer_factory.hpp:77] Creating layer conv1_1_relu1_1_0_split I0717 22:54:57.132798 23375 net.cpp:84] Creating Layer conv1_1_relu1_1_0_split I0717 22:54:57.132815 23375 net.cpp:406] conv1_1_relu1_1_0_split <- conv1_1 I0717 22:54:57.132833 23375 net.cpp:380] conv1_1_relu1_1_0_split -> conv1_1_relu1_1_0_split_0 I0717 22:54:57.132848 23375 net.cpp:380] conv1_1_relu1_1_0_split -> conv1_1_relu1_1_0_split_1 I0717 22:54:57.132897 23375 net.cpp:122] Setting up conv1_1_relu1_1_0_split I0717 22:54:57.132910 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.132920 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.132925 23375 net.cpp:137] Memory required for data: 737676288 I0717 22:54:57.132931 23375 layer_factory.hpp:77] Creating layer conv1_2 I0717 22:54:57.132951 23375 net.cpp:84] Creating Layer conv1_2 I0717 22:54:57.132959 23375 net.cpp:406] conv1_2 <- conv1_1_relu1_1_0_split_0 I0717 22:54:57.132972 23375 net.cpp:380] conv1_2 -> conv1_2 I0717 22:54:57.135821 23375 net.cpp:122] Setting up conv1_2 I0717 22:54:57.135839 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.135848 23375 net.cpp:137] Memory required for data: 913837056 I0717 22:54:57.135875 23375 layer_factory.hpp:77] Creating layer relu1_2 I0717 22:54:57.135892 23375 net.cpp:84] Creating Layer relu1_2 I0717 
22:54:57.135905 23375 net.cpp:406] relu1_2 <- conv1_2 I0717 22:54:57.135941 23375 net.cpp:367] relu1_2 -> conv1_2 (in-place) I0717 22:54:57.136690 23375 net.cpp:122] Setting up relu1_2 I0717 22:54:57.136708 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.136715 23375 net.cpp:137] Memory required for data: 1089997824 I0717 22:54:57.136737 23375 layer_factory.hpp:77] Creating layer conv1_3 I0717 22:54:57.136754 23375 net.cpp:84] Creating Layer conv1_3 I0717 22:54:57.136765 23375 net.cpp:406] conv1_3 <- conv1_2 I0717 22:54:57.136778 23375 net.cpp:380] conv1_3 -> conv1_3 I0717 22:54:57.139103 23375 net.cpp:122] Setting up conv1_3 I0717 22:54:57.139122 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.139129 23375 net.cpp:137] Memory required for data: 1266158592 I0717 22:54:57.139154 23375 layer_factory.hpp:77] Creating layer relu1_3 I0717 22:54:57.139165 23375 net.cpp:84] Creating Layer relu1_3 I0717 22:54:57.139176 23375 net.cpp:406] relu1_3 <- conv1_3 I0717 22:54:57.139187 23375 net.cpp:367] relu1_3 -> conv1_3 (in-place) I0717 22:54:57.139940 23375 net.cpp:122] Setting up relu1_3 I0717 22:54:57.139956 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.139963 23375 net.cpp:137] Memory required for data: 1442319360 I0717 22:54:57.139989 23375 layer_factory.hpp:77] Creating layer res1_3 I0717 22:54:57.140007 23375 net.cpp:84] Creating Layer res1_3 I0717 22:54:57.140017 23375 net.cpp:406] res1_3 <- conv1_1_relu1_1_0_split_1 I0717 22:54:57.140027 23375 net.cpp:406] res1_3 <- conv1_3 I0717 22:54:57.140038 23375 net.cpp:380] res1_3 -> res1_3 I0717 22:54:57.140074 23375 net.cpp:122] Setting up res1_3 I0717 22:54:57.140089 23375 net.cpp:129] Top shape: 256 64 56 48 (44040192) I0717 22:54:57.140096 23375 net.cpp:137] Memory required for data: 1618480128 I0717 22:54:57.140102 23375 layer_factory.hpp:77] Creating layer conv2_1 I0717 22:54:57.140118 23375 net.cpp:84] Creating Layer conv2_1 I0717 22:54:57.140126 23375 net.cpp:406] conv2_1 <- res1_3 I0717 22:54:57.140138 23375 net.cpp:380] conv2_1 -> conv2_1 I0717 22:54:57.142500 23375 net.cpp:122] Setting up conv2_1 I0717 22:54:57.142519 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.142526 23375 net.cpp:137] Memory required for data: 1706560512 I0717 22:54:57.142552 23375 layer_factory.hpp:77] Creating layer relu2_1 I0717 22:54:57.142565 23375 net.cpp:84] Creating Layer relu2_1 I0717 22:54:57.142580 23375 net.cpp:406] relu2_1 <- conv2_1 I0717 22:54:57.142594 23375 net.cpp:367] relu2_1 -> conv2_1 (in-place) I0717 22:54:57.142751 23375 net.cpp:122] Setting up relu2_1 I0717 22:54:57.142768 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.142776 23375 net.cpp:137] Memory required for data: 1794640896 I0717 22:54:57.142784 23375 layer_factory.hpp:77] Creating layer conv2_1_relu2_1_0_split I0717 22:54:57.142796 23375 net.cpp:84] Creating Layer conv2_1_relu2_1_0_split I0717 22:54:57.142808 23375 net.cpp:406] conv2_1_relu2_1_0_split <- conv2_1 I0717 22:54:57.142819 23375 net.cpp:380] conv2_1_relu2_1_0_split -> conv2_1_relu2_1_0_split_0 I0717 22:54:57.142832 23375 net.cpp:380] conv2_1_relu2_1_0_split -> conv2_1_relu2_1_0_split_1 I0717 22:54:57.142879 23375 net.cpp:122] Setting up conv2_1_relu2_1_0_split I0717 22:54:57.142891 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.142900 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.142906 23375 net.cpp:137] Memory required for data: 1970801664 I0717 22:54:57.142913 23375 
layer_factory.hpp:77] Creating layer conv2_2 I0717 22:54:57.142928 23375 net.cpp:84] Creating Layer conv2_2 I0717 22:54:57.142937 23375 net.cpp:406] conv2_2 <- conv2_1_relu2_1_0_split_0 I0717 22:54:57.142949 23375 net.cpp:380] conv2_2 -> conv2_2 I0717 22:54:57.149407 23375 net.cpp:122] Setting up conv2_2 I0717 22:54:57.149425 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.149433 23375 net.cpp:137] Memory required for data: 2058882048 I0717 22:54:57.149458 23375 layer_factory.hpp:77] Creating layer relu2_2 I0717 22:54:57.149469 23375 net.cpp:84] Creating Layer relu2_2 I0717 22:54:57.149497 23375 net.cpp:406] relu2_2 <- conv2_2 I0717 22:54:57.149509 23375 net.cpp:367] relu2_2 -> conv2_2 (in-place) I0717 22:54:57.149673 23375 net.cpp:122] Setting up relu2_2 I0717 22:54:57.149691 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.149698 23375 net.cpp:137] Memory required for data: 2146962432 I0717 22:54:57.149708 23375 layer_factory.hpp:77] Creating layer conv2_3 I0717 22:54:57.149724 23375 net.cpp:84] Creating Layer conv2_3 I0717 22:54:57.149734 23375 net.cpp:406] conv2_3 <- conv2_2 I0717 22:54:57.149746 23375 net.cpp:380] conv2_3 -> conv2_3 I0717 22:54:57.156194 23375 net.cpp:122] Setting up conv2_3 I0717 22:54:57.156213 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.156221 23375 net.cpp:137] Memory required for data: 2235042816 I0717 22:54:57.156239 23375 layer_factory.hpp:77] Creating layer relu2_3 I0717 22:54:57.156251 23375 net.cpp:84] Creating Layer relu2_3 I0717 22:54:57.156260 23375 net.cpp:406] relu2_3 <- conv2_3 I0717 22:54:57.156270 23375 net.cpp:367] relu2_3 -> conv2_3 (in-place) I0717 22:54:57.156453 23375 net.cpp:122] Setting up relu2_3 I0717 22:54:57.156468 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.156476 23375 net.cpp:137] Memory required for data: 2323123200 I0717 22:54:57.156484 23375 layer_factory.hpp:77] Creating layer res2_3 I0717 22:54:57.156496 23375 net.cpp:84] Creating Layer res2_3 I0717 22:54:57.156503 23375 net.cpp:406] res2_3 <- conv2_1_relu2_1_0_split_1 I0717 22:54:57.156512 23375 net.cpp:406] res2_3 <- conv2_3 I0717 22:54:57.156522 23375 net.cpp:380] res2_3 -> res2_3 I0717 22:54:57.156563 23375 net.cpp:122] Setting up res2_3 I0717 22:54:57.156575 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.156582 23375 net.cpp:137] Memory required for data: 2411203584 I0717 22:54:57.156589 23375 layer_factory.hpp:77] Creating layer res2_3_res2_3_0_split I0717 22:54:57.156597 23375 net.cpp:84] Creating Layer res2_3_res2_3_0_split I0717 22:54:57.156605 23375 net.cpp:406] res2_3_res2_3_0_split <- res2_3 I0717 22:54:57.156615 23375 net.cpp:380] res2_3_res2_3_0_split -> res2_3_res2_3_0_split_0 I0717 22:54:57.156627 23375 net.cpp:380] res2_3_res2_3_0_split -> res2_3_res2_3_0_split_1 I0717 22:54:57.156688 23375 net.cpp:122] Setting up res2_3_res2_3_0_split I0717 22:54:57.156702 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.156710 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.156716 23375 net.cpp:137] Memory required for data: 2587364352 I0717 22:54:57.156723 23375 layer_factory.hpp:77] Creating layer conv2_4 I0717 22:54:57.156738 23375 net.cpp:84] Creating Layer conv2_4 I0717 22:54:57.156746 23375 net.cpp:406] conv2_4 <- res2_3_res2_3_0_split_0 I0717 22:54:57.156759 23375 net.cpp:380] conv2_4 -> conv2_4 I0717 22:54:57.163316 23375 net.cpp:122] Setting up conv2_4 I0717 22:54:57.163336 23375 net.cpp:129] Top shape: 256 128 
28 24 (22020096) I0717 22:54:57.163343 23375 net.cpp:137] Memory required for data: 2675444736 I0717 22:54:57.163357 23375 layer_factory.hpp:77] Creating layer relu2_4 I0717 22:54:57.163368 23375 net.cpp:84] Creating Layer relu2_4 I0717 22:54:57.163377 23375 net.cpp:406] relu2_4 <- conv2_4 I0717 22:54:57.163386 23375 net.cpp:367] relu2_4 -> conv2_4 (in-place) I0717 22:54:57.163552 23375 net.cpp:122] Setting up relu2_4 I0717 22:54:57.163566 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.163573 23375 net.cpp:137] Memory required for data: 2763525120 I0717 22:54:57.163583 23375 layer_factory.hpp:77] Creating layer conv2_5 I0717 22:54:57.163599 23375 net.cpp:84] Creating Layer conv2_5 I0717 22:54:57.163609 23375 net.cpp:406] conv2_5 <- conv2_4 I0717 22:54:57.163625 23375 net.cpp:380] conv2_5 -> conv2_5 I0717 22:54:57.174243 23375 net.cpp:122] Setting up conv2_5 I0717 22:54:57.174264 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.174273 23375 net.cpp:137] Memory required for data: 2851605504 I0717 22:54:57.174298 23375 layer_factory.hpp:77] Creating layer relu2_5 I0717 22:54:57.174329 23375 net.cpp:84] Creating Layer relu2_5 I0717 22:54:57.174340 23375 net.cpp:406] relu2_5 <- conv2_5 I0717 22:54:57.174355 23375 net.cpp:367] relu2_5 -> conv2_5 (in-place) I0717 22:54:57.174533 23375 net.cpp:122] Setting up relu2_5 I0717 22:54:57.174547 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.174554 23375 net.cpp:137] Memory required for data: 2939685888 I0717 22:54:57.174564 23375 layer_factory.hpp:77] Creating layer res2_5 I0717 22:54:57.174582 23375 net.cpp:84] Creating Layer res2_5 I0717 22:54:57.174589 23375 net.cpp:406] res2_5 <- res2_3_res2_3_0_split_1 I0717 22:54:57.174598 23375 net.cpp:406] res2_5 <- conv2_5 I0717 22:54:57.174608 23375 net.cpp:380] res2_5 -> res2_5 I0717 22:54:57.174649 23375 net.cpp:122] Setting up res2_5 I0717 22:54:57.174660 23375 net.cpp:129] Top shape: 256 128 28 24 (22020096) I0717 22:54:57.174669 23375 net.cpp:137] Memory required for data: 3027766272 I0717 22:54:57.174676 23375 layer_factory.hpp:77] Creating layer conv3_1 I0717 22:54:57.174693 23375 net.cpp:84] Creating Layer conv3_1 I0717 22:54:57.174703 23375 net.cpp:406] conv3_1 <- res2_5 I0717 22:54:57.174715 23375 net.cpp:380] conv3_1 -> conv3_1 I0717 22:54:57.179035 23375 net.cpp:122] Setting up conv3_1 I0717 22:54:57.179054 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.179062 23375 net.cpp:137] Memory required for data: 3071806464 I0717 22:54:57.179086 23375 layer_factory.hpp:77] Creating layer relu3_1 I0717 22:54:57.179097 23375 net.cpp:84] Creating Layer relu3_1 I0717 22:54:57.179105 23375 net.cpp:406] relu3_1 <- conv3_1 I0717 22:54:57.179116 23375 net.cpp:367] relu3_1 -> conv3_1 (in-place) I0717 22:54:57.179275 23375 net.cpp:122] Setting up relu3_1 I0717 22:54:57.179291 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.179297 23375 net.cpp:137] Memory required for data: 3115846656 I0717 22:54:57.179306 23375 layer_factory.hpp:77] Creating layer conv3_1_relu3_1_0_split I0717 22:54:57.179316 23375 net.cpp:84] Creating Layer conv3_1_relu3_1_0_split I0717 22:54:57.179324 23375 net.cpp:406] conv3_1_relu3_1_0_split <- conv3_1 I0717 22:54:57.179334 23375 net.cpp:380] conv3_1_relu3_1_0_split -> conv3_1_relu3_1_0_split_0 I0717 22:54:57.179352 23375 net.cpp:380] conv3_1_relu3_1_0_split -> conv3_1_relu3_1_0_split_1 I0717 22:54:57.179412 23375 net.cpp:122] Setting up conv3_1_relu3_1_0_split I0717 22:54:57.179425 
23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.179433 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.179440 23375 net.cpp:137] Memory required for data: 3203927040 I0717 22:54:57.179446 23375 layer_factory.hpp:77] Creating layer conv3_2 I0717 22:54:57.179461 23375 net.cpp:84] Creating Layer conv3_2 I0717 22:54:57.179473 23375 net.cpp:406] conv3_2 <- conv3_1_relu3_1_0_split_0 I0717 22:54:57.179486 23375 net.cpp:380] conv3_2 -> conv3_2 I0717 22:54:57.201071 23375 net.cpp:122] Setting up conv3_2 I0717 22:54:57.201093 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.201102 23375 net.cpp:137] Memory required for data: 3247967232 I0717 22:54:57.201125 23375 layer_factory.hpp:77] Creating layer relu3_2 I0717 22:54:57.201138 23375 net.cpp:84] Creating Layer relu3_2 I0717 22:54:57.201146 23375 net.cpp:406] relu3_2 <- conv3_2 I0717 22:54:57.201158 23375 net.cpp:367] relu3_2 -> conv3_2 (in-place) I0717 22:54:57.201321 23375 net.cpp:122] Setting up relu3_2 I0717 22:54:57.201335 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.201344 23375 net.cpp:137] Memory required for data: 3292007424 I0717 22:54:57.201352 23375 layer_factory.hpp:77] Creating layer conv3_3 I0717 22:54:57.201369 23375 net.cpp:84] Creating Layer conv3_3 I0717 22:54:57.201380 23375 net.cpp:406] conv3_3 <- conv3_2 I0717 22:54:57.201391 23375 net.cpp:380] conv3_3 -> conv3_3 I0717 22:54:57.222266 23375 net.cpp:122] Setting up conv3_3 I0717 22:54:57.222290 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.222296 23375 net.cpp:137] Memory required for data: 3336047616 I0717 22:54:57.222321 23375 layer_factory.hpp:77] Creating layer relu3_3 I0717 22:54:57.222363 23375 net.cpp:84] Creating Layer relu3_3 I0717 22:54:57.222374 23375 net.cpp:406] relu3_3 <- conv3_3 I0717 22:54:57.222385 23375 net.cpp:367] relu3_3 -> conv3_3 (in-place) I0717 22:54:57.222537 23375 net.cpp:122] Setting up relu3_3 I0717 22:54:57.222550 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.222558 23375 net.cpp:137] Memory required for data: 3380087808 I0717 22:54:57.222573 23375 layer_factory.hpp:77] Creating layer res3_3 I0717 22:54:57.222589 23375 net.cpp:84] Creating Layer res3_3 I0717 22:54:57.222600 23375 net.cpp:406] res3_3 <- conv3_1_relu3_1_0_split_1 I0717 22:54:57.222609 23375 net.cpp:406] res3_3 <- conv3_3 I0717 22:54:57.222620 23375 net.cpp:380] res3_3 -> res3_3 I0717 22:54:57.222666 23375 net.cpp:122] Setting up res3_3 I0717 22:54:57.222678 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.222684 23375 net.cpp:137] Memory required for data: 3424128000 I0717 22:54:57.222692 23375 layer_factory.hpp:77] Creating layer res3_3_res3_3_0_split I0717 22:54:57.222704 23375 net.cpp:84] Creating Layer res3_3_res3_3_0_split I0717 22:54:57.222712 23375 net.cpp:406] res3_3_res3_3_0_split <- res3_3 I0717 22:54:57.222721 23375 net.cpp:380] res3_3_res3_3_0_split -> res3_3_res3_3_0_split_0 I0717 22:54:57.222738 23375 net.cpp:380] res3_3_res3_3_0_split -> res3_3_res3_3_0_split_1 I0717 22:54:57.222789 23375 net.cpp:122] Setting up res3_3_res3_3_0_split I0717 22:54:57.222800 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.222808 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.222815 23375 net.cpp:137] Memory required for data: 3512208384 I0717 22:54:57.222821 23375 layer_factory.hpp:77] Creating layer conv3_4 I0717 22:54:57.222836 23375 net.cpp:84] Creating Layer conv3_4 I0717 
22:54:57.222846 23375 net.cpp:406] conv3_4 <- res3_3_res3_3_0_split_0 I0717 22:54:57.222857 23375 net.cpp:380] conv3_4 -> conv3_4 I0717 22:54:57.244524 23375 net.cpp:122] Setting up conv3_4 I0717 22:54:57.244549 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.244556 23375 net.cpp:137] Memory required for data: 3556248576 I0717 22:54:57.244570 23375 layer_factory.hpp:77] Creating layer relu3_4 I0717 22:54:57.244581 23375 net.cpp:84] Creating Layer relu3_4 I0717 22:54:57.244590 23375 net.cpp:406] relu3_4 <- conv3_4 I0717 22:54:57.244601 23375 net.cpp:367] relu3_4 -> conv3_4 (in-place) I0717 22:54:57.244756 23375 net.cpp:122] Setting up relu3_4 I0717 22:54:57.244771 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.244777 23375 net.cpp:137] Memory required for data: 3600288768 I0717 22:54:57.244787 23375 layer_factory.hpp:77] Creating layer conv3_5 I0717 22:54:57.244804 23375 net.cpp:84] Creating Layer conv3_5 I0717 22:54:57.244814 23375 net.cpp:406] conv3_5 <- conv3_4 I0717 22:54:57.244827 23375 net.cpp:380] conv3_5 -> conv3_5 I0717 22:54:57.265857 23375 net.cpp:122] Setting up conv3_5 I0717 22:54:57.265877 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.265885 23375 net.cpp:137] Memory required for data: 3644328960 I0717 22:54:57.265909 23375 layer_factory.hpp:77] Creating layer relu3_5 I0717 22:54:57.265921 23375 net.cpp:84] Creating Layer relu3_5 I0717 22:54:57.265929 23375 net.cpp:406] relu3_5 <- conv3_5 I0717 22:54:57.265940 23375 net.cpp:367] relu3_5 -> conv3_5 (in-place) I0717 22:54:57.266609 23375 net.cpp:122] Setting up relu3_5 I0717 22:54:57.266626 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.266633 23375 net.cpp:137] Memory required for data: 3688369152 I0717 22:54:57.266655 23375 layer_factory.hpp:77] Creating layer res3_5 I0717 22:54:57.266667 23375 net.cpp:84] Creating Layer res3_5 I0717 22:54:57.266681 23375 net.cpp:406] res3_5 <- res3_3_res3_3_0_split_1 I0717 22:54:57.266691 23375 net.cpp:406] res3_5 <- conv3_5 I0717 22:54:57.266706 23375 net.cpp:380] res3_5 -> res3_5 I0717 22:54:57.266752 23375 net.cpp:122] Setting up res3_5 I0717 22:54:57.266765 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.266794 23375 net.cpp:137] Memory required for data: 3732409344 I0717 22:54:57.266803 23375 layer_factory.hpp:77] Creating layer res3_5_res3_5_0_split I0717 22:54:57.266814 23375 net.cpp:84] Creating Layer res3_5_res3_5_0_split I0717 22:54:57.266826 23375 net.cpp:406] res3_5_res3_5_0_split <- res3_5 I0717 22:54:57.266837 23375 net.cpp:380] res3_5_res3_5_0_split -> res3_5_res3_5_0_split_0 I0717 22:54:57.266854 23375 net.cpp:380] res3_5_res3_5_0_split -> res3_5_res3_5_0_split_1 I0717 22:54:57.266907 23375 net.cpp:122] Setting up res3_5_res3_5_0_split I0717 22:54:57.266927 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.266937 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.266942 23375 net.cpp:137] Memory required for data: 3820489728 I0717 22:54:57.266948 23375 layer_factory.hpp:77] Creating layer conv3_6 I0717 22:54:57.266965 23375 net.cpp:84] Creating Layer conv3_6 I0717 22:54:57.266974 23375 net.cpp:406] conv3_6 <- res3_5_res3_5_0_split_0 I0717 22:54:57.266988 23375 net.cpp:380] conv3_6 -> conv3_6 I0717 22:54:57.290452 23375 net.cpp:122] Setting up conv3_6 I0717 22:54:57.290483 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.290491 23375 net.cpp:137] Memory required for data: 3864529920 I0717 22:54:57.290510 
23375 layer_factory.hpp:77] Creating layer relu3_6 I0717 22:54:57.290529 23375 net.cpp:84] Creating Layer relu3_6 I0717 22:54:57.290551 23375 net.cpp:406] relu3_6 <- conv3_6 I0717 22:54:57.290565 23375 net.cpp:367] relu3_6 -> conv3_6 (in-place) I0717 22:54:57.290731 23375 net.cpp:122] Setting up relu3_6 I0717 22:54:57.290746 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.290753 23375 net.cpp:137] Memory required for data: 3908570112 I0717 22:54:57.290763 23375 layer_factory.hpp:77] Creating layer conv3_7 I0717 22:54:57.290784 23375 net.cpp:84] Creating Layer conv3_7 I0717 22:54:57.290796 23375 net.cpp:406] conv3_7 <- conv3_6 I0717 22:54:57.290808 23375 net.cpp:380] conv3_7 -> conv3_7 I0717 22:54:57.311866 23375 net.cpp:122] Setting up conv3_7 I0717 22:54:57.311885 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.311893 23375 net.cpp:137] Memory required for data: 3952610304 I0717 22:54:57.311918 23375 layer_factory.hpp:77] Creating layer relu3_7 I0717 22:54:57.311928 23375 net.cpp:84] Creating Layer relu3_7 I0717 22:54:57.311935 23375 net.cpp:406] relu3_7 <- conv3_7 I0717 22:54:57.311945 23375 net.cpp:367] relu3_7 -> conv3_7 (in-place) I0717 22:54:57.312115 23375 net.cpp:122] Setting up relu3_7 I0717 22:54:57.312129 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.312136 23375 net.cpp:137] Memory required for data: 3996650496 I0717 22:54:57.312145 23375 layer_factory.hpp:77] Creating layer res3_7 I0717 22:54:57.312158 23375 net.cpp:84] Creating Layer res3_7 I0717 22:54:57.312171 23375 net.cpp:406] res3_7 <- res3_5_res3_5_0_split_1 I0717 22:54:57.312180 23375 net.cpp:406] res3_7 <- conv3_7 I0717 22:54:57.312188 23375 net.cpp:380] res3_7 -> res3_7 I0717 22:54:57.312233 23375 net.cpp:122] Setting up res3_7 I0717 22:54:57.312247 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.312252 23375 net.cpp:137] Memory required for data: 4040690688 I0717 22:54:57.312260 23375 layer_factory.hpp:77] Creating layer res3_7_res3_7_0_split I0717 22:54:57.312269 23375 net.cpp:84] Creating Layer res3_7_res3_7_0_split I0717 22:54:57.312276 23375 net.cpp:406] res3_7_res3_7_0_split <- res3_7 I0717 22:54:57.312288 23375 net.cpp:380] res3_7_res3_7_0_split -> res3_7_res3_7_0_split_0 I0717 22:54:57.312304 23375 net.cpp:380] res3_7_res3_7_0_split -> res3_7_res3_7_0_split_1 I0717 22:54:57.312376 23375 net.cpp:122] Setting up res3_7_res3_7_0_split I0717 22:54:57.312389 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.312397 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.312404 23375 net.cpp:137] Memory required for data: 4128771072 I0717 22:54:57.312410 23375 layer_factory.hpp:77] Creating layer conv3_8 I0717 22:54:57.312428 23375 net.cpp:84] Creating Layer conv3_8 I0717 22:54:57.312453 23375 net.cpp:406] conv3_8 <- res3_7_res3_7_0_split_0 I0717 22:54:57.312470 23375 net.cpp:380] conv3_8 -> conv3_8 I0717 22:54:57.333389 23375 net.cpp:122] Setting up conv3_8 I0717 22:54:57.333410 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.333417 23375 net.cpp:137] Memory required for data: 4172811264 I0717 22:54:57.333441 23375 layer_factory.hpp:77] Creating layer relu3_8 I0717 22:54:57.333456 23375 net.cpp:84] Creating Layer relu3_8 I0717 22:54:57.333463 23375 net.cpp:406] relu3_8 <- conv3_8 I0717 22:54:57.333477 23375 net.cpp:367] relu3_8 -> conv3_8 (in-place) I0717 22:54:57.333642 23375 net.cpp:122] Setting up relu3_8 I0717 22:54:57.333657 23375 net.cpp:129] Top shape: 256 256 14 
12 (11010048) I0717 22:54:57.333662 23375 net.cpp:137] Memory required for data: 4216851456 I0717 22:54:57.333673 23375 layer_factory.hpp:77] Creating layer conv3_9 I0717 22:54:57.333690 23375 net.cpp:84] Creating Layer conv3_9 I0717 22:54:57.333700 23375 net.cpp:406] conv3_9 <- conv3_8 I0717 22:54:57.333714 23375 net.cpp:380] conv3_9 -> conv3_9 I0717 22:54:57.354790 23375 net.cpp:122] Setting up conv3_9 I0717 22:54:57.354810 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.354817 23375 net.cpp:137] Memory required for data: 4260891648 I0717 22:54:57.354841 23375 layer_factory.hpp:77] Creating layer relu3_9 I0717 22:54:57.354854 23375 net.cpp:84] Creating Layer relu3_9 I0717 22:54:57.354863 23375 net.cpp:406] relu3_9 <- conv3_9 I0717 22:54:57.354876 23375 net.cpp:367] relu3_9 -> conv3_9 (in-place) I0717 22:54:57.355044 23375 net.cpp:122] Setting up relu3_9 I0717 22:54:57.355058 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.355065 23375 net.cpp:137] Memory required for data: 4304931840 I0717 22:54:57.355074 23375 layer_factory.hpp:77] Creating layer res3_9 I0717 22:54:57.355087 23375 net.cpp:84] Creating Layer res3_9 I0717 22:54:57.355099 23375 net.cpp:406] res3_9 <- res3_7_res3_7_0_split_1 I0717 22:54:57.355108 23375 net.cpp:406] res3_9 <- conv3_9 I0717 22:54:57.355123 23375 net.cpp:380] res3_9 -> res3_9 I0717 22:54:57.355175 23375 net.cpp:122] Setting up res3_9 I0717 22:54:57.355188 23375 net.cpp:129] Top shape: 256 256 14 12 (11010048) I0717 22:54:57.355195 23375 net.cpp:137] Memory required for data: 4348972032 I0717 22:54:57.355201 23375 layer_factory.hpp:77] Creating layer conv4_1 I0717 22:54:57.355218 23375 net.cpp:84] Creating Layer conv4_1 I0717 22:54:57.355227 23375 net.cpp:406] conv4_1 <- res3_9 I0717 22:54:57.355242 23375 net.cpp:380] conv4_1 -> conv4_1 I0717 22:54:57.368865 23375 net.cpp:122] Setting up conv4_1 I0717 22:54:57.368886 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.368893 23375 net.cpp:137] Memory required for data: 4370992128 I0717 22:54:57.368917 23375 layer_factory.hpp:77] Creating layer relu4_1 I0717 22:54:57.368932 23375 net.cpp:84] Creating Layer relu4_1 I0717 22:54:57.368939 23375 net.cpp:406] relu4_1 <- conv4_1 I0717 22:54:57.368949 23375 net.cpp:367] relu4_1 -> conv4_1 (in-place) I0717 22:54:57.369094 23375 net.cpp:122] Setting up relu4_1 I0717 22:54:57.369108 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.369114 23375 net.cpp:137] Memory required for data: 4393012224 I0717 22:54:57.369124 23375 layer_factory.hpp:77] Creating layer conv4_1_relu4_1_0_split I0717 22:54:57.369137 23375 net.cpp:84] Creating Layer conv4_1_relu4_1_0_split I0717 22:54:57.369148 23375 net.cpp:406] conv4_1_relu4_1_0_split <- conv4_1 I0717 22:54:57.369158 23375 net.cpp:380] conv4_1_relu4_1_0_split -> conv4_1_relu4_1_0_split_0 I0717 22:54:57.369179 23375 net.cpp:380] conv4_1_relu4_1_0_split -> conv4_1_relu4_1_0_split_1 I0717 22:54:57.369243 23375 net.cpp:122] Setting up conv4_1_relu4_1_0_split I0717 22:54:57.369257 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.369266 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.369272 23375 net.cpp:137] Memory required for data: 4437052416 I0717 22:54:57.369278 23375 layer_factory.hpp:77] Creating layer conv4_2 I0717 22:54:57.369324 23375 net.cpp:84] Creating Layer conv4_2 I0717 22:54:57.369334 23375 net.cpp:406] conv4_2 <- conv4_1_relu4_1_0_split_0 I0717 22:54:57.369346 23375 net.cpp:380] conv4_2 -> conv4_2 I0717 
22:54:57.449687 23375 net.cpp:122] Setting up conv4_2 I0717 22:54:57.449720 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.449728 23375 net.cpp:137] Memory required for data: 4459072512 I0717 22:54:57.449741 23375 layer_factory.hpp:77] Creating layer relu4_2 I0717 22:54:57.449767 23375 net.cpp:84] Creating Layer relu4_2 I0717 22:54:57.449775 23375 net.cpp:406] relu4_2 <- conv4_2 I0717 22:54:57.449790 23375 net.cpp:367] relu4_2 -> conv4_2 (in-place) I0717 22:54:57.449945 23375 net.cpp:122] Setting up relu4_2 I0717 22:54:57.449959 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.449965 23375 net.cpp:137] Memory required for data: 4481092608 I0717 22:54:57.449975 23375 layer_factory.hpp:77] Creating layer conv4_3 I0717 22:54:57.449996 23375 net.cpp:84] Creating Layer conv4_3 I0717 22:54:57.450006 23375 net.cpp:406] conv4_3 <- conv4_2 I0717 22:54:57.450019 23375 net.cpp:380] conv4_3 -> conv4_3 I0717 22:54:57.530601 23375 net.cpp:122] Setting up conv4_3 I0717 22:54:57.530637 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.530643 23375 net.cpp:137] Memory required for data: 4503112704 I0717 22:54:57.530658 23375 layer_factory.hpp:77] Creating layer relu4_3 I0717 22:54:57.530685 23375 net.cpp:84] Creating Layer relu4_3 I0717 22:54:57.530694 23375 net.cpp:406] relu4_3 <- conv4_3 I0717 22:54:57.530706 23375 net.cpp:367] relu4_3 -> conv4_3 (in-place) I0717 22:54:57.531402 23375 net.cpp:122] Setting up relu4_3 I0717 22:54:57.531419 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.531426 23375 net.cpp:137] Memory required for data: 4525132800 I0717 22:54:57.531450 23375 layer_factory.hpp:77] Creating layer res4_3 I0717 22:54:57.531464 23375 net.cpp:84] Creating Layer res4_3 I0717 22:54:57.531476 23375 net.cpp:406] res4_3 <- conv4_1_relu4_1_0_split_1 I0717 22:54:57.531484 23375 net.cpp:406] res4_3 <- conv4_3 I0717 22:54:57.531496 23375 net.cpp:380] res4_3 -> res4_3 I0717 22:54:57.531545 23375 net.cpp:122] Setting up res4_3 I0717 22:54:57.531558 23375 net.cpp:129] Top shape: 256 512 7 6 (5505024) I0717 22:54:57.531564 23375 net.cpp:137] Memory required for data: 4547152896 I0717 22:54:57.531570 23375 layer_factory.hpp:77] Creating layer fc5 I0717 22:54:57.531586 23375 net.cpp:84] Creating Layer fc5 I0717 22:54:57.531597 23375 net.cpp:406] fc5 <- res4_3 I0717 22:54:57.531608 23375 net.cpp:380] fc5 -> fc5 I0717 22:54:57.651867 23375 net.cpp:122] Setting up fc5 I0717 22:54:57.651906 23375 net.cpp:129] Top shape: 256 512 (131072) I0717 22:54:57.651912 23375 net.cpp:137] Memory required for data: 4547677184 I0717 22:54:57.651942 23375 layer_factory.hpp:77] Creating layer fc6 I0717 22:54:57.651970 23375 net.cpp:84] Creating Layer fc6 I0717 22:54:57.651993 23375 net.cpp:406] fc6 <- fc5 I0717 22:54:57.652007 23375 net.cpp:406] fc6 <- label_data_1_split_0 I0717 22:54:57.652024 23375 net.cpp:380] fc6 -> fc6 I0717 22:54:57.652050 23375 net.cpp:380] fc6 -> lambda I0717 22:54:57.710330 23375 net.cpp:122] Setting up fc6 I0717 22:54:57.710366 23375 net.cpp:129] Top shape: 256 10572 (2706432) I0717 22:54:57.710374 23375 net.cpp:129] Top shape: 1 (1) I0717 22:54:57.710379 23375 net.cpp:137] Memory required for data: 4558502916 I0717 22:54:57.710405 23375 layer_factory.hpp:77] Creating layer softmax_loss I0717 22:54:57.710423 23375 net.cpp:84] Creating Layer softmax_loss I0717 22:54:57.710433 23375 net.cpp:406] softmax_loss <- fc6 I0717 22:54:57.710443 23375 net.cpp:406] softmax_loss <- label_data_1_split_1 I0717 22:54:57.710459 23375 net.cpp:380] 
softmax_loss -> softmax_loss I0717 22:54:57.710489 23375 layer_factory.hpp:77] Creating layer softmax_loss I0717 22:54:57.717782 23375 net.cpp:122] Setting up softmax_loss I0717 22:54:57.717809 23375 net.cpp:129] Top shape: (1) I0717 22:54:57.717816 23375 net.cpp:132] with loss weight 1 I0717 22:54:57.717862 23375 net.cpp:137] Memory required for data: 4558502920 I0717 22:54:57.717898 23375 net.cpp:198] softmax_loss needs backward computation. I0717 22:54:57.717913 23375 net.cpp:198] fc6 needs backward computation. I0717 22:54:57.717928 23375 net.cpp:198] fc5 needs backward computation. I0717 22:54:57.717936 23375 net.cpp:198] res4_3 needs backward computation. I0717 22:54:57.717942 23375 net.cpp:198] relu4_3 needs backward computation. I0717 22:54:57.717948 23375 net.cpp:198] conv4_3 needs backward computation. I0717 22:54:57.717954 23375 net.cpp:198] relu4_2 needs backward computation. I0717 22:54:57.717960 23375 net.cpp:198] conv4_2 needs backward computation. I0717 22:54:57.717968 23375 net.cpp:198] conv4_1_relu4_1_0_split needs backward computation. I0717 22:54:57.717974 23375 net.cpp:198] relu4_1 needs backward computation. I0717 22:54:57.717981 23375 net.cpp:198] conv4_1 needs backward computation. I0717 22:54:57.717988 23375 net.cpp:198] res3_9 needs backward computation. I0717 22:54:57.717994 23375 net.cpp:198] relu3_9 needs backward computation. I0717 22:54:57.718001 23375 net.cpp:198] conv3_9 needs backward computation. I0717 22:54:57.718008 23375 net.cpp:198] relu3_8 needs backward computation. I0717 22:54:57.718014 23375 net.cpp:198] conv3_8 needs backward computation. I0717 22:54:57.718022 23375 net.cpp:198] res3_7_res3_7_0_split needs backward computation. I0717 22:54:57.718029 23375 net.cpp:198] res3_7 needs backward computation. I0717 22:54:57.718036 23375 net.cpp:198] relu3_7 needs backward computation. I0717 22:54:57.718042 23375 net.cpp:198] conv3_7 needs backward computation. I0717 22:54:57.718050 23375 net.cpp:198] relu3_6 needs backward computation. I0717 22:54:57.718056 23375 net.cpp:198] conv3_6 needs backward computation. I0717 22:54:57.718062 23375 net.cpp:198] res3_5_res3_5_0_split needs backward computation. I0717 22:54:57.718070 23375 net.cpp:198] res3_5 needs backward computation. I0717 22:54:57.718076 23375 net.cpp:198] relu3_5 needs backward computation. I0717 22:54:57.718083 23375 net.cpp:198] conv3_5 needs backward computation. I0717 22:54:57.718089 23375 net.cpp:198] relu3_4 needs backward computation. I0717 22:54:57.718098 23375 net.cpp:198] conv3_4 needs backward computation. I0717 22:54:57.718104 23375 net.cpp:198] res3_3_res3_3_0_split needs backward computation. I0717 22:54:57.718111 23375 net.cpp:198] res3_3 needs backward computation. I0717 22:54:57.718118 23375 net.cpp:198] relu3_3 needs backward computation. I0717 22:54:57.718125 23375 net.cpp:198] conv3_3 needs backward computation. I0717 22:54:57.718132 23375 net.cpp:198] relu3_2 needs backward computation. I0717 22:54:57.718139 23375 net.cpp:198] conv3_2 needs backward computation. I0717 22:54:57.718153 23375 net.cpp:198] conv3_1_relu3_1_0_split needs backward computation. I0717 22:54:57.718161 23375 net.cpp:198] relu3_1 needs backward computation. I0717 22:54:57.718168 23375 net.cpp:198] conv3_1 needs backward computation. I0717 22:54:57.718176 23375 net.cpp:198] res2_5 needs backward computation. I0717 22:54:57.718184 23375 net.cpp:198] relu2_5 needs backward computation. I0717 22:54:57.718194 23375 net.cpp:198] conv2_5 needs backward computation. 
I0717 22:54:57.718200 23375 net.cpp:198] relu2_4 needs backward computation. I0717 22:54:57.718209 23375 net.cpp:198] conv2_4 needs backward computation. I0717 22:54:57.718217 23375 net.cpp:198] res2_3_res2_3_0_split needs backward computation. I0717 22:54:57.718226 23375 net.cpp:198] res2_3 needs backward computation. I0717 22:54:57.718233 23375 net.cpp:198] relu2_3 needs backward computation. I0717 22:54:57.718241 23375 net.cpp:198] conv2_3 needs backward computation. I0717 22:54:57.718248 23375 net.cpp:198] relu2_2 needs backward computation. I0717 22:54:57.718255 23375 net.cpp:198] conv2_2 needs backward computation. I0717 22:54:57.718263 23375 net.cpp:198] conv2_1_relu2_1_0_split needs backward computation. I0717 22:54:57.718269 23375 net.cpp:198] relu2_1 needs backward computation. I0717 22:54:57.718277 23375 net.cpp:198] conv2_1 needs backward computation. I0717 22:54:57.718286 23375 net.cpp:198] res1_3 needs backward computation. I0717 22:54:57.718307 23375 net.cpp:198] relu1_3 needs backward computation. I0717 22:54:57.718317 23375 net.cpp:198] conv1_3 needs backward computation. I0717 22:54:57.718323 23375 net.cpp:198] relu1_2 needs backward computation. I0717 22:54:57.718330 23375 net.cpp:198] conv1_2 needs backward computation. I0717 22:54:57.718338 23375 net.cpp:198] conv1_1_relu1_1_0_split needs backward computation. I0717 22:54:57.718348 23375 net.cpp:198] relu1_1 needs backward computation. I0717 22:54:57.718359 23375 net.cpp:198] conv1_1 needs backward computation. I0717 22:54:57.718367 23375 net.cpp:200] label_data_1_split does not need backward computation. I0717 22:54:57.718375 23375 net.cpp:200] data does not need backward computation. I0717 22:54:57.718381 23375 net.cpp:242] This network produces output lambda I0717 22:54:57.718389 23375 net.cpp:242] This network produces output softmax_loss I0717 22:54:57.718439 23375 net.cpp:255] Network initialization done. I0717 22:54:57.718657 23375 solver.cpp:56] Solver scaffolding done. I0717 22:54:57.721293 23375 caffe.cpp:248] Starting Optimization I0717 22:54:58.203780 23382 image_data_layer.cpp:38] Opening file data/CASIA-WebFace-112X96.txt I0717 22:54:58.212021 23382 image_data_layer.cpp:53] Shuffling data I0717 22:54:58.212898 23382 image_data_layer.cpp:63] A total of 29485 images. 
I0717 22:54:58.214020 23382 image_data_layer.cpp:90] output data size: 256,3,112,96 I0717 22:54:59.466454 23375 solver.cpp:272] Solving SpherefaceNet-20 I0717 22:54:59.466539 23375 solver.cpp:273] Learning Rate Policy: multistep I0717 22:55:00.315739 23375 solver.cpp:218] Iteration 0 (-1.99574e-17 iter/s, 0.828484s/100 iters), loss = 9.28011 I0717 22:55:00.315799 23375 solver.cpp:237] Train net output #0: lambda = 892.857 I0717 22:55:00.315822 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.28011 ( 1 = 9.28011 loss) I0717 22:55:00.315865 23375 sgd_solver.cpp:105] Iteration 0, lr = 0.001 I0717 22:56:29.649165 23375 solver.cpp:218] Iteration 100 (1.11938 iter/s, 89.3351s/100 iters), loss = 9.26914 I0717 22:56:29.649252 23375 solver.cpp:237] Train net output #0: lambda = 76.2195 I0717 22:56:29.649266 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.26914 ( 1 = 9.26914 loss) I0717 22:56:29.649276 23375 sgd_solver.cpp:105] Iteration 100, lr = 0.001 I0717 22:58:05.758685 23375 solver.cpp:218] Iteration 200 (1.04046 iter/s, 96.1114s/100 iters), loss = 9.34572 I0717 22:58:05.758921 23375 solver.cpp:237] Train net output #0: lambda = 39.8089 I0717 22:58:05.758956 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.34572 ( 1 = 9.34572 loss) I0717 22:58:05.758992 23375 sgd_solver.cpp:105] Iteration 200, lr = 0.001 I0717 22:59:41.488473 23375 solver.cpp:218] Iteration 300 (1.04459 iter/s, 95.7316s/100 iters), loss = 9.36941 I0717 22:59:41.488723 23375 solver.cpp:237] Train net output #0: lambda = 26.9397 I0717 22:59:41.488786 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.36941 ( 1 = 9.36941 loss) I0717 22:59:41.488835 23375 sgd_solver.cpp:105] Iteration 300, lr = 0.001 I0717 23:01:17.329685 23375 solver.cpp:218] Iteration 400 (1.04337 iter/s, 95.843s/100 iters), loss = 9.38536 I0717 23:01:17.329905 23375 solver.cpp:237] Train net output #0: lambda = 20.3583 I0717 23:01:17.329939 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.38536 ( 1 = 9.38536 loss) I0717 23:01:17.329964 23375 sgd_solver.cpp:105] Iteration 400, lr = 0.001 I0717 23:02:53.314219 23375 solver.cpp:218] Iteration 500 (1.04181 iter/s, 95.9864s/100 iters), loss = 9.41078 I0717 23:02:53.314389 23375 solver.cpp:237] Train net output #0: lambda = 16.3613 I0717 23:02:53.314409 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.41078 ( 1 = 9.41078 loss) I0717 23:02:53.314421 23375 sgd_solver.cpp:105] Iteration 500, lr = 0.001 I0717 23:04:29.698570 23375 solver.cpp:218] Iteration 600 (1.03749 iter/s, 96.3862s/100 iters), loss = 9.44096 I0717 23:04:29.698819 23375 solver.cpp:237] Train net output #0: lambda = 13.6761 I0717 23:04:29.698861 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.44096 ( 1 = 9.44096 loss) I0717 23:04:29.698894 23375 sgd_solver.cpp:105] Iteration 600, lr = 0.001 I0717 23:06:05.978163 23375 solver.cpp:218] Iteration 700 (1.03862 iter/s, 96.2814s/100 iters), loss = 9.43819 I0717 23:06:05.978345 23375 solver.cpp:237] Train net output #0: lambda = 11.7481 I0717 23:06:05.978374 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.43819 ( 1 = 9.43819 loss) I0717 23:06:05.978394 23375 sgd_solver.cpp:105] Iteration 700, lr = 0.001 I0717 23:07:42.405726 23375 solver.cpp:218] Iteration 800 (1.03703 iter/s, 96.4294s/100 iters), loss = 9.45485 I0717 23:07:42.405968 23375 solver.cpp:237] Train net output #0: lambda = 10.2965 I0717 23:07:42.406072 23375 solver.cpp:237] Train net output #1: softmax_loss = 9.45485 ( 1 = 9.45485 loss) I0717 23:07:42.406129 
23375 sgd_solver.cpp:105] Iteration 800, lr = 0.001

Iterations     lr                lambda         softmax_loss (logged every 100 iterations)
900-1800       0.001             9.16422 -> 5   9.46535, 9.46661, 9.47498, 9.47666, 9.48519, 9.49182, 9.49072, 9.49102, 9.49192, 9.49069
1900-2800      0.001             5              9.48877, 9.47877, 9.47989, 9.47797, 9.48282, 9.4765, 9.48595, 9.47591, 9.48644, 9.47737
2900-3800      0.001             5              9.47615, 9.48168, 9.47358, 9.48637, 9.48263, 9.4873, 9.48688, 9.48658, 9.49667, 9.48145
3900-4800      0.001             5              9.49019, 9.49466, 9.49888, 9.48687, 9.50055, 9.49493, 9.50341, 9.49508, 9.50086, 9.50207
4900-5800      0.001             5              9.50712, 9.51641, 9.5086, 9.50929, 9.5171, 9.50781, 9.51834, 9.51766, 9.51255, 9.52979
5900-6800      0.001             5              9.52491, 9.52558, 9.5178, 9.53147, 9.52675, 9.5218, 9.5238, 9.53495, 9.55249, 9.54259
6900-7800      0.001             5              9.55164, 9.52944, 9.55609, 9.54575, 9.54339, 9.54715, 9.5382, 9.56639, 9.56359, 9.55839
7900-8800      0.001             5              9.56858, 9.54795, 9.55898, 9.5747, 9.56932, 9.55695, 9.57992, 9.56554, 9.5877, 9.58056
8900-9800      0.001             5              9.59353, 9.59018, 9.59735, 9.6094, 9.59759, 9.624, 9.60335, 9.59501, 9.61592, 9.61773
9900-10800     0.001             5              9.63601, 9.6269, 9.62735, 9.6151, 9.6494, 9.62177, 9.61187, 9.64353, 9.65135, 9.61249
10900-11800    0.001             5              9.65992, 9.65611, 9.64895, 9.67362, 9.65906, 9.6886, 9.70347, 9.6619, 9.66946, 9.75233
11900-12800    0.001             5              9.70765, 9.7229, 9.70516, 9.77852, 9.68515, 9.71064, 9.75937, 9.79198, 9.78138, 9.85041
12900-13800    0.001             5              9.79824, 9.86833, 9.86156, 9.88825, 9.86975, 9.86568, 9.92247, 9.91687, 10.036, 10.0123
13900-14800    0.001             5              9.98887, 10.0788, 10.2177, 10.1459, 10.223, 10.2284, 10.4358, 10.595, 10.593, 10.7012
14900-15800    0.001             5              10.5982, 10.6676, 10.7951, 10.8098, 11.204, 11.0729, 11.6686, 11.3176, 11.325, 11.6198
15900-16800    0.001 -> 0.0001   5              11.4196, 11.9011, 11.5191, 11.7713, 11.9396, 12.5972, 12.3007, 12.1697, 11.5297, 12.0361
16900-17800    0.0001            5              12.9416, 12.3086, 12.5921, 12.9024, 12.1955, 11.9262, 12.3545, 12.3374, 13.0622, 13.0866
17900-18800    0.0001            5              12.2311, 12.9129, 13.2387, 13.4404, 13.0604, 12.8835, 13.5228, 13.7903, 13.8854, 14.1654
18900-19800    0.0001            5              13.4725, 13.3963, 13.9681, 13.8304, 13.8486, 14.1039, 13.9205, 14.9087, 13.4047, 14.8873
19900-20700    0.0001            5              14.4596, 15.204, 14.7785, 15.4901, 14.8394, 14.0352, 14.8281, 15.9396, 14.935

I0718 03:22:02.503970 23382 sgd_solver.cpp:46] MultiStep Status: Iteration 16000, step = 1
I0718 04:42:40.648712 23375 solver.cpp:218] Iteration 20800 (1.00059 iter/s, 99.9413s/100 iters), loss =
15.1079 I0718 04:42:40.648938 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:42:40.648972 23375 solver.cpp:237] Train net output #1: softmax_loss = 15.1079 ( 1 = 15.1079 loss) I0718 04:42:40.648999 23375 sgd_solver.cpp:105] Iteration 20800, lr = 0.0001 I0718 04:44:20.543596 23375 solver.cpp:218] Iteration 20900 (1.00103 iter/s, 99.8968s/100 iters), loss = 15.949 I0718 04:44:20.543844 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:44:20.543882 23375 solver.cpp:237] Train net output #1: softmax_loss = 15.949 ( 1 = 15.949 loss) I0718 04:44:20.543905 23375 sgd_solver.cpp:105] Iteration 20900, lr = 0.0001 I0718 04:46:00.340456 23375 solver.cpp:218] Iteration 21000 (1.00202 iter/s, 99.7988s/100 iters), loss = 15.5318 I0718 04:46:00.340636 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:46:00.340653 23375 solver.cpp:237] Train net output #1: softmax_loss = 15.5318 ( 1 = 15.5318 loss) I0718 04:46:00.340665 23375 sgd_solver.cpp:105] Iteration 21000, lr = 0.0001 I0718 04:47:40.057806 23375 solver.cpp:218] Iteration 21100 (1.00281 iter/s, 99.7193s/100 iters), loss = 16.1853 I0718 04:47:40.058023 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:47:40.058053 23375 solver.cpp:237] Train net output #1: softmax_loss = 16.1853 ( 1 = 16.1853 loss) I0718 04:47:40.058075 23375 sgd_solver.cpp:105] Iteration 21100, lr = 0.0001 I0718 04:49:20.149658 23375 solver.cpp:218] Iteration 21200 (0.999063 iter/s, 100.094s/100 iters), loss = 15.4122 I0718 04:49:20.156378 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:49:20.156399 23375 solver.cpp:237] Train net output #1: softmax_loss = 15.4122 ( 1 = 15.4122 loss) I0718 04:49:20.156412 23375 sgd_solver.cpp:105] Iteration 21200, lr = 0.0001 I0718 04:51:00.201026 23375 solver.cpp:218] Iteration 21300 (0.999532 iter/s, 100.047s/100 iters), loss = 16.8806 I0718 04:51:00.201215 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:51:00.201277 23375 solver.cpp:237] Train net output #1: softmax_loss = 16.8806 ( 1 = 16.8806 loss) I0718 04:51:00.201326 23375 sgd_solver.cpp:105] Iteration 21300, lr = 0.0001 I0718 04:52:39.898344 23375 solver.cpp:218] Iteration 21400 (1.00302 iter/s, 99.6993s/100 iters), loss = 16.6844 I0718 04:52:39.898587 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:52:39.898623 23375 solver.cpp:237] Train net output #1: softmax_loss = 16.6844 ( 1 = 16.6844 loss) I0718 04:52:39.898655 23375 sgd_solver.cpp:105] Iteration 21400, lr = 0.0001 I0718 04:54:19.660748 23375 solver.cpp:218] Iteration 21500 (1.00236 iter/s, 99.7643s/100 iters), loss = 16.895 I0718 04:54:19.660974 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:54:19.661027 23375 solver.cpp:237] Train net output #1: softmax_loss = 16.895 ( 1 = 16.895 loss) I0718 04:54:19.661068 23375 sgd_solver.cpp:105] Iteration 21500, lr = 0.0001 I0718 04:55:59.497836 23375 solver.cpp:218] Iteration 21600 (1.00161 iter/s, 99.839s/100 iters), loss = 17.2402 I0718 04:55:59.498066 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:55:59.498118 23375 solver.cpp:237] Train net output #1: softmax_loss = 17.2402 ( 1 = 17.2402 loss) I0718 04:55:59.498152 23375 sgd_solver.cpp:105] Iteration 21600, lr = 0.0001 I0718 04:57:39.620931 23375 solver.cpp:218] Iteration 21700 (0.998751 iter/s, 100.125s/100 iters), loss = 17.5965 I0718 04:57:39.621168 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:57:39.621240 23375 solver.cpp:237] Train net output #1: softmax_loss = 17.5965 ( 1 = 17.5965 loss) 
I0718 04:57:39.621289 23375 sgd_solver.cpp:105] Iteration 21700, lr = 0.0001 I0718 04:59:19.231623 23375 solver.cpp:218] Iteration 21800 (1.00389 iter/s, 99.6126s/100 iters), loss = 17.853 I0718 04:59:19.231856 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 04:59:19.231904 23375 solver.cpp:237] Train net output #1: softmax_loss = 17.853 ( 1 = 17.853 loss) I0718 04:59:19.231927 23375 sgd_solver.cpp:105] Iteration 21800, lr = 0.0001 I0718 05:00:59.481902 23375 solver.cpp:218] Iteration 21900 (0.997484 iter/s, 100.252s/100 iters), loss = 18.0341 I0718 05:00:59.482137 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:00:59.482177 23375 solver.cpp:237] Train net output #1: softmax_loss = 18.0341 ( 1 = 18.0341 loss) I0718 05:00:59.482203 23375 sgd_solver.cpp:105] Iteration 21900, lr = 0.0001 I0718 05:02:39.687793 23375 solver.cpp:218] Iteration 22000 (0.997926 iter/s, 100.208s/100 iters), loss = 17.6992 I0718 05:02:39.688107 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:02:39.688144 23375 solver.cpp:237] Train net output #1: softmax_loss = 17.6992 ( 1 = 17.6992 loss) I0718 05:02:39.688170 23375 sgd_solver.cpp:105] Iteration 22000, lr = 0.0001 I0718 05:04:19.452347 23375 solver.cpp:218] Iteration 22100 (1.00234 iter/s, 99.7664s/100 iters), loss = 18.5267 I0718 05:04:19.452574 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:04:19.452603 23375 solver.cpp:237] Train net output #1: softmax_loss = 18.5267 ( 1 = 18.5267 loss) I0718 05:04:19.452625 23375 sgd_solver.cpp:105] Iteration 22100, lr = 0.0001 I0718 05:05:59.736675 23375 solver.cpp:218] Iteration 22200 (0.997146 iter/s, 100.286s/100 iters), loss = 18.9672 I0718 05:05:59.736934 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:05:59.736973 23375 solver.cpp:237] Train net output #1: softmax_loss = 18.9672 ( 1 = 18.9672 loss) I0718 05:05:59.737002 23375 sgd_solver.cpp:105] Iteration 22200, lr = 0.0001 I0718 05:07:39.994922 23375 solver.cpp:218] Iteration 22300 (0.997405 iter/s, 100.26s/100 iters), loss = 19.805 I0718 05:07:39.995091 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:07:39.995112 23375 solver.cpp:237] Train net output #1: softmax_loss = 19.805 ( 1 = 19.805 loss) I0718 05:07:39.995126 23375 sgd_solver.cpp:105] Iteration 22300, lr = 0.0001 I0718 05:09:20.517047 23375 solver.cpp:218] Iteration 22400 (0.994786 iter/s, 100.524s/100 iters), loss = 20.6046 I0718 05:09:20.517282 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:09:20.517307 23375 solver.cpp:237] Train net output #1: softmax_loss = 20.6046 ( 1 = 20.6046 loss) I0718 05:09:20.517323 23375 sgd_solver.cpp:105] Iteration 22400, lr = 0.0001 I0718 05:11:00.380619 23375 solver.cpp:218] Iteration 22500 (1.00135 iter/s, 99.8655s/100 iters), loss = 19.7292 I0718 05:11:00.380854 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:11:00.380898 23375 solver.cpp:237] Train net output #1: softmax_loss = 19.7292 ( 1 = 19.7292 loss) I0718 05:11:00.380934 23375 sgd_solver.cpp:105] Iteration 22500, lr = 0.0001 I0718 05:12:40.391530 23375 solver.cpp:218] Iteration 22600 (0.999872 iter/s, 100.013s/100 iters), loss = 20.9109 I0718 05:12:40.391777 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:12:40.391844 23375 solver.cpp:237] Train net output #1: softmax_loss = 20.9109 ( 1 = 20.9109 loss) I0718 05:12:40.391891 23375 sgd_solver.cpp:105] Iteration 22600, lr = 0.0001 I0718 05:14:20.802080 23375 solver.cpp:218] Iteration 22700 (0.995892 iter/s, 100.412s/100 iters), loss = 21.101 
I0718 05:14:20.802263 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:14:20.802278 23375 solver.cpp:237] Train net output #1: softmax_loss = 21.101 ( 1 = 21.101 loss) I0718 05:14:20.802289 23375 sgd_solver.cpp:105] Iteration 22700, lr = 0.0001 I0718 05:16:01.125777 23375 solver.cpp:218] Iteration 22800 (0.996754 iter/s, 100.326s/100 iters), loss = 22.0934 I0718 05:16:01.125977 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:16:01.126011 23375 solver.cpp:237] Train net output #1: softmax_loss = 22.0934 ( 1 = 22.0934 loss) I0718 05:16:01.126039 23375 sgd_solver.cpp:105] Iteration 22800, lr = 0.0001 I0718 05:17:41.214746 23375 solver.cpp:218] Iteration 22900 (0.999092 iter/s, 100.091s/100 iters), loss = 22.1792 I0718 05:17:41.214992 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:17:41.215029 23375 solver.cpp:237] Train net output #1: softmax_loss = 22.1792 ( 1 = 22.1792 loss) I0718 05:17:41.215055 23375 sgd_solver.cpp:105] Iteration 22900, lr = 0.0001 I0718 05:19:21.709095 23375 solver.cpp:218] Iteration 23000 (0.995062 iter/s, 100.496s/100 iters), loss = 25.9184 I0718 05:19:21.709283 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:19:21.709300 23375 solver.cpp:237] Train net output #1: softmax_loss = 25.9184 ( 1 = 25.9184 loss) I0718 05:19:21.709312 23375 sgd_solver.cpp:105] Iteration 23000, lr = 0.0001 I0718 05:21:01.723312 23375 solver.cpp:218] Iteration 23100 (0.999838 iter/s, 100.016s/100 iters), loss = 27.9238 I0718 05:21:01.723593 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:21:01.723645 23375 solver.cpp:237] Train net output #1: softmax_loss = 27.9238 ( 1 = 27.9238 loss) I0718 05:21:01.723678 23375 sgd_solver.cpp:105] Iteration 23100, lr = 0.0001 I0718 05:22:41.868067 23375 solver.cpp:218] Iteration 23200 (0.998536 iter/s, 100.147s/100 iters), loss = 28.737 I0718 05:22:41.868259 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:22:41.868332 23375 solver.cpp:237] Train net output #1: softmax_loss = 28.737 ( 1 = 28.737 loss) I0718 05:22:41.868366 23375 sgd_solver.cpp:105] Iteration 23200, lr = 0.0001 I0718 05:24:22.075830 23375 solver.cpp:218] Iteration 23300 (0.997907 iter/s, 100.21s/100 iters), loss = 31.4656 I0718 05:24:22.076073 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:24:22.076125 23375 solver.cpp:237] Train net output #1: softmax_loss = 31.4656 ( 1 = 31.4656 loss) I0718 05:24:22.076159 23375 sgd_solver.cpp:105] Iteration 23300, lr = 0.0001 I0718 05:26:01.970469 23375 solver.cpp:218] Iteration 23400 (1.00104 iter/s, 99.8965s/100 iters), loss = 35.8096 I0718 05:26:01.970715 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:26:01.970777 23375 solver.cpp:237] Train net output #1: softmax_loss = 35.8096 ( 1 = 35.8096 loss) I0718 05:26:01.970809 23375 sgd_solver.cpp:105] Iteration 23400, lr = 0.0001 I0718 05:27:41.934793 23375 solver.cpp:218] Iteration 23500 (1.00034 iter/s, 99.9662s/100 iters), loss = 38.7695 I0718 05:27:41.935073 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:27:41.935127 23375 solver.cpp:237] Train net output #1: softmax_loss = 38.7695 ( 1 = 38.7695 loss) I0718 05:27:41.935158 23375 sgd_solver.cpp:105] Iteration 23500, lr = 0.0001 I0718 05:29:22.286622 23375 solver.cpp:218] Iteration 23600 (0.996475 iter/s, 100.354s/100 iters), loss = 43.6644 I0718 05:29:22.286840 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:29:22.286864 23375 solver.cpp:237] Train net output #1: softmax_loss = 43.6644 ( 1 = 43.6644 loss) 
I0718 05:29:22.286880 23375 sgd_solver.cpp:105] Iteration 23600, lr = 0.0001 I0718 05:31:02.214758 23375 solver.cpp:218] Iteration 23700 (1.0007 iter/s, 99.9301s/100 iters), loss = 47.6553 I0718 05:31:02.214937 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:31:02.214952 23375 solver.cpp:237] Train net output #1: softmax_loss = 47.6553 ( 1 = 47.6553 loss) I0718 05:31:02.214964 23375 sgd_solver.cpp:105] Iteration 23700, lr = 0.0001 I0718 05:32:42.143663 23375 solver.cpp:218] Iteration 23800 (1.00069 iter/s, 99.9309s/100 iters), loss = 56.6737 I0718 05:32:42.143847 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:32:42.143882 23375 solver.cpp:237] Train net output #1: softmax_loss = 56.6737 ( 1 = 56.6737 loss) I0718 05:32:42.143908 23375 sgd_solver.cpp:105] Iteration 23800, lr = 0.0001 I0718 05:34:22.459383 23375 solver.cpp:218] Iteration 23900 (0.996833 iter/s, 100.318s/100 iters), loss = 61.9464 I0718 05:34:22.459596 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:34:22.459635 23375 solver.cpp:237] Train net output #1: softmax_loss = 61.9464 ( 1 = 61.9464 loss) I0718 05:34:22.459663 23375 sgd_solver.cpp:105] Iteration 23900, lr = 0.0001 I0718 05:36:02.544474 23382 sgd_solver.cpp:46] MultiStep Status: Iteration 24000, step = 2 I0718 05:36:02.544461 23375 solver.cpp:218] Iteration 24000 (0.999131 iter/s, 100.087s/100 iters), loss = 68.1618 I0718 05:36:02.544657 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:36:02.544685 23375 solver.cpp:237] Train net output #1: softmax_loss = 68.1618 ( 1 = 68.1618 loss) I0718 05:36:02.544705 23375 sgd_solver.cpp:46] MultiStep Status: Iteration 24000, step = 2 I0718 05:36:02.544715 23375 sgd_solver.cpp:105] Iteration 24000, lr = 1e-05 I0718 05:37:42.786944 23375 solver.cpp:218] Iteration 24100 (0.997562 iter/s, 100.244s/100 iters), loss = 59.9943 I0718 05:37:42.787235 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:37:42.787271 23375 solver.cpp:237] Train net output #1: softmax_loss = 59.9943 ( 1 = 59.9943 loss) I0718 05:37:42.787299 23375 sgd_solver.cpp:105] Iteration 24100, lr = 1e-05 I0718 05:39:22.856284 23375 solver.cpp:218] Iteration 24200 (0.999288 iter/s, 100.071s/100 iters), loss = 64.8688 I0718 05:39:22.856539 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:39:22.856591 23375 solver.cpp:237] Train net output #1: softmax_loss = 64.8688 ( 1 = 64.8688 loss) I0718 05:39:22.856626 23375 sgd_solver.cpp:105] Iteration 24200, lr = 1e-05 I0718 05:41:02.379397 23375 solver.cpp:218] Iteration 24300 (1.00477 iter/s, 99.525s/100 iters), loss = 68.2187 I0718 05:41:02.379653 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:41:02.379688 23375 solver.cpp:237] Train net output #1: softmax_loss = 68.2187 ( 1 = 68.2187 loss) I0718 05:41:02.379712 23375 sgd_solver.cpp:105] Iteration 24300, lr = 1e-05 I0718 05:42:41.885717 23375 solver.cpp:218] Iteration 24400 (1.00494 iter/s, 99.5082s/100 iters), loss = 69.9827 I0718 05:42:41.885949 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:42:41.886003 23375 solver.cpp:237] Train net output #1: softmax_loss = 69.9827 ( 1 = 69.9827 loss) I0718 05:42:41.886035 23375 sgd_solver.cpp:105] Iteration 24400, lr = 1e-05 I0718 05:44:21.087523 23375 solver.cpp:218] Iteration 24500 (1.00803 iter/s, 99.2037s/100 iters), loss = 77.629 I0718 05:44:21.087703 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:44:21.087731 23375 solver.cpp:237] Train net output #1: softmax_loss = 77.629 ( 1 = 77.629 loss) I0718 
05:44:21.087754 23375 sgd_solver.cpp:105] Iteration 24500, lr = 1e-05 I0718 05:46:00.381355 23375 solver.cpp:218] Iteration 24600 (1.00709 iter/s, 99.2958s/100 iters), loss = 82.4393 I0718 05:46:00.381531 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:46:00.381548 23375 solver.cpp:237] Train net output #1: softmax_loss = 82.4393 ( 1 = 82.4393 loss) I0718 05:46:00.381563 23375 sgd_solver.cpp:105] Iteration 24600, lr = 1e-05 I0718 05:47:40.185420 23375 solver.cpp:218] Iteration 24700 (1.00194 iter/s, 99.806s/100 iters), loss = 82.2192 I0718 05:47:40.185647 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:47:40.185699 23375 solver.cpp:237] Train net output #1: softmax_loss = 82.2192 ( 1 = 82.2192 loss) I0718 05:47:40.185734 23375 sgd_solver.cpp:105] Iteration 24700, lr = 1e-05 I0718 05:49:19.170284 23375 solver.cpp:218] Iteration 24800 (1.01024 iter/s, 98.9868s/100 iters), loss = 85.4827 I0718 05:49:19.170536 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:49:19.170604 23375 solver.cpp:237] Train net output #1: softmax_loss = 85.4827 ( 1 = 85.4827 loss) I0718 05:49:19.170655 23375 sgd_solver.cpp:105] Iteration 24800, lr = 1e-05 I0718 05:50:58.400024 23375 solver.cpp:218] Iteration 24900 (1.00774 iter/s, 99.2316s/100 iters), loss = 82.8186 I0718 05:50:58.400240 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:50:58.400292 23375 solver.cpp:237] Train net output #1: softmax_loss = 82.8186 ( 1 = 82.8186 loss) I0718 05:50:58.400352 23375 sgd_solver.cpp:105] Iteration 24900, lr = 1e-05 I0718 05:52:37.344985 23375 solver.cpp:218] Iteration 25000 (1.01064 iter/s, 98.9469s/100 iters), loss = 79.2832 I0718 05:52:37.345221 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:52:37.345263 23375 solver.cpp:237] Train net output #1: softmax_loss = 79.2832 ( 1 = 79.2832 loss) I0718 05:52:37.345290 23375 sgd_solver.cpp:105] Iteration 25000, lr = 1e-05 I0718 05:54:16.702543 23375 solver.cpp:218] Iteration 25100 (1.00645 iter/s, 99.3595s/100 iters), loss = 87.3365 I0718 05:54:16.702791 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:54:16.702827 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 05:54:16.702858 23375 sgd_solver.cpp:105] Iteration 25100, lr = 1e-05 I0718 05:55:39.561513 23375 solver.cpp:218] Iteration 25200 (1.20685 iter/s, 82.8605s/100 iters), loss = 87.3365 I0718 05:55:39.561780 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:55:39.561841 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 05:55:39.561874 23375 sgd_solver.cpp:105] Iteration 25200, lr = 1e-05 I0718 05:57:01.939848 23375 solver.cpp:218] Iteration 25300 (1.21389 iter/s, 82.3798s/100 iters), loss = 87.3365 I0718 05:57:01.940047 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:57:01.940065 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 05:57:01.940079 23375 sgd_solver.cpp:105] Iteration 25300, lr = 1e-05 I0718 05:58:24.279306 23375 solver.cpp:218] Iteration 25400 (1.21446 iter/s, 82.341s/100 iters), loss = 87.3365 I0718 05:58:24.279542 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:58:24.279594 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 05:58:24.279628 23375 sgd_solver.cpp:105] Iteration 25400, lr = 1e-05 I0718 05:59:46.595676 23375 solver.cpp:218] Iteration 25500 (1.2148 iter/s, 82.3179s/100 iters), loss = 87.3365 I0718 
05:59:46.595825 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 05:59:46.595840 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 05:59:46.595852 23375 sgd_solver.cpp:105] Iteration 25500, lr = 1e-05 I0718 06:01:08.970805 23375 solver.cpp:218] Iteration 25600 (1.21394 iter/s, 82.3767s/100 iters), loss = 87.3365 I0718 06:01:08.971032 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:01:08.971081 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:01:08.971104 23375 sgd_solver.cpp:105] Iteration 25600, lr = 1e-05 I0718 06:02:31.289988 23375 solver.cpp:218] Iteration 25700 (1.21476 iter/s, 82.3207s/100 iters), loss = 87.3365 I0718 06:02:31.290182 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:02:31.290231 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:02:31.290269 23375 sgd_solver.cpp:105] Iteration 25700, lr = 1e-05 I0718 06:03:53.523237 23375 solver.cpp:218] Iteration 25800 (1.21603 iter/s, 82.2348s/100 iters), loss = 87.3365 I0718 06:03:53.523470 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:03:53.523505 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:03:53.523528 23375 sgd_solver.cpp:105] Iteration 25800, lr = 1e-05 I0718 06:05:15.849563 23375 solver.cpp:218] Iteration 25900 (1.21466 iter/s, 82.3278s/100 iters), loss = 87.3365 I0718 06:05:15.849723 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:05:15.849736 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:05:15.849758 23375 sgd_solver.cpp:105] Iteration 25900, lr = 1e-05 I0718 06:06:38.125905 23375 solver.cpp:218] Iteration 26000 (1.21539 iter/s, 82.2779s/100 iters), loss = 87.3365 I0718 06:06:38.126121 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:06:38.126163 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:06:38.126199 23375 sgd_solver.cpp:105] Iteration 26000, lr = 1e-05 I0718 06:08:00.496372 23375 solver.cpp:218] Iteration 26100 (1.214 iter/s, 82.372s/100 iters), loss = 87.3365 I0718 06:08:00.496546 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:08:00.496568 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:08:00.496579 23375 sgd_solver.cpp:105] Iteration 26100, lr = 1e-05 I0718 06:09:22.831267 23375 solver.cpp:218] Iteration 26200 (1.21453 iter/s, 82.3364s/100 iters), loss = 87.3365 I0718 06:09:22.831477 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:09:22.831518 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:09:22.831547 23375 sgd_solver.cpp:105] Iteration 26200, lr = 1e-05 I0718 06:10:45.098703 23375 solver.cpp:218] Iteration 26300 (1.21553 iter/s, 82.269s/100 iters), loss = 87.3365 I0718 06:10:45.098937 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:10:45.098960 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:10:45.098975 23375 sgd_solver.cpp:105] Iteration 26300, lr = 1e-05 I0718 06:12:07.472759 23375 solver.cpp:218] Iteration 26400 (1.21395 iter/s, 82.3756s/100 iters), loss = 87.3365 I0718 06:12:07.472980 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:12:07.473023 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:12:07.473049 
23375 sgd_solver.cpp:105] Iteration 26400, lr = 1e-05 I0718 06:13:29.755089 23375 solver.cpp:218] Iteration 26500 (1.21531 iter/s, 82.2838s/100 iters), loss = 87.3365 I0718 06:13:29.755290 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:13:29.755340 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:13:29.755373 23375 sgd_solver.cpp:105] Iteration 26500, lr = 1e-05 I0718 06:14:52.034380 23375 solver.cpp:218] Iteration 26600 (1.21535 iter/s, 82.2808s/100 iters), loss = 87.3365 I0718 06:14:52.034575 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:14:52.034623 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:14:52.034652 23375 sgd_solver.cpp:105] Iteration 26600, lr = 1e-05 I0718 06:16:14.429927 23375 solver.cpp:218] Iteration 26700 (1.21363 iter/s, 82.3971s/100 iters), loss = 87.3365 I0718 06:16:14.430156 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:16:14.430191 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:16:14.430219 23375 sgd_solver.cpp:105] Iteration 26700, lr = 1e-05 I0718 06:17:36.700991 23375 solver.cpp:218] Iteration 26800 (1.21547 iter/s, 82.2726s/100 iters), loss = 87.3365 I0718 06:17:36.701217 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:17:36.701267 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:17:36.701297 23375 sgd_solver.cpp:105] Iteration 26800, lr = 1e-05 I0718 06:18:58.985504 23375 solver.cpp:218] Iteration 26900 (1.21527 iter/s, 82.286s/100 iters), loss = 87.3365 I0718 06:18:58.985733 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:18:58.985786 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:18:58.985828 23375 sgd_solver.cpp:105] Iteration 26900, lr = 1e-05 I0718 06:20:21.300926 23375 solver.cpp:218] Iteration 27000 (1.21482 iter/s, 82.3169s/100 iters), loss = 87.3365 I0718 06:20:21.301146 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:20:21.301185 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:20:21.301211 23375 sgd_solver.cpp:105] Iteration 27000, lr = 1e-05 I0718 06:21:43.529839 23375 solver.cpp:218] Iteration 27100 (1.21609 iter/s, 82.2304s/100 iters), loss = 87.3365 I0718 06:21:43.530045 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:21:43.530097 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:21:43.530133 23375 sgd_solver.cpp:105] Iteration 27100, lr = 1e-05 I0718 06:23:05.944376 23375 solver.cpp:218] Iteration 27200 (1.21336 iter/s, 82.4161s/100 iters), loss = 87.3365 I0718 06:23:05.944566 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:23:05.944604 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:23:05.944628 23375 sgd_solver.cpp:105] Iteration 27200, lr = 1e-05 I0718 06:24:28.162952 23375 solver.cpp:218] Iteration 27300 (1.21625 iter/s, 82.2201s/100 iters), loss = 87.3365 I0718 06:24:28.163229 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:24:28.163264 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:24:28.163288 23375 sgd_solver.cpp:105] Iteration 27300, lr = 1e-05 I0718 06:25:50.370781 23375 solver.cpp:218] Iteration 27400 (1.21641 iter/s, 82.2093s/100 iters), loss = 87.3365 I0718 06:25:50.370991 23375 
solver.cpp:237] Train net output #0: lambda = 5 I0718 06:25:50.371028 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:25:50.371053 23375 sgd_solver.cpp:105] Iteration 27400, lr = 1e-05 I0718 06:27:12.708086 23375 solver.cpp:218] Iteration 27500 (1.21449 iter/s, 82.3388s/100 iters), loss = 87.3365 I0718 06:27:12.708300 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:27:12.708359 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:27:12.708389 23375 sgd_solver.cpp:105] Iteration 27500, lr = 1e-05 I0718 06:28:34.971388 23375 solver.cpp:218] Iteration 27600 (1.21559 iter/s, 82.2648s/100 iters), loss = 87.3365 I0718 06:28:34.971621 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:28:34.971657 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:28:34.971693 23375 sgd_solver.cpp:105] Iteration 27600, lr = 1e-05 I0718 06:29:57.238597 23375 solver.cpp:218] Iteration 27700 (1.21553 iter/s, 82.2687s/100 iters), loss = 87.3365 I0718 06:29:57.238765 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:29:57.238781 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:29:57.238793 23375 sgd_solver.cpp:105] Iteration 27700, lr = 1e-05 I0718 06:31:19.551584 23375 solver.cpp:218] Iteration 27800 (1.21485 iter/s, 82.3145s/100 iters), loss = 87.3365 I0718 06:31:19.551820 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:31:19.551858 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:31:19.551884 23375 sgd_solver.cpp:105] Iteration 27800, lr = 1e-05 I0718 06:32:41.778841 23375 solver.cpp:218] Iteration 27900 (1.21612 iter/s, 82.2288s/100 iters), loss = 87.3365 I0718 06:32:41.779072 23375 solver.cpp:237] Train net output #0: lambda = 5 I0718 06:32:41.779124 23375 solver.cpp:237] Train net output #1: softmax_loss = 87.3365 ( 1 = 87.3365 loss) I0718 06:32:41.779165 23375 sgd_solver.cpp:105] Iteration 27900, lr = 1e-05 I0718 06:34:03.251044 23375 solver.cpp:447] Snapshotting to binary proto file result/sphereface_model_iter_28000.caffemodel I0718 06:34:03.783634 23375 sgd_solver.cpp:273] Snapshotting solver state to binary proto file result/sphereface_model_iter_28000.solverstate I0718 06:34:04.175076 23375 solver.cpp:310] Iteration 28000, loss = 87.3365 I0718 06:34:04.175137 23375 solver.cpp:315] Optimization Done. I0718 06:34:04.322540 23375 caffe.cpp:259] Optimization Done.

`

yxchng commented 6 years ago

Same here. Why does the loss get stuck at this particular number, 87.3365, rather than becoming NaN?
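For reference, 87.3365 is consistent with -log(FLT_MIN): Caffe's SoftmaxWithLoss clamps the predicted probability at FLT_MIN (about 1.17549435e-38) before taking the log, so once the network assigns essentially zero probability to the true class the per-sample loss saturates at -log(FLT_MIN) ≈ 87.3365 instead of overflowing to inf or NaN. A quick check in Python (numpy is used only to get the float32 minimum):

```python
import numpy as np

# Smallest positive normalized float32 -- the same constant as C's FLT_MIN,
# which Caffe uses to clamp the softmax probability before taking the log.
flt_min = np.finfo(np.float32).tiny   # ~1.1754944e-38

# Upper bound on the per-sample softmax loss once the true-class
# probability underflows to (effectively) zero.
print(-np.log(flt_min))               # ~87.3365
```

So a flat 87.3365 means the true-class probabilities have collapsed to zero and training has diverged, not that the loss layer itself is broken.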

jimeffry commented 6 years ago

@Laviyy You could try setting the solver's gamma to 0.0002.
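A minimal sketch of that change via pycaffe's protobuf bindings, assuming pycaffe is importable and that the solver definition lives at `code/sphereface_solver.prototxt` (the exact path is an assumption; adjust it to your setup). Editing the `gamma:` line in the solver prototxt by hand works just as well, since under the `multistep` policy gamma is simply the factor applied to the learning rate at each `stepvalue`:

```python
# Hypothetical helper: lower the multistep decay factor in a Caffe solver file.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

solver_path = "code/sphereface_solver.prototxt"  # assumed path -- change as needed

solver = caffe_pb2.SolverParameter()
with open(solver_path) as f:
    text_format.Merge(f.read(), solver)

solver.gamma = 0.0002  # the log above uses the default gamma of 0.1

with open(solver_path, "w") as f:
    f.write(text_format.MessageToString(solver))
```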

jimeffry commented 6 years ago

@Laviyy Another tip is to set the margin 'type' to SINGLE at the start of training.

jangho2001us commented 5 years ago

@Laviyy @yxchng Hi, I encountered the same issue. How did you solve this problem?

yxchng commented 5 years ago

@jangho2001us It basically just means that you have run into the vanishing gradient problem. Tune your learning rate, or change the margin to a smaller one as @jimeffry mentioned.
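For anyone else debugging this, here is a small, hypothetical log-scanning sketch for spotting the divergence before the loss pins at the saturation value. It assumes the standard Caffe log format shown above (lines containing `softmax_loss = ...`); the 40.0 warning threshold is an arbitrary choice:

```python
import re
import sys

# Saturation value of Caffe's softmax loss for 32-bit floats: -log(FLT_MIN).
SATURATED_LOSS = 87.3365

# Matches e.g. "Train net output #1: softmax_loss = 13.2387 ( 1 = 13.2387 loss)"
LOSS_RE = re.compile(r"softmax_loss = ([0-9.]+)")

def check_log(path, warn_at=40.0):
    """Scan a Caffe training log and report where the loss starts to diverge."""
    with open(path) as f:
        for line_no, line in enumerate(f, 1):
            m = LOSS_RE.search(line)
            if not m:
                continue
            loss = float(m.group(1))
            if loss >= SATURATED_LOSS - 1e-3:
                print(f"line {line_no}: loss saturated at {loss}; training has diverged")
                return
            if loss >= warn_at:
                print(f"line {line_no}: loss {loss} is climbing; consider a lower lr or a smaller margin")

if __name__ == "__main__":
    check_log(sys.argv[1])
```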