daerduoCarey / SpatialTransformerLayer

Add the STN after the data layer. #13

Open Usernamezhx opened 7 years ago

Usernamezhx commented 7 years ago

Hi daerduoCarey. First of all, thanks for sharing your code. I added the STN layer right after the data layer:

name: "GoogleNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "/data04/data/img_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    mirror: false
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "/data04/data/img_test_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

++++++++++++++++++++++++++++++++++++++++++++++++++++++

layer {
  name: "loc_conv1"
  type: "Convolution"
  bottom: "data"
  top: "loc_conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "loc_pool1"
  type: "Pooling"
  bottom: "loc_conv1"
  top: "loc_pool1"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "loc_relu1"
  type: "ReLU"
  bottom: "loc_pool1"
  top: "loc_pool1"
}
layer {
  name: "loc_conv2"
  type: "Convolution"
  bottom: "loc_pool1"
  top: "loc_conv2"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "loc_pool2"
  type: "Pooling"
  bottom: "loc_conv2"
  top: "loc_pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "loc_relu2"
  type: "ReLU"
  bottom: "loc_pool2"
  top: "loc_pool2"
}
layer {
  name: "loc_ip1"
  type: "InnerProduct"
  bottom: "loc_pool2"
  top: "loc_ip1"
  inner_product_param {
    num_output: 20
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "loc_relu3"
  type: "ReLU"
  bottom: "loc_ip1"
  top: "loc_ip1"
}
layer {
  name: "loc_reg"
  type: "InnerProduct"
  bottom: "loc_ip1"
  top: "theta"
  inner_product_param {
    num_output: 6
    weight_filler { type: "constant" value: 0 }
    bias_filler { type: "xavier" }
  }
}
layer {
  name: "st_layer"
  type: "SpatialTransformer"
  bottom: "data"
  bottom: "theta"
  top: "st_output"
}

++++++++++++++++++++++++++++++++++++++++++++++++++++++

But after training for about 50,000 iterations, the loss and the accuracy are nearly constant:

caffe.txt

I0708 21:54:56.021843 99468 solver.cpp:330] Iteration 48000, Testing net (#0) I0708 22:04:38.987025 99629 data_layer.cpp:73] Restarting data prefetching from start. I0708 22:05:23.801887 99468 solver.cpp:397] Test net output #0: loss1/loss1 = 2.07656 ( 0.3 = 0.622968 loss) I0708 22:05:23.802105 99468 solver.cpp:397] Test net output #1: loss1/top-1 = 0.367581 I0708 22:05:23.802126 99468 solver.cpp:397] Test net output #2: loss1/top-5 = 0.794675 I0708 22:05:23.802213 99468 solver.cpp:397] Test net output #3: loss2/loss2 = 2.07646 ( 0.3 = 0.622937 loss) I0708 22:05:23.802230 99468 solver.cpp:397] Test net output #4: loss2/top-1 = 0.367581 I0708 22:05:23.802247 99468 solver.cpp:397] Test net output #5: loss2/top-5 = 0.794675 I0708 22:05:23.802261 99468 solver.cpp:397] Test net output #6: loss3/loss3 = 2.07925 ( 1 = 2.07925 loss) I0708 22:05:23.802273 99468 solver.cpp:397] Test net output #7: loss3/top-1 = 0.367581 I0708 22:05:23.802286 99468 solver.cpp:397] Test net output #8: loss3/top-5 = 0.794675 I0708 22:05:25.693035 99468 solver.cpp:218] Iteration 48000 (0.0568391 iter/s, 703.74s/40 iters), loss = 3.72934 I0708 22:05:25.693156 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.5964 ( 0.3 = 0.77892 loss) I0708 22:05:25.693183 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.59039 ( 0.3 = 0.777116 loss) I0708 22:05:25.693207 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.59106 ( 1 = 2.59106 loss) I0708 22:05:25.693228 99468 sgd_solver.cpp:105] Iteration 48000, lr = 0.001 I0708 22:06:42.168045 99468 solver.cpp:218] Iteration 48040 (0.523065 iter/s, 76.4724s/40 iters), loss = 3.688 I0708 22:06:42.168287 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.55909 ( 0.3 = 0.767729 loss) I0708 22:06:42.168314 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.55477 ( 0.3 = 0.766431 loss) I0708 22:06:42.168330 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.55073 ( 1 = 2.55073 loss) I0708 22:06:42.168345 99468 
sgd_solver.cpp:105] Iteration 48040, lr = 0.001 I0708 22:07:58.747380 99468 solver.cpp:218] Iteration 48080 (0.522353 iter/s, 76.5766s/40 iters), loss = 3.68185 I0708 22:07:58.747668 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.45008 ( 0.3 = 0.735024 loss) I0708 22:07:58.747687 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.45557 ( 0.3 = 0.736671 loss) I0708 22:07:58.747704 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.45918 ( 1 = 2.45918 loss) I0708 22:07:58.747721 99468 sgd_solver.cpp:105] Iteration 48080, lr = 0.001 I0708 22:09:15.278134 99468 solver.cpp:218] Iteration 48120 (0.522685 iter/s, 76.5279s/40 iters), loss = 3.68409 I0708 22:09:15.278367 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.40237 ( 0.3 = 0.720711 loss) I0708 22:09:15.278431 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.40456 ( 0.3 = 0.721368 loss) I0708 22:09:15.278445 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.40145 ( 1 = 2.40145 loss) I0708 22:09:15.278481 99468 sgd_solver.cpp:105] Iteration 48120, lr = 0.001 I0708 22:10:31.786173 99468 solver.cpp:218] Iteration 48160 (0.52284 iter/s, 76.5052s/40 iters), loss = 3.66804 I0708 22:10:31.786469 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.42295 ( 0.3 = 0.726885 loss) I0708 22:10:31.786494 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.42938 ( 0.3 = 0.728814 loss) I0708 22:10:31.786510 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.41936 ( 1 = 2.41936 loss) I0708 22:10:31.786527 99468 sgd_solver.cpp:105] Iteration 48160, lr = 0.001 I0708 22:11:48.292574 99468 solver.cpp:218] Iteration 48200 (0.522851 iter/s, 76.5036s/40 iters), loss = 3.70242 I0708 22:11:48.292825 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.08664 ( 0.3 = 0.625991 loss) I0708 22:11:48.292891 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.09284 ( 0.3 = 0.627853 loss) I0708 22:11:48.292924 99468 solver.cpp:237] Train net output #2: 
loss3/loss3 = 2.09257 ( 1 = 2.09257 loss) I0708 22:11:48.292946 99468 sgd_solver.cpp:105] Iteration 48200, lr = 0.001 I0708 22:13:04.789505 99468 solver.cpp:218] Iteration 48240 (0.522916 iter/s, 76.4941s/40 iters), loss = 3.73753 I0708 22:13:04.789747 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.23239 ( 0.3 = 0.669717 loss) I0708 22:13:04.789772 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.22779 ( 0.3 = 0.668337 loss) I0708 22:13:04.789788 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.22377 ( 1 = 2.22377 loss) I0708 22:13:04.789841 99468 sgd_solver.cpp:105] Iteration 48240, lr = 0.001 I0708 22:14:21.402251 99468 solver.cpp:218] Iteration 48280 (0.522125 iter/s, 76.61s/40 iters), loss = 3.66022 I0708 22:14:21.402489 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.02189 ( 0.3 = 0.606567 loss) I0708 22:14:21.402513 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.03129 ( 0.3 = 0.609388 loss) I0708 22:14:21.402529 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.02827 ( 1 = 2.02827 loss) I0708 22:14:21.402590 99468 sgd_solver.cpp:105] Iteration 48280, lr = 0.001 I0708 22:15:37.786306 99468 solver.cpp:218] Iteration 48320 (0.523688 iter/s, 76.3813s/40 iters), loss = 3.73213 I0708 22:15:37.786543 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 1.94553 ( 0.3 = 0.583658 loss) I0708 22:15:37.786576 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 1.94268 ( 0.3 = 0.582803 loss) I0708 22:15:37.786592 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 1.94131 ( 1 = 1.94131 loss) I0708 22:15:37.786607 99468 sgd_solver.cpp:105] Iteration 48320, lr = 0.001 I0708 22:16:54.439471 99468 solver.cpp:218] Iteration 48360 (0.52185 iter/s, 76.6504s/40 iters), loss = 3.73605 I0708 22:16:54.439713 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.6708 ( 0.3 = 0.801241 loss) I0708 22:16:54.439738 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.68361 ( 0.3 = 0.805082 loss) 
I0708 22:16:54.439795 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.67145 ( 1 = 2.67145 loss) I0708 22:16:54.439812 99468 sgd_solver.cpp:105] Iteration 48360, lr = 0.001 I0708 22:18:10.929617 99468 solver.cpp:218] Iteration 48400 (0.522962 iter/s, 76.4874s/40 iters), loss = 3.72857 I0708 22:18:10.929879 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.22945 ( 0.3 = 0.668834 loss) I0708 22:18:10.929904 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.22525 ( 0.3 = 0.667575 loss) I0708 22:18:10.929922 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.23505 ( 1 = 2.23505 loss) I0708 22:18:10.929936 99468 sgd_solver.cpp:105] Iteration 48400, lr = 0.001 I0708 22:19:27.477816 99468 solver.cpp:218] Iteration 48440 (0.522565 iter/s, 76.5454s/40 iters), loss = 3.74548 I0708 22:19:27.478086 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.23656 ( 0.3 = 0.670967 loss) I0708 22:19:27.478109 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.2374 ( 0.3 = 0.671219 loss) I0708 22:19:27.478124 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.23943 ( 1 = 2.23943 loss) I0708 22:19:27.478139 99468 sgd_solver.cpp:105] Iteration 48440, lr = 0.001 I0708 22:20:44.099797 99468 solver.cpp:218] Iteration 48480 (0.522062 iter/s, 76.6192s/40 iters), loss = 3.66581 I0708 22:20:44.100042 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.40721 ( 0.3 = 0.722163 loss) I0708 22:20:44.100064 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.40097 ( 0.3 = 0.72029 loss) I0708 22:20:44.100078 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.40448 ( 1 = 2.40448 loss) I0708 22:20:44.100098 99468 sgd_solver.cpp:105] Iteration 48480, lr = 0.001 I0708 22:22:00.522353 99468 solver.cpp:218] Iteration 48520 (0.523425 iter/s, 76.4198s/40 iters), loss = 3.76453 I0708 22:22:00.522599 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.68609 ( 0.3 = 0.805827 loss) I0708 22:22:00.522627 99468 solver.cpp:237] 
Train net output #1: loss2/loss2 = 2.67747 ( 0.3 = 0.803241 loss) I0708 22:22:00.522644 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.68157 ( 1 = 2.68157 loss) I0708 22:22:00.522661 99468 sgd_solver.cpp:105] Iteration 48520, lr = 0.001 I0708 22:23:17.130800 99468 solver.cpp:218] Iteration 48560 (0.522155 iter/s, 76.6057s/40 iters), loss = 3.69656 I0708 22:23:17.131028 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 1.98202 ( 0.3 = 0.594605 loss) I0708 22:23:17.131055 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 1.97319 ( 0.3 = 0.591958 loss) I0708 22:23:17.131109 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 1.96977 ( 1 = 1.96977 loss) I0708 22:23:17.131126 99468 sgd_solver.cpp:105] Iteration 48560, lr = 0.001 I0708 22:24:33.725183 99468 solver.cpp:218] Iteration 48600 (0.52225 iter/s, 76.5916s/40 iters), loss = 3.70043 I0708 22:24:33.725426 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.20517 ( 0.3 = 0.66155 loss) I0708 22:24:33.725450 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.21122 ( 0.3 = 0.663367 loss) I0708 22:24:33.725467 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.21146 ( 1 = 2.21146 loss) I0708 22:24:33.725486 99468 sgd_solver.cpp:105] Iteration 48600, lr = 0.001 I0708 22:25:50.090847 99468 solver.cpp:218] Iteration 48640 (0.523815 iter/s, 76.3629s/40 iters), loss = 3.67684 I0708 22:25:50.091091 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.3432 ( 0.3 = 0.702961 loss) I0708 22:25:50.091152 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.34685 ( 0.3 = 0.704056 loss) I0708 22:25:50.091166 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.34175 ( 1 = 2.34175 loss) I0708 22:25:50.091182 99468 sgd_solver.cpp:105] Iteration 48640, lr = 0.001 I0708 22:27:06.678912 99468 solver.cpp:218] Iteration 48680 (0.522293 iter/s, 76.5853s/40 iters), loss = 3.65795 I0708 22:27:06.679167 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.22877 ( 0.3 
= 0.668631 loss) I0708 22:27:06.679220 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.21793 ( 0.3 = 0.665378 loss) I0708 22:27:06.679235 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.22912 ( 1 = 2.22912 loss) I0708 22:27:06.679253 99468 sgd_solver.cpp:105] Iteration 48680, lr = 0.001 I0708 22:28:23.010001 99468 solver.cpp:218] Iteration 48720 (0.524052 iter/s, 76.3283s/40 iters), loss = 3.72518 I0708 22:28:23.010248 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.09107 ( 0.3 = 0.627322 loss) I0708 22:28:23.010272 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.08621 ( 0.3 = 0.625864 loss) I0708 22:28:23.010321 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.0809 ( 1 = 2.0809 loss) I0708 22:28:23.010339 99468 sgd_solver.cpp:105] Iteration 48720, lr = 0.001 I0708 22:29:39.331192 99468 solver.cpp:218] Iteration 48760 (0.52412 iter/s, 76.3184s/40 iters), loss = 3.68973 I0708 22:29:39.331473 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.2227 ( 0.3 = 0.66681 loss) I0708 22:29:39.331499 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.24941 ( 0.3 = 0.674823 loss) I0708 22:29:39.331519 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.24144 ( 1 = 2.24144 loss) I0708 22:29:39.331534 99468 sgd_solver.cpp:105] Iteration 48760, lr = 0.001 I0708 22:30:55.831094 99468 solver.cpp:218] Iteration 48800 (0.522896 iter/s, 76.4971s/40 iters), loss = 3.67324 I0708 22:30:55.831348 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.21342 ( 0.3 = 0.664027 loss) I0708 22:30:55.831378 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.21621 ( 0.3 = 0.664862 loss) I0708 22:30:55.831395 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.21636 ( 1 = 2.21636 loss) I0708 22:30:55.831419 99468 sgd_solver.cpp:105] Iteration 48800, lr = 0.001 I0708 22:32:12.290940 99468 solver.cpp:218] Iteration 48840 (0.523169 iter/s, 76.4571s/40 iters), loss = 3.72509 I0708 22:32:12.291190 99468 
solver.cpp:237] Train net output #0: loss1/loss1 = 2.02691 ( 0.3 = 0.608074 loss) I0708 22:32:12.291224 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.01526 ( 0.3 = 0.604577 loss) I0708 22:32:12.291283 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.0134 ( 1 = 2.0134 loss) I0708 22:32:12.291302 99468 sgd_solver.cpp:105] Iteration 48840, lr = 0.001 I0708 22:33:28.908922 99468 solver.cpp:218] Iteration 48880 (0.52209 iter/s, 76.6152s/40 iters), loss = 3.6966 I0708 22:33:28.909173 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.07398 ( 0.3 = 0.622195 loss) I0708 22:33:28.909195 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.06225 ( 0.3 = 0.618675 loss) I0708 22:33:28.909211 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.08809 ( 1 = 2.08809 loss) I0708 22:33:28.909232 99468 sgd_solver.cpp:105] Iteration 48880, lr = 0.001 I0708 22:34:45.497153 99468 solver.cpp:218] Iteration 48920 (0.522292 iter/s, 76.5855s/40 iters), loss = 3.66147 I0708 22:34:45.497378 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.10266 ( 0.3 = 0.630798 loss) I0708 22:34:45.497402 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.12432 ( 0.3 = 0.637295 loss) I0708 22:34:45.497417 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.10935 ( 1 = 2.10935 loss) I0708 22:34:45.497438 99468 sgd_solver.cpp:105] Iteration 48920, lr = 0.001 I0708 22:36:01.835325 99468 solver.cpp:218] Iteration 48960 (0.524003 iter/s, 76.3354s/40 iters), loss = 3.71744 I0708 22:36:01.835561 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.2948 ( 0.3 = 0.688441 loss) I0708 22:36:01.835587 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.27955 ( 0.3 = 0.683866 loss) I0708 22:36:01.835603 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.27497 ( 1 = 2.27497 loss) I0708 22:36:01.835618 99468 sgd_solver.cpp:105] Iteration 48960, lr = 0.001 I0708 22:37:18.167001 99468 solver.cpp:218] Iteration 49000 (0.524048 iter/s, 
76.3289s/40 iters), loss = 3.70097 I0708 22:37:18.167237 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.43151 ( 0.3 = 0.729453 loss) I0708 22:37:18.167266 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.43268 ( 0.3 = 0.729805 loss) I0708 22:37:18.167320 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.43458 ( 1 = 2.43458 loss) I0708 22:37:18.167341 99468 sgd_solver.cpp:105] Iteration 49000, lr = 0.001 I0708 22:38:34.558938 99468 solver.cpp:218] Iteration 49040 (0.523634 iter/s, 76.3892s/40 iters), loss = 3.74337 I0708 22:38:34.559170 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.30453 ( 0.3 = 0.69136 loss) I0708 22:38:34.559198 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.3018 ( 0.3 = 0.690539 loss) I0708 22:38:34.559252 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.30663 ( 1 = 2.30663 loss) I0708 22:38:34.559268 99468 sgd_solver.cpp:105] Iteration 49040, lr = 0.001 I0708 22:39:50.891646 99468 solver.cpp:218] Iteration 49080 (0.524041 iter/s, 76.33s/40 iters), loss = 3.65079 I0708 22:39:50.891922 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.33304 ( 0.3 = 0.699912 loss) I0708 22:39:50.891955 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.3247 ( 0.3 = 0.697409 loss) I0708 22:39:50.891973 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.32369 ( 1 = 2.32369 loss) I0708 22:39:50.891994 99468 sgd_solver.cpp:105] Iteration 49080, lr = 0.001 I0708 22:41:07.209127 99468 solver.cpp:218] Iteration 49120 (0.524146 iter/s, 76.3147s/40 iters), loss = 3.68461 I0708 22:41:07.209374 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.76688 ( 0.3 = 0.830063 loss) I0708 22:41:07.209403 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.76896 ( 0.3 = 0.830687 loss) I0708 22:41:07.209420 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.77983 ( 1 = 2.77983 loss) I0708 22:41:07.209439 99468 sgd_solver.cpp:105] Iteration 49120, lr = 0.001 I0708 
22:42:23.569250 99468 solver.cpp:218] Iteration 49160 (0.523853 iter/s, 76.3574s/40 iters), loss = 3.7469 I0708 22:42:23.569499 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.6264 ( 0.3 = 0.787921 loss) I0708 22:42:23.569567 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.62424 ( 0.3 = 0.787272 loss) I0708 22:42:23.569584 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.6222 ( 1 = 2.6222 loss) I0708 22:42:23.569602 99468 sgd_solver.cpp:105] Iteration 49160, lr = 0.001 I0708 22:43:40.181097 99468 solver.cpp:218] Iteration 49200 (0.522131 iter/s, 76.6091s/40 iters), loss = 3.69516 I0708 22:43:40.181329 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.27227 ( 0.3 = 0.68168 loss) I0708 22:43:40.181352 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.28629 ( 0.3 = 0.685887 loss) I0708 22:43:40.181367 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.2701 ( 1 = 2.2701 loss) I0708 22:43:40.181385 99468 sgd_solver.cpp:105] Iteration 49200, lr = 0.001 I0708 22:44:56.842627 99468 solver.cpp:218] Iteration 49240 (0.521793 iter/s, 76.6588s/40 iters), loss = 3.71517 I0708 22:44:56.842871 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.30889 ( 0.3 = 0.692667 loss) I0708 22:44:56.842898 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.30581 ( 0.3 = 0.691742 loss) I0708 22:44:56.842916 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.29975 ( 1 = 2.29975 loss) I0708 22:44:56.842932 99468 sgd_solver.cpp:105] Iteration 49240, lr = 0.001 I0708 22:46:13.454780 99468 solver.cpp:218] Iteration 49280 (0.522129 iter/s, 76.6094s/40 iters), loss = 3.70305 I0708 22:46:13.455013 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.4077 ( 0.3 = 0.722311 loss) I0708 22:46:13.455044 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.42333 ( 0.3 = 0.726998 loss) I0708 22:46:13.455063 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.42115 ( 1 = 2.42115 loss) I0708 22:46:13.455080 
99468 sgd_solver.cpp:105] Iteration 49280, lr = 0.001 I0708 22:47:30.085047 99468 solver.cpp:218] Iteration 49320 (0.522006 iter/s, 76.6275s/40 iters), loss = 3.67179 I0708 22:47:30.085289 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.44605 ( 0.3 = 0.733816 loss) I0708 22:47:30.085316 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.43759 ( 0.3 = 0.731277 loss) I0708 22:47:30.085372 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.4321 ( 1 = 2.4321 loss) I0708 22:47:30.085391 99468 sgd_solver.cpp:105] Iteration 49320, lr = 0.001 I0708 22:48:46.651422 99468 solver.cpp:218] Iteration 49360 (0.522441 iter/s, 76.5636s/40 iters), loss = 3.71437 I0708 22:48:46.651736 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.10309 ( 0.3 = 0.630926 loss) I0708 22:48:46.651793 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.09531 ( 0.3 = 0.628592 loss) I0708 22:48:46.651808 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.10261 ( 1 = 2.10261 loss) I0708 22:48:46.651825 99468 sgd_solver.cpp:105] Iteration 49360, lr = 0.001 I0708 22:50:03.277048 99468 solver.cpp:218] Iteration 49400 (0.522121 iter/s, 76.6105s/40 iters), loss = 3.71008 I0708 22:50:03.277305 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.03409 ( 0.3 = 0.610228 loss) I0708 22:50:03.277333 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.04032 ( 0.3 = 0.612096 loss) I0708 22:50:03.277349 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.03172 ( 1 = 2.03172 loss) I0708 22:50:03.277400 99468 sgd_solver.cpp:105] Iteration 49400, lr = 0.001 I0708 22:51:19.879340 99468 solver.cpp:218] Iteration 49440 (0.522196 iter/s, 76.5995s/40 iters), loss = 3.74579 I0708 22:51:19.879617 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.58194 ( 0.3 = 0.774582 loss) I0708 22:51:19.879662 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.5827 ( 0.3 = 0.774811 loss) I0708 22:51:19.879678 99468 solver.cpp:237] Train net output #2: 
loss3/loss3 = 2.58273 ( 1 = 2.58273 loss) I0708 22:51:19.879714 99468 sgd_solver.cpp:105] Iteration 49440, lr = 0.001 I0708 22:52:36.484484 99468 solver.cpp:218] Iteration 49480 (0.522177 iter/s, 76.6023s/40 iters), loss = 3.7025 I0708 22:52:36.484724 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.17316 ( 0.3 = 0.651948 loss) I0708 22:52:36.484776 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.17901 ( 0.3 = 0.653704 loss) I0708 22:52:36.484791 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.17448 ( 1 = 2.17448 loss) I0708 22:52:36.484810 99468 sgd_solver.cpp:105] Iteration 49480, lr = 0.001 I0708 22:53:53.104547 99468 solver.cpp:218] Iteration 49520 (0.522075 iter/s, 76.6173s/40 iters), loss = 3.71452 I0708 22:53:53.105298 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.47554 ( 0.3 = 0.742663 loss) I0708 22:53:53.105357 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.49782 ( 0.3 = 0.749345 loss) I0708 22:53:53.105373 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.48883 ( 1 = 2.48883 loss) I0708 22:53:53.105389 99468 sgd_solver.cpp:105] Iteration 49520, lr = 0.001 I0708 22:55:09.700757 99468 solver.cpp:218] Iteration 49560 (0.522241 iter/s, 76.5929s/40 iters), loss = 3.69482 I0708 22:55:09.700987 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.17838 ( 0.3 = 0.653513 loss) I0708 22:55:09.701007 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.18183 ( 0.3 = 0.654548 loss) I0708 22:55:09.701022 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.18198 ( 1 = 2.18198 loss) I0708 22:55:09.701076 99468 sgd_solver.cpp:105] Iteration 49560, lr = 0.001 I0708 22:56:26.341532 99468 solver.cpp:218] Iteration 49600 (0.521934 iter/s, 76.638s/40 iters), loss = 3.77103 I0708 22:56:26.341773 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.3831 ( 0.3 = 0.71493 loss) I0708 22:56:26.341799 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.3863 ( 0.3 = 0.715891 loss) I0708 
22:56:26.341848 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.37855 ( 1 = 2.37855 loss) I0708 22:56:26.341866 99468 sgd_solver.cpp:105] Iteration 49600, lr = 0.001 I0708 22:57:42.854746 99468 solver.cpp:218] Iteration 49640 (0.522804 iter/s, 76.5104s/40 iters), loss = 3.70114 I0708 22:57:42.854979 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.40168 ( 0.3 = 0.720505 loss) I0708 22:57:42.855006 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.39229 ( 0.3 = 0.717686 loss) I0708 22:57:42.855057 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.38557 ( 1 = 2.38557 loss) I0708 22:57:42.855073 99468 sgd_solver.cpp:105] Iteration 49640, lr = 0.001 I0708 22:58:59.254925 99468 solver.cpp:218] Iteration 49680 (0.523578 iter/s, 76.3974s/40 iters), loss = 3.71079 I0708 22:58:59.255218 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.21058 ( 0.3 = 0.663174 loss) I0708 22:58:59.255275 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.20215 ( 0.3 = 0.660644 loss) I0708 22:58:59.255288 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.20635 ( 1 = 2.20635 loss) I0708 22:58:59.255307 99468 sgd_solver.cpp:105] Iteration 49680, lr = 0.001 I0708 23:00:15.853015 99468 solver.cpp:218] Iteration 49720 (0.522225 iter/s, 76.5953s/40 iters), loss = 3.71502 I0708 23:00:15.853281 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.07621 ( 0.3 = 0.622863 loss) I0708 23:00:15.853307 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.07954 ( 0.3 = 0.623863 loss) I0708 23:00:15.853322 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.07036 ( 1 = 2.07036 loss) I0708 23:00:15.853374 99468 sgd_solver.cpp:105] Iteration 49720, lr = 0.001 I0708 23:01:32.466379 99468 solver.cpp:218] Iteration 49760 (0.522121 iter/s, 76.6106s/40 iters), loss = 3.65164 I0708 23:01:32.466627 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.39696 ( 0.3 = 0.719087 loss) I0708 23:01:32.466687 99468 solver.cpp:237] Train 
net output #1: loss2/loss2 = 2.39385 ( 0.3 = 0.718154 loss) I0708 23:01:32.466703 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.39283 ( 1 = 2.39283 loss) I0708 23:01:32.466722 99468 sgd_solver.cpp:105] Iteration 49760, lr = 0.001 I0708 23:02:48.986902 99468 solver.cpp:218] Iteration 49800 (0.522754 iter/s, 76.5178s/40 iters), loss = 3.66991 I0708 23:02:48.987139 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.75705 ( 0.3 = 0.827115 loss) I0708 23:02:48.987197 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.77391 ( 0.3 = 0.832172 loss) I0708 23:02:48.987213 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.75764 ( 1 = 2.75764 loss) I0708 23:02:48.987232 99468 sgd_solver.cpp:105] Iteration 49800, lr = 0.001 I0708 23:04:05.664381 99468 solver.cpp:218] Iteration 49840 (0.521685 iter/s, 76.6747s/40 iters), loss = 3.68761 I0708 23:04:05.664650 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.27643 ( 0.3 = 0.682929 loss) I0708 23:04:05.664675 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.26933 ( 0.3 = 0.6808 loss) I0708 23:04:05.664691 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.28729 ( 1 = 2.28729 loss) I0708 23:04:05.664744 99468 sgd_solver.cpp:105] Iteration 49840, lr = 0.001 I0708 23:05:22.255897 99468 solver.cpp:218] Iteration 49880 (0.52227 iter/s, 76.5887s/40 iters), loss = 3.70475 I0708 23:05:22.256137 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.23909 ( 0.3 = 0.671728 loss) I0708 23:05:22.256193 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.23169 ( 0.3 = 0.669507 loss) I0708 23:05:22.256209 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.22051 ( 1 = 2.22051 loss) I0708 23:05:22.256227 99468 sgd_solver.cpp:105] Iteration 49880, lr = 0.001 I0708 23:06:38.901861 99468 solver.cpp:218] Iteration 49920 (0.521899 iter/s, 76.6432s/40 iters), loss = 3.74522 I0708 23:06:38.902092 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.34477 ( 0.3 = 
0.70343 loss) I0708 23:06:38.902117 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.3378 ( 0.3 = 0.70134 loss) I0708 23:06:38.902164 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.34873 ( 1 = 2.34873 loss) I0708 23:06:38.902181 99468 sgd_solver.cpp:105] Iteration 49920, lr = 0.001 I0708 23:07:55.430269 99468 solver.cpp:218] Iteration 49960 (0.5227 iter/s, 76.5257s/40 iters), loss = 3.69629 I0708 23:07:55.430505 99468 solver.cpp:237] Train net output #0: loss1/loss1 = 2.42696 ( 0.3 = 0.728088 loss) I0708 23:07:55.430529 99468 solver.cpp:237] Train net output #1: loss2/loss2 = 2.43421 ( 0.3 = 0.730264 loss) I0708 23:07:55.430546 99468 solver.cpp:237] Train net output #2: loss3/loss3 = 2.43831 (* 1 = 2.43831 loss) I0708 23:07:55.430569 99468 sgd_solver.cpp:105] Iteration 49960, lr = 0.001

daerduoCarey commented 7 years ago

Hi @Usernamezhx,

I think your theta is not set up correctly. You should not initialize the weights of that layer to all zeros. You should also set the bias to constant values so that the initial thetas form the identity matrix. You can always visualize the output of the ST layer to check what the transformation actually produces.

Bests, Kaichun
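As a rough way to check a theta before looking at images, the sketch below measures how far a given theta moves the sampling grid away from the identity. It is a minimal pure-Python illustration, not code from this repository: the helper names `sampling_grid` and `max_displacement`, the row-major layout of the six theta values, and the [-1, 1] coordinate normalization are all assumptions, and the non-identity theta is made up.

```python
import math

# Row-major 2x3 affine parameters (assumed layout); this is the identity.
IDENTITY_THETA = [1.0, 0.0, 0.0,
                  0.0, 1.0, 0.0]


def sampling_grid(theta, height, width):
    """Return the source coordinate sampled for every output pixel.

    Coordinates are normalized to [-1, 1]. The axis order and the
    normalization are assumptions and may differ from the layer's own.
    """
    t11, t12, t13, t21, t22, t23 = theta
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)
            grid.append((t11 * x + t12 * y + t13,
                         t21 * x + t22 * y + t23))
    return grid


def max_displacement(theta, height=5, width=5):
    """Largest distance between where a pixel samples from and where it sits."""
    identity = sampling_grid(IDENTITY_THETA, height, width)
    actual = sampling_grid(theta, height, width)
    return max(math.hypot(ax - ix, ay - iy)
               for (ax, ay), (ix, iy) in zip(actual, identity))


if __name__ == "__main__":
    # Arbitrary values standing in for a randomly initialized bias.
    made_up_theta = [0.3, -0.8, 0.5, 0.9, 0.1, -0.4]
    print("identity theta:", max_displacement(IDENTITY_THETA))
    print("made-up theta: ", max_displacement(made_up_theta))
```

A displacement of zero means `st_output` would start out as a copy of `data`; a large value means the rest of the network initially sees a strongly warped or cropped image.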

twmht commented 6 years ago

@daerduoCarey

Why do we need to set the initial thetas to the identity matrix? What is the mathematical intuition behind that?
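For reference, a sketch of the usual reasoning, based on the affine sampling equation from the spatial transformer paper rather than on anything stated in this thread. Each output (target) coordinate $(x^t, y^t)$ is sampled from the input (source) coordinate $(x^s, y^s)$ given by

$$
\begin{pmatrix} x^s \\ y^s \end{pmatrix}
=
\begin{pmatrix}
\theta_{11} & \theta_{12} & \theta_{13} \\
\theta_{21} & \theta_{22} & \theta_{23}
\end{pmatrix}
\begin{pmatrix} x^t \\ y^t \\ 1 \end{pmatrix}.
$$

With $\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ this reduces to $x^s = x^t$ and $y^s = y^t$, so the layer initially passes its input through unchanged and the downstream network sees ordinary images. With an arbitrary starting $\theta$, the sampled region can be heavily scaled, sheared, or shifted outside the image, which presumably leaves the classifier little useful signal to learn from early on.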