christopher5106 / last_caffe_with_stn

Spatial Transformer Networks

The MNIST STN test loss does not decrease #3

Closed — hawklucky closed this issue 7 years ago

hawklucky commented 8 years ago

I updated Caffe and also cloned your version, but in both cases the MNIST loss does not decrease. I created the cluttered MNIST data following your blog tutorial:

git clone https://github.com/christopher5106/mnist-cluttered
cd mnist-cluttered
luajit download_mnist.lua
mkdir -p {0..9}
luajit save_to_file.lua
for i in {0..9}; do for p in /home/ubuntu/mnist-cluttered/$i/*; do echo $p $i >> mnist.txt; done ; done
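Note that the steps above produce only a single mnist.txt list ("path label" per line, the format Caffe's ImageData layer reads); there is no held-out test list. One way to carve one out is to shuffle the list and split it. A minimal sketch in Python (the helper name, split ratio, and toy paths below are my own, not from the tutorial):

```python
import random

def split_list(lines, test_fraction=0.1, seed=0):
    """Shuffle a Caffe image-list ("path label" per line) and split it
    into a training list and a held-out test list."""
    rng = random.Random(seed)
    shuffled = lines[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy entries standing in for the real mnist.txt contents.
lines = ["/data/%d/img_%d.png %d" % (d, i, d)
         for d in range(10) for i in range(5)]
train, test = split_list(lines)
print(len(train), len(test))  # -> 45 5
```

The two resulting lists can then be written out and referenced by separate ImageData layers in the train and test phases.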

The training output is:

I0818 22:31:31.895613 2998 net.cpp:693] Ignoring source layer accuracy-train
I0818 22:31:31.895637 2998 blocking_queue.cpp:50] Data layer prefetch queue empty
I0818 22:31:50.731997 2998 solver.cpp:404] Test net output #0: accuracy = 0.092
I0818 22:31:50.732033 2998 solver.cpp:404] Test net output #1: loss_cls = 2.30615 (* 1 = 2.30615 loss)
I0818 22:31:50.735200 2998 solver.cpp:228] Iteration 0, loss = 2.28729
I0818 22:31:50.735222 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:31:50.735234 2998 solver.cpp:244] Train net output #1: loss_cls = 2.28729 (* 1 = 2.28729 loss)
I0818 22:31:50.735247 2998 sgd_solver.cpp:106] Iteration 0, lr = 0.01
I0818 22:32:16.274798 2998 solver.cpp:228] Iteration 100, loss = 2.31437
I0818 22:32:16.274925 2998 solver.cpp:244] Train net output #0: accuracy = 0
I0818 22:32:16.274940 2998 solver.cpp:244] Train net output #1: loss_cls = 2.31437 (* 1 = 2.31437 loss)
I0818 22:32:16.274947 2998 sgd_solver.cpp:106] Iteration 100, lr = 0.01
I0818 22:32:48.199566 2998 solver.cpp:228] Iteration 200, loss = 2.30534
I0818 22:32:48.199647 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:32:48.199672 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30534 (* 1 = 2.30534 loss)
I0818 22:32:48.199679 2998 sgd_solver.cpp:106] Iteration 200, lr = 0.01
I0818 22:33:09.883466 2998 solver.cpp:228] Iteration 300, loss = 2.30259
I0818 22:33:09.883492 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:33:09.883500 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:33:09.883508 2998 sgd_solver.cpp:106] Iteration 300, lr = 0.01
I0818 22:33:32.281954 2998 solver.cpp:228] Iteration 400, loss = 2.30384
I0818 22:33:32.282045 2998 solver.cpp:244] Train net output #0: accuracy = 0.2
I0818 22:33:32.282055 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30384 (* 1 = 2.30384 loss)
I0818 22:33:32.282061 2998 sgd_solver.cpp:106] Iteration 400, lr = 0.01
I0818 22:33:55.271708 2998 solver.cpp:228] Iteration 500, loss = 2.30259
I0818 22:33:55.271777 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:33:55.271808 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:33:55.271832 2998 sgd_solver.cpp:106] Iteration 500, lr = 0.01
I0818 22:34:14.835242 2998 solver.cpp:228] Iteration 600, loss = 2.30259
I0818 22:34:14.835330 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:34:14.835348 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:34:14.835357 2998 sgd_solver.cpp:106] Iteration 600, lr = 0.01
I0818 22:34:37.795267 2998 solver.cpp:228] Iteration 700, loss = 2.30259
I0818 22:34:37.795295 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:34:37.795305 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:34:37.795313 2998 sgd_solver.cpp:106] Iteration 700, lr = 0.01
I0818 22:34:56.994974 2998 solver.cpp:228] Iteration 800, loss = 2.30259
I0818 22:34:56.995132 2998 solver.cpp:244] Train net output #0: accuracy = 0
I0818 22:34:56.995156 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:34:56.995168 2998 sgd_solver.cpp:106] Iteration 800, lr = 0.01
I0818 22:35:18.685566 2998 solver.cpp:228] Iteration 900, loss = 2.30259
I0818 22:35:18.685590 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:35:18.685598 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:35:18.685603 2998 sgd_solver.cpp:106] Iteration 900, lr = 0.01
I0818 22:35:18.932432 2998 blocking_queue.cpp:50] Data layer prefetch queue empty
I0818 22:35:38.017071 2998 solver.cpp:337] Iteration 1000, Testing net (#0)
I0818 22:35:38.017169 2998 net.cpp:693] Ignoring source layer accuracy-train
I0818 22:35:58.653934 2998 solver.cpp:404] Test net output #0: accuracy = 0.113
I0818 22:35:58.653972 2998 solver.cpp:404] Test net output #1: loss_cls = 2.30258 (* 1 = 2.30258 loss)
I0818 22:35:58.655311 2998 solver.cpp:228] Iteration 1000, loss = 2.30259
I0818 22:35:58.655328 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:35:58.655345 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:35:58.655354 2998 sgd_solver.cpp:106] Iteration 1000, lr = 0.01
I0818 22:36:17.491513 2998 solver.cpp:228] Iteration 1100, loss = 2.30384
I0818 22:36:17.491641 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:36:17.491652 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30384 (* 1 = 2.30384 loss)
I0818 22:36:17.491667 2998 sgd_solver.cpp:106] Iteration 1100, lr = 0.01
I0818 22:36:36.980705 2998 solver.cpp:228] Iteration 1200, loss = 2.30259
I0818 22:36:36.980729 2998 solver.cpp:244] Train net output #0: accuracy = 0
I0818 22:36:36.980737 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:36:36.980743 2998 sgd_solver.cpp:106] Iteration 1200, lr = 0.01
I0818 22:36:56.683014 2998 solver.cpp:228] Iteration 1300, loss = 2.30259
I0818 22:36:56.683100 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:36:56.683117 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:36:56.683125 2998 sgd_solver.cpp:106] Iteration 1300, lr = 0.01
I0818 22:37:16.430347 2998 solver.cpp:228] Iteration 1400, loss = 2.30259
I0818 22:37:16.430373 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:37:16.430382 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:37:16.430390 2998 sgd_solver.cpp:106] Iteration 1400, lr = 0.01
I0818 22:37:36.240224 2998 solver.cpp:228] Iteration 1500, loss = 2.30259
I0818 22:37:36.240351 2998 solver.cpp:244] Train net output #0: accuracy = 0.1
I0818 22:37:36.240376 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:37:36.240382 2998 sgd_solver.cpp:106] Iteration 1500, lr = 0.01
I0818 22:37:55.047371 2998 solver.cpp:228] Iteration 1600, loss = 2.30259
I0818 22:37:55.047397 2998 solver.cpp:244] Train net output #0: accuracy = 0.3
I0818 22:37:55.047405 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30259 (* 1 = 2.30259 loss)
I0818 22:37:55.047412 2998 sgd_solver.cpp:106] Iteration 1600, lr = 0.01
I0818 22:38:16.799686 2998 solver.cpp:228] Iteration 1700, loss = 2.30356
I0818 22:38:16.799794 2998 solver.cpp:244] Train net output #0: accuracy = 0
I0818 22:38:16.799818 2998 solver.cpp:244] Train net output #1: loss_cls = 2.30356 (* 1 = 2.30356 loss)
I0818 22:38:16.799825 2998 sgd_solver.cpp:106] Iteration 1700, lr = 0.01

robindume commented 7 years ago

I have the same problem, and I think it is a matter of parameter settings. I decreased the initial learning rate to 1e-3, set step_size = 3e4 and batch_size = 32, and kept the other settings. I also set test_iter = 1500 (the tutorial does not seem to generate a test list, so this tests on about 200,000 (20e4) samples). Running with 4 GPUs, the test accuracy increases to 0.83 after about 90,000 iterations, which seems more acceptable. You can try other settings; maybe you can get a better result. Out of curiosity, I also tried a plain LeNet with the same settings, and its test accuracy increased to 0.76.
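For reference, the settings described above would correspond to a Caffe solver.prototxt roughly along these lines. This is only a sketch: the net path, gamma, test_interval, and max_iter values are placeholders I chose, not from the comment, and batch_size is set in the data layers of the net definition, not in the solver:

```
net: "mnist_stn_train_test.prototxt"  # placeholder path
base_lr: 0.001       # initial learning rate lowered from 0.01 to 1e-3
lr_policy: "step"
stepsize: 30000      # step_size = 3e4
gamma: 0.1           # assumed decay factor
test_iter: 1500
test_interval: 1000  # placeholder
max_iter: 100000     # placeholder, >= the ~90,000 iterations reported
display: 100
solver_mode: GPU
```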