gitUserGoodLeaner opened this issue 6 years ago
This is not normal. In my experiments, the loss decreased to less than 1.0 after about 10000 iterations when batchsize = 32. Are you using the given training data or your own dataset?
I am using the given data. I converted the airplane dataset into LMDB and then started training. However, the training loss rose from 7.0+ up to 300.0+ after about 70000 iterations. I am still not sure why this happened. The training settings are the same as those provided in the repository; I only modified the dataset path.
What do you see in src/caffe/util/? rbox_util.cpp and rbox_util.cpp.ship are used for different tasks. I renamed them 19 days ago, after I uploaded all the code 21 days ago, so I want to know whether your code was downloaded before or after that date. If your code is the newest version, I will re-train my network and reply to you soon.
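For anyone else setting this up, here is a rough sketch of how the two variants could be swapped before rebuilding. It assumes the stock Caffe Makefile build and paths relative to the repo root, so adjust it to your setup:

```bash
# Sketch only: select the ship-task implementation of rbox_util, then rebuild.
# Paths are relative to the repo root; the build command assumes the Makefile setup.
cp src/caffe/util/rbox_util.cpp src/caffe/util/rbox_util.cpp.bak    # keep the current version
cp src/caffe/util/rbox_util.cpp.ship src/caffe/util/rbox_util.cpp   # switch to the ship-task code

# Rebuild so the swapped file is compiled in.
make -j"$(nproc)"
```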
@liulei01 My training loss for the ship task doesn't decrease and stays at 8.0 ~ 9.0 between iterations 5000 ~ 20000. I followed the README instructions, used rbox_util.cpp.ship, and set batchsize = 32 with only 1 GPU. Is this loss normal?
@gitUserGoodLeaner Have you figured out the problem?
@gitUserGoodLeaner I encountered the same problem, and changing the "base_lr" from 0.001 to 0.0001 worked for me.
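In case it saves someone a step, a minimal sketch of that change; the solver path below is a placeholder, so point it at whichever solver.prototxt your training script actually loads:

```bash
# Placeholder path: substitute the solver.prototxt used by your training run.
SOLVER=models/RBOX/Airplane/solver.prototxt

# Drop base_lr from 0.001 to 0.0001, then confirm the edit.
sed -i 's/^base_lr:.*/base_lr: 0.0001/' "$SOLVER"
grep '^base_lr' "$SOLVER"   # expect: base_lr: 0.0001
```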
Thanks, I will try your suggestions.
@makefile
I am facing the same problem. The loss remains at around 7.1. Have you found a solution in the meantime?
@liulei01 I am wondering why the detection loss is so high. Is this normal? Here is the log after resuming training:

I1225 12:11:46.674068  9085 caffe.cpp:243] Resuming from models/RBOX/Airplane/RBOX_300x300_AIRPLANE_VGG_new/RBOX_AIRPLANE_RBOX_300x300_AIRPLANE_VGG_new_iter_81472.solverstate
I1225 12:11:46.745776  9085 sgd_solver.cpp:356] SGDSolver: restoring history
I1225 12:11:46.810129  9085 caffe.cpp:253] Starting Optimization
I1225 12:11:46.810154  9085 solver.cpp:295] Solving RBOX_AIRPLANE_RBOX_300x300_AIRPLANE_VGG_new_train
I1225 12:11:46.810158  9085 solver.cpp:296] Learning Rate Policy: multistep
I1225 12:11:47.542913  9085 solver.cpp:244] Iteration 81472, loss = 46.3588
I1225 12:11:47.542959  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 46.3588 (* 1 = 46.3588 loss)
I1225 12:11:47.542980  9085 sgd_solver.cpp:138] Iteration 81472, lr = 2.5e-06
I1225 12:11:48.096568  9085 solver.cpp:244] Iteration 81473, loss = 67.1444
I1225 12:11:48.096614  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 87.93 (* 1 = 87.93 loss)
I1225 12:11:48.096622  9085 sgd_solver.cpp:138] Iteration 81473, lr = 2.5e-06
I1225 12:11:48.610831  9085 solver.cpp:244] Iteration 81474, loss = 88.3024
I1225 12:11:48.610878  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 130.618 (* 1 = 130.618 loss)
I1225 12:11:48.610887  9085 sgd_solver.cpp:138] Iteration 81474, lr = 2.5e-06
I1225 12:11:49.130704  9085 solver.cpp:244] Iteration 81475, loss = 111.279
I1225 12:11:49.130739  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 180.209 (* 1 = 180.209 loss)
I1225 12:11:49.130746  9085 sgd_solver.cpp:138] Iteration 81475, lr = 2.5e-06
I1225 12:11:49.671315  9085 solver.cpp:244] Iteration 81476, loss = 132.512
I1225 12:11:49.671350  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 217.444 (* 1 = 217.444 loss)
I1225 12:11:49.671357  9085 sgd_solver.cpp:138] Iteration 81476, lr = 2.5e-06
I1225 12:11:50.194638  9085 solver.cpp:244] Iteration 81477, loss = 153.352
I1225 12:11:50.194689  9085 solver.cpp:260]     Train net output #0: mbox_loss_plane = 257.552 (* 1 = 257.552 loss)
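Since the loss keeps climbing even though the learning rate has already decayed to 2.5e-06, resuming from the latest solverstate just continues the divergence. One thing worth trying (sketch only; the solver path and snapshot iteration below are placeholders) is rolling back to a snapshot saved before the loss blew up and restarting from there:

```bash
# Sketch: resume from an earlier, pre-divergence snapshot instead of the latest one.
# The solver path and the iteration number are placeholders; pick a .solverstate
# that was saved while the loss was still reasonable.
./build/tools/caffe train \
    --solver=models/RBOX/Airplane/RBOX_300x300_AIRPLANE_VGG_new/solver.prototxt \
    --snapshot=models/RBOX/Airplane/RBOX_300x300_AIRPLANE_VGG_new/RBOX_AIRPLANE_RBOX_300x300_AIRPLANE_VGG_new_iter_60000.solverstate \
    --gpu=0
```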