mahyarnajibi / SSH

SSH: Single Stage Headless Face Detector

Several bugs may exist #16

Closed mxmxlwlw closed 6 years ago

mxmxlwlw commented 6 years ago

Hi, here are some problems that may exist.

  1. The total loss doesn't equal the sum of all the individual losses during training.
  2. The reg_loss of m3 is 0 while the cls_loss of m3 is not. Can you explain these?

mahyarnajibi commented 6 years ago

Hi, as mentioned in the paper, the regression loss is computed only for anchors that are assigned to the positive class. The case you mention occurs when all anchors in M3 are negative: no regression is performed, but the classifier is still trained to classify them as negatives. Also, the displayed loss is the average loss over 100 iterations. Are there any specific bugs that you could find? Thanks!
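To make the point about positive anchors concrete, here is a minimal pure-Python sketch of how a classification loss can be non-zero while the regression loss is exactly 0. It is not the SSH code: the function names `cls_loss` and `reg_loss`, the smooth L1 form, and the normalization by the number of positive anchors are all assumptions made for illustration.

```python
import math


def cls_loss(face_probs, labels):
    """Mean binary cross-entropy over all sampled anchors (label 1 = face, 0 = background)."""
    eps = 1e-12  # avoid log(0)
    total = 0.0
    for p, y in zip(face_probs, labels):
        p_correct = p if y == 1 else 1.0 - p
        total += -math.log(max(p_correct, eps))
    return total / len(labels)


def smooth_l1(x):
    """Smooth L1 penalty on a single coordinate difference."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def reg_loss(pred_deltas, target_deltas, labels):
    """Smooth L1 over positive anchors only (assumed normalization: number of positives)."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        # No positive anchors in this module: nothing to regress, so the loss is 0.
        return 0.0
    total = 0.0
    for i in positives:
        total += sum(smooth_l1(p - t) for p, t in zip(pred_deltas[i], target_deltas[i]))
    return total / len(positives)


# A module whose sampled anchors are all negatives (hypothetical values).
labels = [0, 0, 0]
face_probs = [0.4, 0.6, 0.3]
deltas = [[0.1, 0.2, 0.0, 0.0]] * 3
print(cls_loss(face_probs, labels))      # positive: the classifier is still being trained
print(reg_loss(deltas, deltas, labels))  # 0.0: no positive anchors to regress
```

Under these assumptions, a line such as `m3@ssh_reg_loss = 0` next to a non-zero `m3@ssh_cls_loss` simply means that no anchor in that module was labeled positive for the current batch.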

mxmxlwlw commented 6 years ago

Thank you! The framework does train successfully. However, I can't judge whether the network has converged just by watching the loss. Can you give me some advice?

mahyarnajibi commented 6 years ago

Sure. Yes, it is harder to judge purely based on the loss. A better approach may be to monitor the accuracy on a small subset of the validation set.
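One way to act on this suggestion is sketched below. It swaps in recall at an IoU threshold as a simple stand-in for "accuracy"; the thread does not say which metric the maintainer had in mind. The helper names `iou` and `recall_at_iou`, the box format `(x1, y1, x2, y2)`, and the 0.5 threshold are assumptions, and running the detector on the validation images to produce the detections is left out.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(detections, ground_truth, thresh=0.5):
    """Fraction of ground-truth boxes matched one-to-one by a detection with IoU >= thresh.

    detections and ground_truth are lists with one list of boxes per image.
    """
    matched, total = 0, 0
    for dets, gts in zip(detections, ground_truth):
        used = set()  # each detection may match at most one ground-truth box
        total += len(gts)
        for gt in gts:
            best, best_j = 0.0, None
            for j, det in enumerate(dets):
                if j in used:
                    continue
                overlap = iou(det, gt)
                if overlap > best:
                    best, best_j = overlap, j
            if best_j is not None and best >= thresh:
                used.add(best_j)
                matched += 1
    return matched / total if total else 0.0


# Hypothetical one-image subset: one of the two faces is found.
dets = [[(0, 0, 10, 10)]]
gts = [[(0, 0, 10, 10), (20, 20, 30, 30)]]
print(recall_at_iou(dets, gts))
```

Evaluating a fixed subset every few thousand iterations and plotting this number gives a curve that is usually easier to read than the per-batch loss, although a full evaluation with the benchmark's own protocol is still needed for reportable results.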

mxmxlwlw commented 6 years ago

Thank you! This framework works great.

foralliance commented 6 years ago

@mxmxlwlw @mahyarnajibi About the first problem (the total loss doesn't equal the sum of all losses during training): how do you understand it? For example:

I0715 11:49:01.657114 22342 solver.cpp:218] Iteration 0 (1.36292e+09 iter/s, 0.770666s/20 iters), loss = 3.05122
I0715 11:49:01.657145 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.779361 (* 1 = 0.779361 loss)
I0715 11:49:01.657150 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0.0107608 (* 1 = 0.0107608 loss)
I0715 11:49:01.657153 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 1.15475 (* 1 = 1.15475 loss)
I0715 11:49:01.657156 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0.00992704 (* 1 = 0.00992704 loss)
I0715 11:49:01.657160 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.998615 (* 1 = 0.998615 loss)
I0715 11:49:01.657163 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:01.657187 22342 sgd_solver.cpp:105] Iteration 0, lr = 0.004
I0715 11:49:15.194633 22342 solver.cpp:218] Iteration 20 (1.47743 iter/s, 13.537s/20 iters), loss = 0.998297
I0715 11:49:15.194665 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.00731592 (* 1 = 0.00731592 loss)
I0715 11:49:15.194670 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:15.194674 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.593696 (* 1 = 0.593696 loss)
I0715 11:49:15.194677 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0.0508108 (* 1 = 0.0508108 loss)
I0715 11:49:15.194680 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.0235499 (* 1 = 0.0235499 loss)
I0715 11:49:15.194684 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:15.194707 22342 sgd_solver.cpp:105] Iteration 20, lr = 0.004
I0715 11:49:28.997684 22342 solver.cpp:218] Iteration 40 (1.44901 iter/s, 13.8026s/20 iters), loss = 0.853168
I0715 11:49:28.997712 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.662872 (* 1 = 0.662872 loss)
I0715 11:49:28.997717 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0.0615177 (* 1 = 0.0615177 loss)
I0715 11:49:28.997720 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.0558777 (* 1 = 0.0558777 loss)
I0715 11:49:28.997725 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:28.997726 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.0348864 (* 1 = 0.0348864 loss)
I0715 11:49:28.997730 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:28.997743 22342 sgd_solver.cpp:105] Iteration 40, lr = 0.004
I0715 11:49:41.680562 22342 solver.cpp:218] Iteration 60 (1.57698 iter/s, 12.6825s/20 iters), loss = 1.17235
I0715 11:49:41.680591 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.23865 (* 1 = 0.23865 loss)
I0715 11:49:41.680596 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:41.680599 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.469938 (* 1 = 0.469938 loss)
I0715 11:49:41.680603 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0.0343848 (* 1 = 0.0343848 loss)
I0715 11:49:41.680605 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.0121342 (* 1 = 0.0121342 loss)
I0715 11:49:41.680608 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:41.680622 22342 sgd_solver.cpp:105] Iteration 60, lr = 0.004
I0715 11:49:54.220598 22342 solver.cpp:218] Iteration 80 (1.59494 iter/s, 12.5396s/20 iters), loss = 1.03238
I0715 11:49:54.220628 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.0961151 (* 1 = 0.0961151 loss)
I0715 11:49:54.220633 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:54.220636 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.0406756 (* 1 = 0.0406756 loss)
I0715 11:49:54.220639 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:54.220643 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 1.01558 (* 1 = 1.01558 loss)
I0715 11:49:54.220645 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0.224902 (* 1 = 0.224902 loss)
I0715 11:49:54.220659 22342 sgd_solver.cpp:105] Iteration 80, lr = 0.004
I0715 11:50:06.850126 22342 solver.cpp:218] Iteration 100 (1.58364 iter/s, 12.6291s/20 iters), loss = 0.336588
I0715 11:50:06.850155 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.119277 (* 1 = 0.119277 loss)
I0715 11:50:06.850162 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:50:06.850164 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.243431 (* 1 = 0.243431 loss)
I0715 11:50:06.850167 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0.00925399 (* 1 = 0.00925399 loss)
I0715 11:50:06.850172 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.031974 (* 1 = 0.031974 loss)
I0715 11:50:06.850174 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:50:06.850189 22342 sgd_solver.cpp:105] Iteration 100, lr = 0.004
I0715 11:50:19.686389 22342 solver.cpp:218] Iteration 120 (1.55814 iter/s, 12.8358s/20 iters), loss = 0.725993
I0715 11:50:19.686419 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.0501123 (* 1 = 0.0501123 loss)
I0715 11:50:19.686424 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:50:19.686426 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.0800183 (* 1 = 0.0800183 loss)
I0715 11:50:19.686429 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:50:19.686432 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.115961 (* 1 = 0.115961 loss)
I0715 11:50:19.686435 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0.00521535 (* 1 = 0.00521535 loss)
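
To compare the two numbers systematically, a log like the one above can be parsed and the weighted `Train net output` values summed per display step. The sketch below is only an illustration: `parse_log` and `summarize` are hypothetical helpers, the regular expressions are written against the line format shown in this log, and the sample is the `Iteration 20` block copied from it. In Caffe the `loss = ...` value on the `Iteration` line is a smoothed value controlled by the solver's `average_loss` setting, while the per-output lines come from the most recent forward pass, so the two are not expected to match in general.

```python
import re

ITER_RE = re.compile(r"Iteration (\d+) \(.*\), loss = ([-+0-9.eE]+)")
OUT_RE = re.compile(
    r"Train net output #\d+: (\S+) = ([-+0-9.eE]+) \(\* ([-+0-9.eE]+) = ([-+0-9.eE]+) loss\)"
)


def parse_log(lines):
    """Group each 'Iteration N (...), loss = X' line with the output lines that follow it."""
    records = []
    for line in lines:
        m = ITER_RE.search(line)
        if m:
            records.append({"iter": int(m.group(1)), "loss": float(m.group(2)), "outputs": {}})
            continue
        m = OUT_RE.search(line)
        if m and records:
            # group(4) is the weighted contribution of this output to the loss
            records[-1]["outputs"][m.group(1)] = float(m.group(4))
    return records


def summarize(records):
    """Return (iteration, reported loss, sum of weighted outputs, difference) per record."""
    rows = []
    for r in records:
        total = sum(r["outputs"].values())
        rows.append((r["iter"], r["loss"], total, r["loss"] - total))
    return rows


SAMPLE = """\
I0715 11:49:15.194633 22342 solver.cpp:218] Iteration 20 (1.47743 iter/s, 13.537s/20 iters), loss = 0.998297
I0715 11:49:15.194665 22342 solver.cpp:237]     Train net output #0: m1@ssh_cls_loss = 0.00731592 (* 1 = 0.00731592 loss)
I0715 11:49:15.194670 22342 solver.cpp:237]     Train net output #1: m1@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:15.194674 22342 solver.cpp:237]     Train net output #2: m2@ssh_cls_loss = 0.593696 (* 1 = 0.593696 loss)
I0715 11:49:15.194677 22342 solver.cpp:237]     Train net output #3: m2@ssh_reg_loss = 0.0508108 (* 1 = 0.0508108 loss)
I0715 11:49:15.194680 22342 solver.cpp:237]     Train net output #4: m3@ssh_cls_loss = 0.0235499 (* 1 = 0.0235499 loss)
I0715 11:49:15.194684 22342 solver.cpp:237]     Train net output #5: m3@ssh_reg_loss = 0 (* 1 = 0 loss)
I0715 11:49:15.194707 22342 sgd_solver.cpp:105] Iteration 20, lr = 0.004
"""

for it, reported, total, diff in summarize(parse_log(SAMPLE.splitlines())):
    print(f"iter {it}: reported={reported:.6f} sum_of_outputs={total:.6f} diff={diff:+.6f}")
```

For the sample block the six outputs sum to about 0.675 while the reported loss is 0.998297, which is the kind of gap being asked about here.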