Closed: felixfuu closed this issue 4 years ago
Hi @felixfuu, same problem here, and I don't know why. Have you solved it yet?
Have you done the correct data preprocessing steps? You should follow this process: https://github.com/wy1iu/sphereface#part-1-preprocessing. Because training uses only weak data augmentation, the preprocessing of the training data should be exactly the same as that of the test data.
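For reference, SphereFace-style preprocessing detects five facial landmarks (e.g. with MTCNN) and warps each face to a canonical 96x112 crop with a similarity transform. A minimal NumPy sketch of the transform estimation is below; the template coordinates and the Umeyama-style least-squares fit are common choices in reimplementations, not necessarily this repo's exact code:

```python
import numpy as np

# Canonical 5-point landmark template for the 96x112 crop, as widely used
# in SphereFace-style pipelines (treat the exact values as an assumption).
REFERENCE_96x112 = np.array([
    [30.2946, 51.6963],   # left eye
    [65.5318, 51.5014],   # right eye
    [48.0252, 71.7366],   # nose tip
    [33.5493, 92.3655],   # left mouth corner
    [62.7299, 92.2041],   # right mouth corner
], dtype=np.float64)

def similarity_transform(src, dst):
    """Estimate a 2x3 similarity transform (scale * R | t) mapping the
    detected landmarks `src` onto the template `dst` by least squares
    (Umeyama's method, without the reflection-correction term)."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = len(src)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)
    R = U @ Vt                          # optimal rotation
    scale = S.sum() / ((src_c ** 2).sum() / n)
    t = dst_mean - scale * (R @ src_mean)
    return np.hstack([scale * R, t[:, None]])
```

The resulting 2x3 matrix is what one would pass to an image-warping routine (e.g. `cv2.warpAffine`) to produce the aligned crop.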
@MuggleWang I have done the data preprocessing steps, and my eval accuracy on LFW is 0.9918 with s=30, m=0.40.
The final result is not deterministic, and the accuracy will fluctuate; 99.18% is very close to 99.2%. If you want a significant performance improvement, you should try a better dataset and deeper networks.
@MuggleWang OK, thanks for your reply!
@felixfuu Hi, I would like to ask: after how many epochs did you reach 99.18% in this program? I did the same preprocessing but still get only 97.7% accuracy, and I don't know why. I look forward to your reply. Thank you, best wishes.
@Hedlen ~20 epochs
@felixfuu OK, thanks for your reply! Have you made any other improvements to the code or parameters beyond data preprocessing? I used the original code but did not reach 99.18%. I look forward to your reply! Best wishes.
@Hedlen Can you upload your training log?
@felixfuu I will run it again and upload the log once it finishes. Thank you!
You should ensure that ('--num_class', default=10572) equals the number of identities in your dataset. You can train using the script 'train.sh', which automatically saves the training log.
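That class count can be checked directly against the training list before launching. A small sketch, assuming each line of the list is "<image path> <integer label>" (the exact list format is an assumption):

```python
def count_classes(list_path):
    """Return the number of distinct identity labels in a training list
    whose lines look like '<relative/image/path> <integer_label>'."""
    labels = set()
    with open(list_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                labels.add(parts[-1])   # last token is the label
    return len(labels)
```

If the returned value differs from `--num_class`, the final classification layer has the wrong shape for your data.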
@felixfuu @MuggleWang Hi, this is my log file. The only difference from your setup is that my batch size is 512, running on 8 GPUs. For data preprocessing I ran MTCNN on my own side, and only 10533 classes remain after preprocessing. Thanks!
DataParallel(
  (module): sphere20(
    (conv1_1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (relu1_1): PReLU(num_parameters=64)
    (conv1_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu1_2): PReLU(num_parameters=64)
    (conv1_3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu1_3): PReLU(num_parameters=64)
    (conv2_1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (relu2_1): PReLU(num_parameters=128)
    (conv2_2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu2_2): PReLU(num_parameters=128)
    (conv2_3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu2_3): PReLU(num_parameters=128)
    (conv2_4): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu2_4): PReLU(num_parameters=128)
    (conv2_5): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu2_5): PReLU(num_parameters=128)
    (conv3_1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (relu3_1): PReLU(num_parameters=256)
    (conv3_2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_2): PReLU(num_parameters=256)
    (conv3_3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_3): PReLU(num_parameters=256)
    (conv3_4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_4): PReLU(num_parameters=256)
    (conv3_5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_5): PReLU(num_parameters=256)
    (conv3_6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_6): PReLU(num_parameters=256)
    (conv3_7): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_7): PReLU(num_parameters=256)
    (conv3_8): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_8): PReLU(num_parameters=256)
    (conv3_9): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu3_9): PReLU(num_parameters=256)
    (conv4_1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (relu4_1): PReLU(num_parameters=512)
    (conv4_2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu4_2): PReLU(num_parameters=512)
    (conv4_3): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (relu4_3): PReLU(num_parameters=512)
    (fc5): Linear(in_features=21504, out_features=512, bias=True)
  )
)
length of train Dataset: 494075
Number of Classes: 10533
2018-06-30 16:30:33 Epoch 1 start training
2018-06-30 16:47:36 Train Epoch: 1 [51200/494075 (10%)] 100, Loss: 24.078119, Elapsed time: 1017.5625s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 17:03:26 Train Epoch: 1 [102400/494075 (21%)] 200, Loss: 22.916864, Elapsed time: 949.8562s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 17:18:21 Train Epoch: 1 [153600/494075 (31%)] 300, Loss: 22.675001, Elapsed time: 894.7411s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 17:37:00 Train Epoch: 1 [204800/494075 (41%)] 400, Loss: 22.358681, Elapsed time: 1119.5217s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 17:54:54 Train Epoch: 1 [256000/494075 (52%)] 500, Loss: 22.164404, Elapsed time: 1072.2939s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 18:13:16 Train Epoch: 1 [307200/494075 (62%)] 600, Loss: 21.987621, Elapsed time: 1101.7927s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 18:32:00 Train Epoch: 1 [358400/494075 (73%)] 700, Loss: 21.820126, Elapsed time: 1123.7944s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 18:49:03 Train Epoch: 1 [409600/494075 (83%)] 800, Loss: 21.611637, Elapsed time: 1023.0687s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 19:07:22 Train Epoch: 1 [460800/494075 (93%)] 900, Loss: 21.311731, Elapsed time: 1097.5180s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.7383 std=0.0100 thd=0.7080
2018-06-30 19:30:29 Epoch 2 start training
2018-06-30 19:48:32 Train Epoch: 2 [51200/494075 (10%)] 1064, Loss: 20.877731, Elapsed time: 1082.4645s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 20:05:48 Train Epoch: 2 [102400/494075 (21%)] 1164, Loss: 20.722500, Elapsed time: 1036.0451s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 20:22:52 Train Epoch: 2 [153600/494075 (31%)] 1264, Loss: 20.434168, Elapsed time: 1023.5542s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 20:38:46 Train Epoch: 2 [204800/494075 (41%)] 1364, Loss: 20.153643, Elapsed time: 953.5407s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 20:55:52 Train Epoch: 2 [256000/494075 (52%)] 1464, Loss: 19.850628, Elapsed time: 1026.1738s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 21:13:41 Train Epoch: 2 [307200/494075 (62%)] 1564, Loss: 19.515253, Elapsed time: 1069.1853s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 21:30:50 Train Epoch: 2 [358400/494075 (73%)] 1664, Loss: 19.187822, Elapsed time: 1028.6822s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 21:47:42 Train Epoch: 2 [409600/494075 (83%)] 1764, Loss: 18.821505, Elapsed time: 1011.6953s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 22:03:46 Train Epoch: 2 [460800/494075 (93%)] 1864, Loss: 18.470320, Elapsed time: 964.2057s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.8618 std=0.0205 thd=0.4700
2018-06-30 22:26:12 Epoch 3 start training
2018-06-30 22:43:52 Train Epoch: 3 [51200/494075 (10%)] 2028, Loss: 17.719166, Elapsed time: 1059.5503s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 22:59:24 Train Epoch: 3 [102400/494075 (21%)] 2128, Loss: 17.440926, Elapsed time: 932.1215s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 23:16:08 Train Epoch: 3 [153600/494075 (31%)] 2228, Loss: 17.183744, Elapsed time: 1003.1890s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 23:34:19 Train Epoch: 3 [204800/494075 (41%)] 2328, Loss: 16.789606, Elapsed time: 1090.8768s (100 iters), Margin: 0.4000, Scale: 30.00
2018-06-30 23:51:34 Train Epoch: 3 [256000/494075 (52%)] 2428, Loss: 16.405176, Elapsed time: 1033.9495s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 00:08:57 Train Epoch: 3 [307200/494075 (62%)] 2528, Loss: 16.035944, Elapsed time: 1043.2921s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 00:26:29 Train Epoch: 3 [358400/494075 (73%)] 2628, Loss: 15.757998, Elapsed time: 1052.2295s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 00:44:12 Train Epoch: 3 [409600/494075 (83%)] 2728, Loss: 15.360889, Elapsed time: 1062.5782s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 00:59:44 Train Epoch: 3 [460800/494075 (93%)] 2828, Loss: 15.050651, Elapsed time: 931.5095s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9217 std=0.0143 thd=0.3825
2018-07-01 01:25:23 Epoch 4 start training
2018-07-01 01:42:58 Train Epoch: 4 [51200/494075 (10%)] 2992, Loss: 14.225386, Elapsed time: 1055.2715s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 01:58:10 Train Epoch: 4 [102400/494075 (21%)] 3092, Loss: 14.115221, Elapsed time: 911.3216s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 02:15:48 Train Epoch: 4 [153600/494075 (31%)] 3192, Loss: 13.882771, Elapsed time: 1057.7205s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 02:32:31 Train Epoch: 4 [204800/494075 (41%)] 3292, Loss: 13.742986, Elapsed time: 1003.2996s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 02:49:50 Train Epoch: 4 [256000/494075 (52%)] 3392, Loss: 13.512514, Elapsed time: 1038.7083s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 03:07:50 Train Epoch: 4 [307200/494075 (62%)] 3492, Loss: 13.376544, Elapsed time: 1079.1581s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 03:25:22 Train Epoch: 4 [358400/494075 (73%)] 3592, Loss: 13.167745, Elapsed time: 1052.2830s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 03:42:16 Train Epoch: 4 [409600/494075 (83%)] 3692, Loss: 12.973400, Elapsed time: 1013.6401s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 03:57:00 Train Epoch: 4 [460800/494075 (93%)] 3792, Loss: 12.848106, Elapsed time: 883.4403s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9553 std=0.0101 thd=0.3700
2018-07-01 04:18:48 Epoch 5 start training
2018-07-01 04:36:17 Train Epoch: 5 [51200/494075 (10%)] 3956, Loss: 12.041685, Elapsed time: 1048.4976s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 04:53:53 Train Epoch: 5 [102400/494075 (21%)] 4056, Loss: 12.091580, Elapsed time: 1055.5864s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 05:10:48 Train Epoch: 5 [153600/494075 (31%)] 4156, Loss: 12.049713, Elapsed time: 1014.9881s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 05:27:02 Train Epoch: 5 [204800/494075 (41%)] 4256, Loss: 11.946115, Elapsed time: 972.5478s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 05:44:23 Train Epoch: 5 [256000/494075 (52%)] 4356, Loss: 11.854580, Elapsed time: 1040.7035s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 05:59:38 Train Epoch: 5 [307200/494075 (62%)] 4456, Loss: 11.751221, Elapsed time: 914.4221s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 06:16:25 Train Epoch: 5 [358400/494075 (73%)] 4556, Loss: 11.675735, Elapsed time: 1007.0036s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 06:33:06 Train Epoch: 5 [409600/494075 (83%)] 4656, Loss: 11.566358, Elapsed time: 1000.1372s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 06:49:33 Train Epoch: 5 [460800/494075 (93%)] 4756, Loss: 11.419089, Elapsed time: 986.6182s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9592 std=0.0098 thd=0.3180
2018-07-01 07:11:53 Epoch 6 start training
2018-07-01 07:29:18 Train Epoch: 6 [51200/494075 (10%)] 4920, Loss: 10.677783, Elapsed time: 1044.2735s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 07:46:24 Train Epoch: 6 [102400/494075 (21%)] 5020, Loss: 10.817953, Elapsed time: 1025.8789s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 08:02:15 Train Epoch: 6 [153600/494075 (31%)] 5120, Loss: 10.815484, Elapsed time: 950.8403s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 08:18:03 Train Epoch: 6 [204800/494075 (41%)] 5220, Loss: 10.850904, Elapsed time: 947.7339s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 08:35:08 Train Epoch: 6 [256000/494075 (52%)] 5320, Loss: 10.672034, Elapsed time: 1025.1260s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 08:52:11 Train Epoch: 6 [307200/494075 (62%)] 5420, Loss: 10.686956, Elapsed time: 1023.3611s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 09:09:33 Train Epoch: 6 [358400/494075 (73%)] 5520, Loss: 10.582147, Elapsed time: 1041.7183s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 09:26:14 Train Epoch: 6 [409600/494075 (83%)] 5620, Loss: 10.539289, Elapsed time: 1001.3167s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 09:43:18 Train Epoch: 6 [460800/494075 (93%)] 5720, Loss: 10.451574, Elapsed time: 1023.4293s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9665 std=0.0082 thd=0.3110
2018-07-01 10:05:00 Epoch 7 start training
2018-07-01 10:20:46 Train Epoch: 7 [51200/494075 (10%)] 5884, Loss: 9.786712, Elapsed time: 945.9275s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 10:36:52 Train Epoch: 7 [102400/494075 (21%)] 5984, Loss: 9.890993, Elapsed time: 965.8611s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 10:52:12 Train Epoch: 7 [153600/494075 (31%)] 6084, Loss: 10.014940, Elapsed time: 919.8149s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 11:08:37 Train Epoch: 7 [204800/494075 (41%)] 6184, Loss: 9.994566, Elapsed time: 984.4306s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 11:24:41 Train Epoch: 7 [256000/494075 (52%)] 6284, Loss: 9.898741, Elapsed time: 964.4229s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 11:38:38 Train Epoch: 7 [307200/494075 (62%)] 6384, Loss: 9.930015, Elapsed time: 836.4562s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 11:54:43 Train Epoch: 7 [358400/494075 (73%)] 6484, Loss: 9.934775, Elapsed time: 965.0348s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 12:10:28 Train Epoch: 7 [409600/494075 (83%)] 6584, Loss: 9.877912, Elapsed time: 944.4764s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 12:26:57 Train Epoch: 7 [460800/494075 (93%)] 6684, Loss: 9.746922, Elapsed time: 989.0695s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9653 std=0.0101 thd=0.2690
2018-07-01 12:47:46 Epoch 8 start training
2018-07-01 13:02:47 Train Epoch: 8 [51200/494075 (10%)] 6848, Loss: 9.081228, Elapsed time: 900.8224s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 13:16:46 Train Epoch: 8 [102400/494075 (21%)] 6948, Loss: 9.326673, Elapsed time: 838.0815s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 13:32:13 Train Epoch: 8 [153600/494075 (31%)] 7048, Loss: 9.311176, Elapsed time: 927.3180s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 13:48:21 Train Epoch: 8 [204800/494075 (41%)] 7148, Loss: 9.385469, Elapsed time: 967.4669s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 14:04:32 Train Epoch: 8 [256000/494075 (52%)] 7248, Loss: 9.390140, Elapsed time: 971.0669s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 14:18:31 Train Epoch: 8 [307200/494075 (62%)] 7348, Loss: 9.368503, Elapsed time: 838.6101s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 14:34:20 Train Epoch: 8 [358400/494075 (73%)] 7448, Loss: 9.359023, Elapsed time: 949.2509s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 14:50:36 Train Epoch: 8 [409600/494075 (83%)] 7548, Loss: 9.389216, Elapsed time: 975.3230s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 15:07:35 Train Epoch: 8 [460800/494075 (93%)] 7648, Loss: 9.343820, Elapsed time: 1018.9497s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9662 std=0.0098 thd=0.2445
2018-07-01 15:29:08 Epoch 9 start training
2018-07-01 15:45:35 Train Epoch: 9 [51200/494075 (10%)] 7812, Loss: 8.567565, Elapsed time: 986.9376s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 15:59:48 Train Epoch: 9 [102400/494075 (21%)] 7912, Loss: 8.724940, Elapsed time: 851.7808s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 16:15:06 Train Epoch: 9 [153600/494075 (31%)] 8012, Loss: 8.895157, Elapsed time: 918.3873s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 16:31:26 Train Epoch: 9 [204800/494075 (41%)] 8112, Loss: 8.949307, Elapsed time: 979.6254s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 16:46:36 Train Epoch: 9 [256000/494075 (52%)] 8212, Loss: 8.907965, Elapsed time: 909.5387s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 17:02:25 Train Epoch: 9 [307200/494075 (62%)] 8312, Loss: 8.926835, Elapsed time: 949.0584s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 17:17:29 Train Epoch: 9 [358400/494075 (73%)] 8412, Loss: 8.960207, Elapsed time: 903.0256s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 17:32:46 Train Epoch: 9 [409600/494075 (83%)] 8512, Loss: 8.904014, Elapsed time: 915.9818s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 17:49:57 Train Epoch: 9 [460800/494075 (93%)] 8612, Loss: 8.936779, Elapsed time: 1030.9504s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9708 std=0.0095 thd=0.2310
2018-07-01 18:12:08 Epoch 10 start training
2018-07-01 18:29:41 Train Epoch: 10 [51200/494075 (10%)] 8776, Loss: 8.178925, Elapsed time: 1052.9247s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 18:46:52 Train Epoch: 10 [102400/494075 (21%)] 8876, Loss: 8.372001, Elapsed time: 1030.1433s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 19:03:51 Train Epoch: 10 [153600/494075 (31%)] 8976, Loss: 8.447295, Elapsed time: 1018.7434s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 19:19:03 Train Epoch: 10 [204800/494075 (41%)] 9076, Loss: 8.551106, Elapsed time: 912.4707s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 19:37:10 Train Epoch: 10 [256000/494075 (52%)] 9176, Loss: 8.580898, Elapsed time: 1086.7418s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 19:54:12 Train Epoch: 10 [307200/494075 (62%)] 9276, Loss: 8.594782, Elapsed time: 1021.6986s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 20:11:26 Train Epoch: 10 [358400/494075 (73%)] 9376, Loss: 8.567825, Elapsed time: 1034.3158s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 20:28:49 Train Epoch: 10 [409600/494075 (83%)] 9476, Loss: 8.566474, Elapsed time: 1041.5235s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 20:46:33 Train Epoch: 10 [460800/494075 (93%)] 9576, Loss: 8.592611, Elapsed time: 1063.7387s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9735 std=0.0070 thd=0.2270
2018-07-01 21:08:38 Epoch 11 start training
2018-07-01 21:25:56 Train Epoch: 11 [51200/494075 (10%)] 9740, Loss: 7.830455, Elapsed time: 1037.2709s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 21:41:41 Train Epoch: 11 [102400/494075 (21%)] 9840, Loss: 8.027585, Elapsed time: 944.9043s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 21:58:55 Train Epoch: 11 [153600/494075 (31%)] 9940, Loss: 8.136673, Elapsed time: 1033.5133s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 22:17:49 Train Epoch: 11 [204800/494075 (41%)] 10040, Loss: 8.273895, Elapsed time: 1134.2792s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 22:36:13 Train Epoch: 11 [256000/494075 (52%)] 10140, Loss: 8.225353, Elapsed time: 1103.2039s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 22:55:20 Train Epoch: 11 [307200/494075 (62%)] 10240, Loss: 8.237213, Elapsed time: 1145.9590s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 23:14:25 Train Epoch: 11 [358400/494075 (73%)] 10340, Loss: 8.303557, Elapsed time: 1144.5931s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 23:33:03 Train Epoch: 11 [409600/494075 (83%)] 10440, Loss: 8.253798, Elapsed time: 1117.2290s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-01 23:52:07 Train Epoch: 11 [460800/494075 (93%)] 10540, Loss: 8.318643, Elapsed time: 1143.8979s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9743 std=0.0075 thd=0.2050
2018-07-02 00:16:36 Epoch 12 start training
2018-07-02 00:34:42 Train Epoch: 12 [51200/494075 (10%)] 10704, Loss: 7.527889, Elapsed time: 1086.0708s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 00:54:04 Train Epoch: 12 [102400/494075 (21%)] 10804, Loss: 7.730953, Elapsed time: 1160.6681s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 01:12:33 Train Epoch: 12 [153600/494075 (31%)] 10904, Loss: 7.847623, Elapsed time: 1109.6967s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 01:31:26 Train Epoch: 12 [204800/494075 (41%)] 11004, Loss: 7.938047, Elapsed time: 1131.5936s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 01:49:34 Train Epoch: 12 [256000/494075 (52%)] 11104, Loss: 7.979567, Elapsed time: 1086.8177s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 02:08:18 Train Epoch: 12 [307200/494075 (62%)] 11204, Loss: 8.002248, Elapsed time: 1123.0936s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 02:27:06 Train Epoch: 12 [358400/494075 (73%)] 11304, Loss: 8.064400, Elapsed time: 1127.9610s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 02:46:14 Train Epoch: 12 [409600/494075 (83%)] 11404, Loss: 8.075332, Elapsed time: 1147.8987s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 03:04:45 Train Epoch: 12 [460800/494075 (93%)] 11504, Loss: 8.059790, Elapsed time: 1110.5922s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9732 std=0.0079 thd=0.2185
2018-07-02 03:28:29 Epoch 13 start training
2018-07-02 03:46:35 Train Epoch: 13 [51200/494075 (10%)] 11668, Loss: 7.270930, Elapsed time: 1085.1886s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 04:05:54 Train Epoch: 13 [102400/494075 (21%)] 11768, Loss: 7.505592, Elapsed time: 1159.1976s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 04:24:24 Train Epoch: 13 [153600/494075 (31%)] 11868, Loss: 7.601075, Elapsed time: 1109.9218s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 04:42:52 Train Epoch: 13 [204800/494075 (41%)] 11968, Loss: 7.742923, Elapsed time: 1107.5778s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 04:59:27 Train Epoch: 13 [256000/494075 (52%)] 12068, Loss: 7.748645, Elapsed time: 994.4075s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 05:17:52 Train Epoch: 13 [307200/494075 (62%)] 12168, Loss: 7.763968, Elapsed time: 1104.1122s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 05:36:41 Train Epoch: 13 [358400/494075 (73%)] 12268, Loss: 7.825192, Elapsed time: 1129.0836s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 05:55:15 Train Epoch: 13 [409600/494075 (83%)] 12368, Loss: 7.882340, Elapsed time: 1114.2422s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 06:12:42 Train Epoch: 13 [460800/494075 (93%)] 12468, Loss: 7.871018, Elapsed time: 1046.3548s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9720 std=0.0086 thd=0.1910
2018-07-02 06:35:38 Epoch 14 start training
2018-07-02 06:52:51 Train Epoch: 14 [51200/494075 (10%)] 12632, Loss: 7.051886, Elapsed time: 1033.3436s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 07:11:07 Train Epoch: 14 [102400/494075 (21%)] 12732, Loss: 7.240369, Elapsed time: 1095.5520s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 07:29:11 Train Epoch: 14 [153600/494075 (31%)] 12832, Loss: 7.384147, Elapsed time: 1083.4188s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 07:46:31 Train Epoch: 14 [204800/494075 (41%)] 12932, Loss: 7.579564, Elapsed time: 1039.6800s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 08:04:11 Train Epoch: 14 [256000/494075 (52%)] 13032, Loss: 7.621207, Elapsed time: 1060.4075s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 08:19:08 Train Epoch: 14 [307200/494075 (62%)] 13132, Loss: 7.586890, Elapsed time: 896.0602s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 08:36:47 Train Epoch: 14 [358400/494075 (73%)] 13232, Loss: 7.670662, Elapsed time: 1059.4701s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 08:54:46 Train Epoch: 14 [409600/494075 (83%)] 13332, Loss: 7.681334, Elapsed time: 1078.5246s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 09:12:21 Train Epoch: 14 [460800/494075 (93%)] 13432, Loss: 7.697121, Elapsed time: 1054.9311s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9740 std=0.0073 thd=0.1890
2018-07-02 09:36:18 Epoch 15 start training
2018-07-02 09:53:55 Train Epoch: 15 [51200/494075 (10%)] 13596, Loss: 6.852710, Elapsed time: 1056.0046s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 10:10:39 Train Epoch: 15 [102400/494075 (21%)] 13696, Loss: 7.121226, Elapsed time: 1003.6537s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 10:28:42 Train Epoch: 15 [153600/494075 (31%)] 13796, Loss: 7.302406, Elapsed time: 1082.8022s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 10:46:49 Train Epoch: 15 [204800/494075 (41%)] 13896, Loss: 7.309793, Elapsed time: 1086.6013s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 11:04:50 Train Epoch: 15 [256000/494075 (52%)] 13996, Loss: 7.400906, Elapsed time: 1080.9062s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 11:22:30 Train Epoch: 15 [307200/494075 (62%)] 14096, Loss: 7.461170, Elapsed time: 1058.9820s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 11:39:56 Train Epoch: 15 [358400/494075 (73%)] 14196, Loss: 7.475625, Elapsed time: 1045.6317s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 12:01:16 Train Epoch: 15 [409600/494075 (83%)] 14296, Loss: 7.496396, Elapsed time: 1279.5065s (100 iters), Margin: 0.4000, Scale: 30.00
2018-07-02 12:19:04 Train Epoch: 15 [460800/494075 (93%)] 14396, Loss: 7.531948, Elapsed time: 1068.8810s (100 iters), Margin: 0.4000, Scale: 30.00
LFWACC=0.9715 std=0.0090 thd=0.1920
2018-07-02 12:43:21 Epoch 16 start training
@felixfuu @MuggleWang Also, my Python version is 3.6.
I didn't see the training status after you adjusted the learning rate. In addition, I recommend preprocessing exactly according to this pipeline, because your data categories and data volume differ from what I use. It also seems that your training starts off in a poor state.
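For context, the args pasted later in this thread show step_size=[16000, 24000], which suggests the learning rate is stepped down by iteration count rather than by epoch. A minimal sketch of such a schedule (the decay factor gamma=0.1 is an assumption, not confirmed by the repo):

```python
def step_lr(base_lr, iteration, milestones=(16000, 24000), gamma=0.1):
    """Return the learning rate for a given iteration under a step-decay
    schedule: multiply by gamma at each milestone that has been passed."""
    lr = base_lr
    for m in milestones:
        if iteration >= m:
            lr *= gamma
    return lr
```

With 8 GPUs and batch size 512, far fewer iterations happen per epoch than with the reference setup, so an iteration-keyed schedule decays much later in terms of epochs; that alone can explain a training run that "stalls" at a lower accuracy.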
@MuggleWang Looking at the log you provided, the per-iteration time I get even with multiple GPUs is much worse than yours. I will try the data preprocessing method you suggested and run again, thank you!
Perhaps it's the impact of GPU performance. Ask your boss for a better GPU :-)
@MuggleWang ^_^, I will try my best. Thank you very much for your work!
@MuggleWang I also want to ask about your Python and GPU configuration. For me, "Elapsed time: 1033.5133s (100 iters)" is so long, while yours is only tens of seconds. I don't know why; is my GPU really that bad?
I used 4 GeForce GTX TITAN X GPUs.
I used 1 GPU (K40).
@MuggleWang @felixfuu Hi, I think I have found the problem; it was a data issue. Thank you very much! Best wishes.
Hi guys, I have the same question. I got 97.08% accuracy on the LFW dataset (just without the 'RandomHorizontalFlip' augmentation during training on the preprocessed CASIA-WebFace dataset). Is this result reasonable?
@MuggleWang @Hedlen "D:\Program Files\Anaconda\envs\pytorch\python.exe" F:/Python_code/CosFace/CosFace_pytorch-master/main.py
Namespace(batch_size=512, classifier_type='MCP', cuda=False, database='LFW', epochs=30, is_gray=False, log_interval=100, lr=0.1, momentum=0.9, network='sphere20', no_cuda=False, num_class=10752, root_path='', save_path='checkpoint/', step_size=[16000, 24000], train_list='dataset/cleaned_list.txt', weight_decay=0.0005, workers=4)
DataParallel(
(module): sphere(
(layer1): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(1): PReLU(num_parameters=64)
(2): Block(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=64)
)
)
(layer2): Sequential(
(0): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(1): PReLU(num_parameters=128)
(2): Block(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=128)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=128)
)
(3): Block(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=128)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=128)
)
)
(layer3): Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(1): PReLU(num_parameters=256)
(2): Block(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=256)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=256)
)
(3): Block(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=256)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=256)
)
(4): Block(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=256)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=256)
)
(5): Block(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=256)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=256)
)
)
(layer4): Sequential(
(0): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(1): PReLU(num_parameters=512)
(2): Block(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu1): PReLU(num_parameters=512)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(prelu2): PReLU(num_parameters=512)
)
)
(fc): Linear(in_features=21504, out_features=512, bias=True)
)
)
length of train Database: 455594
Number of Identities: 10752
2020-08-26 08:57:13 Epoch 1 start training
Cannot load image 0797464\115.jpg
Cannot load image 1775404\003.jpg
Cannot load image 1259832\027.jpg
Cannot load image 0663469\063.jpg
Cannot load image 0001082\100.jpg
Cannot load image 1635108\036.jpg
Cannot load image 1975142\007.jpg
Cannot load image 0445903\002.jpg
Cannot load image 0037118\013.jpg
Traceback (most recent call last):
File "F:/Python_code/CosFace/CosFace_pytorch-master/main.py", line 248, in
Process finished with exit code 1
Hello, I want to ask how to solve this problem. And could you share the relevant image data and txt file with me?
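"Cannot load image" usually means the training list references files that are missing or unreadable on disk. A hedged sketch that filters the list down to entries whose files actually exist (the list format "<relative path> <label>" and directory layout are assumptions):

```python
import os

def filter_readable(list_path, image_root, out_path):
    """Write a cleaned copy of the training list containing only entries
    whose image file exists under image_root; return (kept, dropped)."""
    kept_lines, dropped = [], 0
    with open(list_path) as f:
        for line in f:
            parts = line.split()
            if parts and os.path.isfile(os.path.join(image_root, parts[0])):
                kept_lines.append(line)
            else:
                dropped += 1
    with open(out_path, "w") as f:
        f.writelines(kept_lines)
    return len(kept_lines), dropped
```

Remember to update `--num_class` afterwards if whole identities are dropped.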
I'm very confused: I used the same code but can't reproduce the result. My PyTorch version is 0.4. Can you share your training tricks? Best wishes!