[Closed] Viozer closed this issue 4 years ago
Hi @Viozer, thanks for your interest.
I think you are using the wrong batch size, which would affect Batch Norm. Right now you are using 5 GPUs by specifying --device 0 1 2 3 4, which gives 64/5 ≈ 13 samples per GPU worker during the forward pass. This is not recommended; I would encourage using at least 16 samples per worker during the forward pass to keep Batch Norm stable (see also https://arxiv.org/pdf/1803.08494.pdf). So that would be --device 0 1 2 3 --batch-size 64 --forward-batch-size 64 --base-lr 0.1, as specified in the README.md.
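To make the arithmetic concrete, here is a minimal sketch (not part of MS-G3D) of why the per-GPU count matters: torch.nn.DataParallel splits each batch across the visible devices, so the effective Batch Norm batch is the per-GPU slice, not the full --forward-batch-size.

```python
# Illustrative sketch, not repo code: estimate the per-GPU slice that
# torch.nn.DataParallel feeds each replica during the forward pass.
def per_gpu_samples(forward_batch_size: int, num_gpus: int) -> float:
    return forward_batch_size / num_gpus

for num_gpus in (5, 4):
    per_worker = per_gpu_samples(64, num_gpus)
    # The Group Norm paper (arXiv:1803.08494) reports that BN statistics
    # degrade noticeably below roughly 16 samples per worker.
    verdict = "OK" if per_worker >= 16 else "risky for BN statistics"
    print(f"{num_gpus} GPUs -> {per_worker:.1f} samples/worker ({verdict})")
```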
Oh, I'm sorry, I mistyped the device arguments; I actually use --device 0 1 2 3. But I still cannot reproduce the result.
Apart from the batch size, your training command looks fine. It would definitely help if you post all training configurations (top of the generated log file for each run). Meanwhile, I would suggest running some checks on your data as well.
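For instance, a quick sanity check on the generated files could look like the sketch below. The paths come from the config that follows, and the (N, C, T, V, M) layout and the (sample_names, labels) pickle format are assumptions based on the standard NTU preprocessing:

```python
import pickle
import numpy as np

# Hedged sanity check; paths match the train_joint.yaml config, and the
# (N, C=3, T=300, V=25 joints, M=2 persons) layout is assumed from the
# usual NTU data preparation scripts.
data = np.load('./data/ntu120/xsub/train_data_joint.npy', mmap_mode='r')
with open('./data/ntu120/xsub/train_label.pkl', 'rb') as f:
    sample_names, labels = pickle.load(f)

print('data shape:', data.shape)  # expect (N, 3, 300, 25, 2)
assert data.shape[0] == len(labels), 'sample/label count mismatch'
assert not np.isnan(np.asarray(data[:100])).any(), 'NaNs in first 100 samples'
print('label range:', min(labels), 'to', max(labels))  # expect 0 to 119
```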
Thanks for your suggestion. These are my training configurations. Are there any mistakes?
[ Mon Jul 13 20:47:35 2020 ] Model total number of params: 3217695
[ Mon Jul 13 20:47:35 2020 ] Training epoch: 1, LR: 0.1000
[ Mon Jul 13 20:49:02 2020 ] Model total number of params: 3217695
[ Mon Jul 13 20:49:02 2020 ] ***
[ Mon Jul 13 20:49:02 2020 ] * Using Half Precision Training *
[ Mon Jul 13 20:49:02 2020 ] ***
[ Mon Jul 13 20:49:03 2020 ] 4 GPUs available, using DataParallel
[ Mon Jul 13 20:49:03 2020 ] Parameters: {'amp_opt_level': 1, 'assume_yes': False, 'base_lr': 0.1, 'batch_size': 64, 'checkpoint': None, 'config': 'config/nturgbd120-cross-subject/train_joint.yaml', 'debug': False, 'device': [0, 1, 2, 3], 'eval_interval': 1, 'eval_start': 1, 'feeder': 'feeders.feeder.Feeder', 'forward_batch_size': 64, 'half': True, 'ignore_weights': [], 'log_interval': 100, 'model': 'model.msg3d.Model', 'model_args': {'graph': 'graph.ntu_rgb_d.AdjMatrixGraph', 'num_class': 120, 'num_g3d_scales': 6, 'num_gcn_scales': 13, 'num_person': 2, 'num_point': 25}, 'model_saved_name': '', 'nesterov': True, 'num_epoch': 60, 'num_worker': 32, 'optimizer': 'SGD', 'optimizer_states': None, 'phase': 'train', 'print_log': True, 'save_interval': 1, 'save_score': False, 'seed': 153, 'show_topk': [1, 5], 'start_epoch': 0, 'step': [30, 50], 'test_batch_size': 32, 'test_feeder_args': {'data_path': './data/ntu120/xsub/val_data_joint.npy', 'label_path': './data/ntu120/xsub/val_label.pkl'}, 'train_feeder_args': {'data_path': './data/ntu120/xsub/train_data_joint.npy', 'debug': False, 'label_path': './data/ntu120/xsub/train_label.pkl', 'normalization': False, 'random_choose': False, 'random_move': False, 'random_shift': False, 'window_size': -1}, 'weight_decay': 0.0005, 'weights': None, 'work_dir': 'work_dir/ntu120/msg3d_bs64/cs'}
Hi @Viozer, the training settings seem fine. I just want to give a kind reminder that the results on NTU RGB+D 120 (Table 1 in the paper) are from a joint-bone two-stream ensemble: I have about 83% on the joint stream and 85.6% on the bone stream, giving the final 86.9% on NTU 120 X-Sub (and on X-Set, 84.4% Joint + 87.3% Bone = 88.4% Ensemble). I would say getting 82.22% on the joint stream seems like a possible fluctuation.
I would suggest training the bone stream and getting the final ensemble result first, then seeing how it compares to the reported number. Apart from that, maybe try re-running the joint stream and see whether you get better performance. However, if 82.22% is the joint-bone fusion result on NTU 120, then there might be something terribly wrong :)
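For reference, the two-stream number comes from score-level fusion of the joint and bone runs. A minimal sketch of that step, assuming each run saved its test scores with --save-score as a pickled mapping from sample name to class scores (the format ensemble.py consumes; the file paths here are illustrative), could be:

```python
import pickle
import numpy as np

# Hedged sketch of joint-bone fusion; the {sample_name: class_scores}
# pickle format is assumed from the --save-score output.
with open('work_dir/joint/epoch1_test_score.pkl', 'rb') as f:
    joint_scores = pickle.load(f)
with open('work_dir/bone/epoch1_test_score.pkl', 'rb') as f:
    bone_scores = pickle.load(f)
with open('./data/ntu120/xsub/val_label.pkl', 'rb') as f:
    sample_names, labels = pickle.load(f)

correct = 0
for name, label in zip(sample_names, labels):
    # Equal-weight sum of the two streams' class scores.
    fused = np.array(joint_scores[name]) + np.array(bone_scores[name])
    correct += int(fused.argmax() == int(label))
print(f'Two-stream Top-1: {correct / len(labels):.2%}')
```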
Thanks! The result I got is on the joint stream only.
@Viozer Hi, I got 82.37% on NTU120 X-Sub, joint stream. Can you reproduce the result now?
Hi, the accuracy on NTU120 X-Sub reported in the paper is the ensemble result of the joint and bone streams. I think it's normal to get 82.37% with the joint stream alone.
@Viozer After fusion, my result is 86.10%. Can you get 86.90%?
I'm sorry, I haven't tested the ensemble result. You may ask the authors for help.
@Viozer thanks for your reply~
Hi @qiexing,
Not sure if this can help you, but the following is my training log for NTU 120 X-Sub, so you can compare the training settings. IIRC this is the log of the run that produced our pretrained model, but note that the codebase changed between that run and release (e.g. I added --amp-opt-level afterwards; this run uses O1, see the sketch after the log).
[ Thu Oct 17 09:11:51 2019 ] Model total number of params: 3217695
[ Thu Oct 17 09:11:51 2019 ] Using Half Precision Training
[ Thu Oct 17 09:11:51 2019 ] 4 GPUs available, using DataParallel
[ Thu Oct 17 09:11:51 2019 ] Parameters: {'assume_yes': False, 'base_lr': 0.1, 'batch_size': 64, 'config': './config/nturgbd120-cross-subject/train_joint.yaml', 'debug': False, 'device': [0, 1, 2, 3], 'eval_interval': 1, 'feeder': 'feeders.feeder.Feeder', 'forward_batch_size': 64, 'half': True, 'ignore_weights': [], 'log_interval': 100, 'model': 'model.agcn.Model', 'model_args': {'graph': 'graph.ntu_rgb_d.AdjMatrixGraph', 'num_class': 120, 'num_person': 2, 'num_point': 25}, 'model_saved_name': './runs/90-85b-ntu120-xsub', 'nesterov': True, 'num_epoch': 60, 'num_worker': 16, 'optimizer': 'SGD', 'optimizer_states': None, 'phase': 'train', 'print_log': True, 'save_interval': 1, 'save_score': False, 'seed': 90, 'show_topk': [1, 5], 'start_epoch': 0, 'step': [30, 50], 'test_batch_size': 128, 'test_feeder_args': {'data_path': './data/ntu120/xsub/val_data_joint.npy', 'label_path': './data/ntu120/xsub/val_label.pkl'}, 'train_feeder_args': {'data_path': './data/ntu120/xsub/train_data_joint.npy', 'debug': False, 'label_path': './data/ntu120/xsub/train_label.pkl', 'normalization': False, 'random_choose': False, 'random_move': False, 'random_shift': False, 'window_size': -1}, 'weight_decay': 0.0005, 'weights': None, 'work_dir': '90-85b-ntu120-xsub'}
[ Thu Oct 17 09:11:51 2019 ] Model total number of params: 3217695
[ Thu Oct 17 09:11:51 2019 ] Training epoch: 1, LR: 0.1000
[ Thu Oct 17 09:33:04 2019 ] Mean training loss: 3.2698 (BS 32: 3.2698).
[ Thu Oct 17 09:33:04 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 09:33:04 2019 ] Eval epoch: 1
[ Thu Oct 17 09:36:27 2019 ] Mean test loss of 398 batches: 2.5316004121123847.
[ Thu Oct 17 09:36:28 2019 ] Top1: 32.72%
[ Thu Oct 17 09:36:28 2019 ] Top5: 67.80%
[ Thu Oct 17 09:36:28 2019 ] Training epoch: 2, LR: 0.1000
[ Thu Oct 17 09:57:31 2019 ] Mean training loss: 1.9523 (BS 32: 1.9523).
[ Thu Oct 17 09:57:31 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 09:57:31 2019 ] Eval epoch: 2
[ Thu Oct 17 10:00:48 2019 ] Mean test loss of 398 batches: 1.931113257779548.
[ Thu Oct 17 10:00:49 2019 ] Top1: 46.39%
[ Thu Oct 17 10:00:49 2019 ] Top5: 78.46%
[ Thu Oct 17 10:00:49 2019 ] Training epoch: 3, LR: 0.1000
[ Thu Oct 17 10:21:51 2019 ] Mean training loss: 1.5125 (BS 32: 1.5125).
[ Thu Oct 17 10:21:51 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 10:21:51 2019 ] Eval epoch: 3
[ Thu Oct 17 10:25:07 2019 ] Mean test loss of 398 batches: 1.6146803146331155.
[ Thu Oct 17 10:25:07 2019 ] Top1: 53.84%
[ Thu Oct 17 10:25:08 2019 ] Top5: 83.41%
[ Thu Oct 17 10:25:08 2019 ] Training epoch: 4, LR: 0.1000
[ Thu Oct 17 10:46:07 2019 ] Mean training loss: 1.2792 (BS 32: 1.2792).
[ Thu Oct 17 10:46:07 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 10:46:07 2019 ] Eval epoch: 4
[ Thu Oct 17 10:49:23 2019 ] Mean test loss of 398 batches: 1.3674843446094187.
[ Thu Oct 17 10:49:23 2019 ] Top1: 60.06%
[ Thu Oct 17 10:49:24 2019 ] Top5: 88.03%
[ Thu Oct 17 10:49:24 2019 ] Training epoch: 5, LR: 0.1000
[ Thu Oct 17 11:10:24 2019 ] Mean training loss: 1.1524 (BS 32: 1.1524).
[ Thu Oct 17 11:10:24 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 11:10:24 2019 ] Eval epoch: 5
[ Thu Oct 17 11:13:39 2019 ] Mean test loss of 398 batches: 1.4367618885771114.
[ Thu Oct 17 11:13:40 2019 ] Top1: 58.66%
[ Thu Oct 17 11:13:40 2019 ] Top5: 86.71%
[ Thu Oct 17 11:13:40 2019 ] Training epoch: 6, LR: 0.1000
[ Thu Oct 17 11:34:39 2019 ] Mean training loss: 1.0723 (BS 32: 1.0723).
[ Thu Oct 17 11:34:39 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 11:34:39 2019 ] Eval epoch: 6
[ Thu Oct 17 11:37:55 2019 ] Mean test loss of 398 batches: 1.3426766256291662.
[ Thu Oct 17 11:37:55 2019 ] Top1: 62.10%
[ Thu Oct 17 11:37:56 2019 ] Top5: 88.76%
[ Thu Oct 17 11:37:56 2019 ] Training epoch: 7, LR: 0.1000
[ Thu Oct 17 11:58:56 2019 ] Mean training loss: 1.0105 (BS 32: 1.0105).
[ Thu Oct 17 11:58:56 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 11:58:56 2019 ] Eval epoch: 7
[ Thu Oct 17 12:02:12 2019 ] Mean test loss of 398 batches: 1.2452840423164655.
[ Thu Oct 17 12:02:12 2019 ] Top1: 64.56%
[ Thu Oct 17 12:02:12 2019 ] Top5: 89.63%
[ Thu Oct 17 12:02:12 2019 ] Training epoch: 8, LR: 0.1000
[ Thu Oct 17 12:23:12 2019 ] Mean training loss: 0.9664 (BS 32: 0.9664).
[ Thu Oct 17 12:23:12 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 12:23:13 2019 ] Eval epoch: 8
[ Thu Oct 17 12:26:28 2019 ] Mean test loss of 398 batches: 1.0543443608523613.
[ Thu Oct 17 12:26:28 2019 ] Top1: 69.41%
[ Thu Oct 17 12:26:29 2019 ] Top5: 91.71%
[ Thu Oct 17 12:26:29 2019 ] Training epoch: 9, LR: 0.1000
[ Thu Oct 17 12:47:29 2019 ] Mean training loss: 0.9308 (BS 32: 0.9308).
[ Thu Oct 17 12:47:29 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 12:47:29 2019 ] Eval epoch: 9
[ Thu Oct 17 12:50:45 2019 ] Mean test loss of 398 batches: 1.1722858247145935.
[ Thu Oct 17 12:50:46 2019 ] Top1: 66.36%
[ Thu Oct 17 12:50:47 2019 ] Top5: 90.37%
[ Thu Oct 17 12:50:47 2019 ] Training epoch: 10, LR: 0.1000
[ Thu Oct 17 13:11:47 2019 ] Mean training loss: 0.9053 (BS 32: 0.9053).
[ Thu Oct 17 13:11:47 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 13:11:47 2019 ] Eval epoch: 10
[ Thu Oct 17 13:15:02 2019 ] Mean test loss of 398 batches: 1.3315673215904427.
[ Thu Oct 17 13:15:03 2019 ] Top1: 63.23%
[ Thu Oct 17 13:15:03 2019 ] Top5: 89.49%
[ Thu Oct 17 13:15:03 2019 ] Training epoch: 11, LR: 0.1000
[ Thu Oct 17 13:36:04 2019 ] Mean training loss: 0.8861 (BS 32: 0.8861).
[ Thu Oct 17 13:36:04 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 13:36:04 2019 ] Eval epoch: 11
[ Thu Oct 17 13:39:19 2019 ] Mean test loss of 398 batches: 1.5431065204455026.
[ Thu Oct 17 13:39:20 2019 ] Top1: 59.46%
[ Thu Oct 17 13:39:20 2019 ] Top5: 85.29%
[ Thu Oct 17 13:39:20 2019 ] Training epoch: 12, LR: 0.1000
[ Thu Oct 17 14:00:21 2019 ] Mean training loss: 0.8647 (BS 32: 0.8647).
[ Thu Oct 17 14:00:21 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 14:00:21 2019 ] Eval epoch: 12
[ Thu Oct 17 14:03:36 2019 ] Mean test loss of 398 batches: 1.6950628749988785.
[ Thu Oct 17 14:03:36 2019 ] Top1: 60.14%
[ Thu Oct 17 14:03:37 2019 ] Top5: 83.38%
[ Thu Oct 17 14:03:37 2019 ] Training epoch: 13, LR: 0.1000
[ Thu Oct 17 14:24:37 2019 ] Mean training loss: 0.8546 (BS 32: 0.8546).
[ Thu Oct 17 14:24:37 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 14:24:37 2019 ] Eval epoch: 13
[ Thu Oct 17 14:27:53 2019 ] Mean test loss of 398 batches: 1.1902299640915501.
[ Thu Oct 17 14:27:53 2019 ] Top1: 65.97%
[ Thu Oct 17 14:27:53 2019 ] Top5: 90.06%
[ Thu Oct 17 14:27:53 2019 ] Training epoch: 14, LR: 0.1000
[ Thu Oct 17 14:48:53 2019 ] Mean training loss: 0.8351 (BS 32: 0.8351).
[ Thu Oct 17 14:48:53 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 14:48:53 2019 ] Eval epoch: 14
[ Thu Oct 17 14:52:09 2019 ] Mean test loss of 398 batches: 1.1338547977370832.
[ Thu Oct 17 14:52:09 2019 ] Top1: 68.03%
[ Thu Oct 17 14:52:09 2019 ] Top5: 90.78%
[ Thu Oct 17 14:52:09 2019 ] Training epoch: 15, LR: 0.1000
[ Thu Oct 17 15:13:09 2019 ] Mean training loss: 0.8155 (BS 32: 0.8155).
[ Thu Oct 17 15:13:09 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 15:13:10 2019 ] Eval epoch: 15
[ Thu Oct 17 15:16:25 2019 ] Mean test loss of 398 batches: 1.2515172279959348.
[ Thu Oct 17 15:16:25 2019 ] Top1: 64.99%
[ Thu Oct 17 15:16:26 2019 ] Top5: 88.77%
[ Thu Oct 17 15:16:26 2019 ] Training epoch: 16, LR: 0.1000
[ Thu Oct 17 15:37:26 2019 ] Mean training loss: 0.8065 (BS 32: 0.8065).
[ Thu Oct 17 15:37:26 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 15:37:26 2019 ] Eval epoch: 16
[ Thu Oct 17 15:40:41 2019 ] Mean test loss of 398 batches: 1.1653472116245098.
[ Thu Oct 17 15:40:42 2019 ] Top1: 67.53%
[ Thu Oct 17 15:40:42 2019 ] Top5: 90.72%
[ Thu Oct 17 15:40:42 2019 ] Training epoch: 17, LR: 0.1000
[ Thu Oct 17 16:01:40 2019 ] Mean training loss: 0.7954 (BS 32: 0.7954).
[ Thu Oct 17 16:01:40 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 16:01:41 2019 ] Eval epoch: 17
[ Thu Oct 17 16:04:56 2019 ] Mean test loss of 398 batches: 1.1456992401549564.
[ Thu Oct 17 16:04:56 2019 ] Top1: 67.68%
[ Thu Oct 17 16:04:57 2019 ] Top5: 90.87%
[ Thu Oct 17 16:04:57 2019 ] Training epoch: 18, LR: 0.1000
[ Thu Oct 17 16:25:56 2019 ] Mean training loss: 0.7886 (BS 32: 0.7886).
[ Thu Oct 17 16:25:56 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 16:25:56 2019 ] Eval epoch: 18
[ Thu Oct 17 16:29:11 2019 ] Mean test loss of 398 batches: 1.3667717808155557.
[ Thu Oct 17 16:29:11 2019 ] Top1: 64.85%
[ Thu Oct 17 16:29:12 2019 ] Top5: 88.10%
[ Thu Oct 17 16:29:12 2019 ] Training epoch: 19, LR: 0.1000
[ Thu Oct 17 16:50:11 2019 ] Mean training loss: 0.7821 (BS 32: 0.7821).
[ Thu Oct 17 16:50:11 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 16:50:11 2019 ] Eval epoch: 19
[ Thu Oct 17 16:53:27 2019 ] Mean test loss of 398 batches: 1.0189512773374816.
[ Thu Oct 17 16:53:27 2019 ] Top1: 70.57%
[ Thu Oct 17 16:53:27 2019 ] Top5: 92.09%
[ Thu Oct 17 16:53:27 2019 ] Training epoch: 20, LR: 0.1000
[ Thu Oct 17 17:14:26 2019 ] Mean training loss: 0.7757 (BS 32: 0.7757).
[ Thu Oct 17 17:14:26 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 17:14:26 2019 ] Eval epoch: 20
[ Thu Oct 17 17:17:41 2019 ] Mean test loss of 398 batches: 1.093704903545092.
[ Thu Oct 17 17:17:41 2019 ] Top1: 69.35%
[ Thu Oct 17 17:17:42 2019 ] Top5: 90.81%
[ Thu Oct 17 17:17:42 2019 ] Training epoch: 21, LR: 0.1000
[ Thu Oct 17 17:38:40 2019 ] Mean training loss: 0.7621 (BS 32: 0.7621).
[ Thu Oct 17 17:38:40 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 17:38:41 2019 ] Eval epoch: 21
[ Thu Oct 17 17:41:57 2019 ] Mean test loss of 398 batches: 1.268926610898732.
[ Thu Oct 17 17:41:57 2019 ] Top1: 64.81%
[ Thu Oct 17 17:41:58 2019 ] Top5: 89.59%
[ Thu Oct 17 17:41:58 2019 ] Training epoch: 22, LR: 0.1000
[ Thu Oct 17 18:02:57 2019 ] Mean training loss: 0.7612 (BS 32: 0.7612).
[ Thu Oct 17 18:02:57 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 18:02:57 2019 ] Eval epoch: 22
[ Thu Oct 17 18:06:13 2019 ] Mean test loss of 398 batches: 1.0891967403828797.
[ Thu Oct 17 18:06:13 2019 ] Top1: 69.27%
[ Thu Oct 17 18:06:14 2019 ] Top5: 91.22%
[ Thu Oct 17 18:06:14 2019 ] Training epoch: 23, LR: 0.1000
[ Thu Oct 17 18:27:12 2019 ] Mean training loss: 0.7544 (BS 32: 0.7544).
[ Thu Oct 17 18:27:12 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 18:27:12 2019 ] Eval epoch: 23
[ Thu Oct 17 18:30:27 2019 ] Mean test loss of 398 batches: 1.0890155813502307.
[ Thu Oct 17 18:30:28 2019 ] Top1: 68.71%
[ Thu Oct 17 18:30:28 2019 ] Top5: 91.64%
[ Thu Oct 17 18:30:28 2019 ] Training epoch: 24, LR: 0.1000
[ Thu Oct 17 18:51:26 2019 ] Mean training loss: 0.7493 (BS 32: 0.7493).
[ Thu Oct 17 18:51:26 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 18:51:26 2019 ] Eval epoch: 24
[ Thu Oct 17 18:54:42 2019 ] Mean test loss of 398 batches: 1.1177235975037867.
[ Thu Oct 17 18:54:42 2019 ] Top1: 68.24%
[ Thu Oct 17 18:54:42 2019 ] Top5: 90.83%
[ Thu Oct 17 18:54:42 2019 ] Training epoch: 25, LR: 0.1000
[ Thu Oct 17 19:15:40 2019 ] Mean training loss: 0.7446 (BS 32: 0.7446).
[ Thu Oct 17 19:15:40 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 19:15:41 2019 ] Eval epoch: 25
[ Thu Oct 17 19:18:56 2019 ] Mean test loss of 398 batches: 1.0961671795078258.
[ Thu Oct 17 19:18:56 2019 ] Top1: 69.15%
[ Thu Oct 17 19:18:57 2019 ] Top5: 91.69%
[ Thu Oct 17 19:18:57 2019 ] Training epoch: 26, LR: 0.1000
[ Thu Oct 17 19:39:54 2019 ] Mean training loss: 0.7408 (BS 32: 0.7408).
[ Thu Oct 17 19:39:54 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 19:39:54 2019 ] Eval epoch: 26
[ Thu Oct 17 19:43:10 2019 ] Mean test loss of 398 batches: 0.9536735655075341.
[ Thu Oct 17 19:43:10 2019 ] Top1: 72.01%
[ Thu Oct 17 19:43:11 2019 ] Top5: 92.94%
[ Thu Oct 17 19:43:11 2019 ] Training epoch: 27, LR: 0.1000
[ Thu Oct 17 20:04:09 2019 ] Mean training loss: 0.7362 (BS 32: 0.7362).
[ Thu Oct 17 20:04:09 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 20:04:09 2019 ] Eval epoch: 27
[ Thu Oct 17 20:07:24 2019 ] Mean test loss of 398 batches: 0.9769743888372153.
[ Thu Oct 17 20:07:24 2019 ] Top1: 71.85%
[ Thu Oct 17 20:07:25 2019 ] Top5: 92.42%
[ Thu Oct 17 20:07:25 2019 ] Training epoch: 28, LR: 0.1000
[ Thu Oct 17 20:28:22 2019 ] Mean training loss: 0.7335 (BS 32: 0.7335).
[ Thu Oct 17 20:28:22 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 20:28:22 2019 ] Eval epoch: 28
[ Thu Oct 17 20:31:37 2019 ] Mean test loss of 398 batches: 1.0499065840364101.
[ Thu Oct 17 20:31:38 2019 ] Top1: 69.17%
[ Thu Oct 17 20:31:38 2019 ] Top5: 91.88%
[ Thu Oct 17 20:31:38 2019 ] Training epoch: 29, LR: 0.1000
[ Thu Oct 17 20:52:35 2019 ] Mean training loss: 0.7300 (BS 32: 0.7300).
[ Thu Oct 17 20:52:35 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 20:52:35 2019 ] Eval epoch: 29
[ Thu Oct 17 20:55:51 2019 ] Mean test loss of 398 batches: 1.0555783161266366.
[ Thu Oct 17 20:55:51 2019 ] Top1: 69.47%
[ Thu Oct 17 20:55:51 2019 ] Top5: 91.72%
[ Thu Oct 17 20:55:51 2019 ] Training epoch: 30, LR: 0.1000
[ Thu Oct 17 21:16:48 2019 ] Mean training loss: 0.7284 (BS 32: 0.7284).
[ Thu Oct 17 21:16:48 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 21:16:49 2019 ] Eval epoch: 30
[ Thu Oct 17 21:20:04 2019 ] Mean test loss of 398 batches: 1.0128039615837174.
[ Thu Oct 17 21:20:04 2019 ] Top1: 70.89%
[ Thu Oct 17 21:20:04 2019 ] Top5: 92.12%
[ Thu Oct 17 21:20:04 2019 ] Training epoch: 31, LR: 0.0100
[ Thu Oct 17 21:41:02 2019 ] Mean training loss: 0.4083 (BS 32: 0.4083).
[ Thu Oct 17 21:41:02 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 21:41:02 2019 ] Eval epoch: 31
[ Thu Oct 17 21:44:17 2019 ] Mean test loss of 398 batches: 0.6125229695184746.
[ Thu Oct 17 21:44:17 2019 ] Top1: 81.57%
[ Thu Oct 17 21:44:18 2019 ] Top5: 96.09%
[ Thu Oct 17 21:44:18 2019 ] Training epoch: 32, LR: 0.0100
[ Thu Oct 17 22:05:16 2019 ] Mean training loss: 0.3216 (BS 32: 0.3216).
[ Thu Oct 17 22:05:16 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 22:05:16 2019 ] Eval epoch: 32
[ Thu Oct 17 22:08:31 2019 ] Mean test loss of 398 batches: 0.6043433589701677.
[ Thu Oct 17 22:08:32 2019 ] Top1: 81.98%
[ Thu Oct 17 22:08:32 2019 ] Top5: 96.19%
[ Thu Oct 17 22:08:32 2019 ] Training epoch: 33, LR: 0.0100
[ Thu Oct 17 22:29:30 2019 ] Mean training loss: 0.2786 (BS 32: 0.2786).
[ Thu Oct 17 22:29:30 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 22:29:30 2019 ] Eval epoch: 33
[ Thu Oct 17 22:32:46 2019 ] Mean test loss of 398 batches: 0.592108144532496.
[ Thu Oct 17 22:32:46 2019 ] Top1: 82.61%
[ Thu Oct 17 22:32:46 2019 ] Top5: 96.32%
[ Thu Oct 17 22:32:46 2019 ] Training epoch: 34, LR: 0.0100
[ Thu Oct 17 22:53:45 2019 ] Mean training loss: 0.2505 (BS 32: 0.2505).
[ Thu Oct 17 22:53:45 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 22:53:45 2019 ] Eval epoch: 34
[ Thu Oct 17 22:57:00 2019 ] Mean test loss of 398 batches: 0.6171068010018699.
[ Thu Oct 17 22:57:01 2019 ] Top1: 82.15%
[ Thu Oct 17 22:57:01 2019 ] Top5: 96.05%
[ Thu Oct 17 22:57:01 2019 ] Training epoch: 35, LR: 0.0100
[ Thu Oct 17 23:18:00 2019 ] Mean training loss: 0.2270 (BS 32: 0.2270).
[ Thu Oct 17 23:18:00 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 23:18:00 2019 ] Eval epoch: 35
[ Thu Oct 17 23:21:15 2019 ] Mean test loss of 398 batches: 0.6082726146408062.
[ Thu Oct 17 23:21:15 2019 ] Top1: 82.55%
[ Thu Oct 17 23:21:16 2019 ] Top5: 96.37%
[ Thu Oct 17 23:21:16 2019 ] Training epoch: 36, LR: 0.0100
[ Thu Oct 17 23:42:14 2019 ] Mean training loss: 0.2055 (BS 32: 0.2055).
[ Thu Oct 17 23:42:14 2019 ] Time consumption: [Data]00%, [Network]97%
[ Thu Oct 17 23:42:14 2019 ] Eval epoch: 36
[ Thu Oct 17 23:45:29 2019 ] Mean test loss of 398 batches: 0.6901925380655269.
[ Thu Oct 17 23:45:30 2019 ] Top1: 81.43%
[ Thu Oct 17 23:45:30 2019 ] Top5: 95.57%
[ Thu Oct 17 23:45:30 2019 ] Training epoch: 37, LR: 0.0100
[ Fri Oct 18 00:06:29 2019 ] Mean training loss: 0.1906 (BS 32: 0.1906).
[ Fri Oct 18 00:06:29 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 00:06:29 2019 ] Eval epoch: 37
[ Fri Oct 18 00:09:44 2019 ] Mean test loss of 398 batches: 0.8876037868722599.
[ Fri Oct 18 00:09:44 2019 ] Top1: 77.36%
[ Fri Oct 18 00:09:45 2019 ] Top5: 93.41%
[ Fri Oct 18 00:09:45 2019 ] Training epoch: 38, LR: 0.0100
[ Fri Oct 18 00:30:43 2019 ] Mean training loss: 0.1726 (BS 32: 0.1726).
[ Fri Oct 18 00:30:43 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 00:30:43 2019 ] Eval epoch: 38
[ Fri Oct 18 00:33:59 2019 ] Mean test loss of 398 batches: 0.6472045184679367.
[ Fri Oct 18 00:33:59 2019 ] Top1: 81.82%
[ Fri Oct 18 00:33:59 2019 ] Top5: 96.09%
[ Fri Oct 18 00:33:59 2019 ] Training epoch: 39, LR: 0.0100
[ Fri Oct 18 00:54:57 2019 ] Mean training loss: 0.1614 (BS 32: 0.1614).
[ Fri Oct 18 00:54:57 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 00:54:57 2019 ] Eval epoch: 39
[ Fri Oct 18 00:58:12 2019 ] Mean test loss of 398 batches: 0.6991648062988741.
[ Fri Oct 18 00:58:13 2019 ] Top1: 81.27%
[ Fri Oct 18 00:58:13 2019 ] Top5: 95.68%
[ Fri Oct 18 00:58:13 2019 ] Training epoch: 40, LR: 0.0100
[ Fri Oct 18 01:19:11 2019 ] Mean training loss: 0.1547 (BS 32: 0.1547).
[ Fri Oct 18 01:19:11 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 01:19:11 2019 ] Eval epoch: 40
[ Fri Oct 18 01:22:26 2019 ] Mean test loss of 398 batches: 0.8470202943487982.
[ Fri Oct 18 01:22:27 2019 ] Top1: 78.48%
[ Fri Oct 18 01:22:27 2019 ] Top5: 94.25%
[ Fri Oct 18 01:22:27 2019 ] Training epoch: 41, LR: 0.0100
[ Fri Oct 18 01:43:25 2019 ] Mean training loss: 0.1477 (BS 32: 0.1477).
[ Fri Oct 18 01:43:25 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 01:43:25 2019 ] Eval epoch: 41
[ Fri Oct 18 01:46:40 2019 ] Mean test loss of 398 batches: 0.7019791379945362.
[ Fri Oct 18 01:46:40 2019 ] Top1: 81.47%
[ Fri Oct 18 01:46:41 2019 ] Top5: 95.63%
[ Fri Oct 18 01:46:41 2019 ] Training epoch: 42, LR: 0.0100
[ Fri Oct 18 02:07:39 2019 ] Mean training loss: 0.1414 (BS 32: 0.1414).
[ Fri Oct 18 02:07:39 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 02:07:39 2019 ] Eval epoch: 42
[ Fri Oct 18 02:10:54 2019 ] Mean test loss of 398 batches: 0.7431180766928736.
[ Fri Oct 18 02:10:54 2019 ] Top1: 80.58%
[ Fri Oct 18 02:10:55 2019 ] Top5: 95.53%
[ Fri Oct 18 02:10:55 2019 ] Training epoch: 43, LR: 0.0100
[ Fri Oct 18 02:31:52 2019 ] Mean training loss: 0.1378 (BS 32: 0.1378).
[ Fri Oct 18 02:31:52 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 02:31:52 2019 ] Eval epoch: 43
[ Fri Oct 18 02:35:07 2019 ] Mean test loss of 398 batches: 0.6990684733618444.
[ Fri Oct 18 02:35:08 2019 ] Top1: 81.29%
[ Fri Oct 18 02:35:08 2019 ] Top5: 95.67%
[ Fri Oct 18 02:35:08 2019 ] Training epoch: 44, LR: 0.0100
[ Fri Oct 18 02:56:06 2019 ] Mean training loss: 0.1355 (BS 32: 0.1355).
[ Fri Oct 18 02:56:06 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 02:56:06 2019 ] Eval epoch: 44
[ Fri Oct 18 02:59:22 2019 ] Mean test loss of 398 batches: 0.7958477632184724.
[ Fri Oct 18 02:59:22 2019 ] Top1: 79.64%
[ Fri Oct 18 02:59:22 2019 ] Top5: 94.66%
[ Fri Oct 18 02:59:22 2019 ] Training epoch: 45, LR: 0.0100
[ Fri Oct 18 03:20:20 2019 ] Mean training loss: 0.1314 (BS 32: 0.1314).
[ Fri Oct 18 03:20:20 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 03:20:20 2019 ] Eval epoch: 45
[ Fri Oct 18 03:23:36 2019 ] Mean test loss of 398 batches: 0.821840450167656.
[ Fri Oct 18 03:23:36 2019 ] Top1: 79.39%
[ Fri Oct 18 03:23:36 2019 ] Top5: 94.68%
[ Fri Oct 18 03:23:36 2019 ] Training epoch: 46, LR: 0.0100
[ Fri Oct 18 03:44:35 2019 ] Mean training loss: 0.1340 (BS 32: 0.1340).
[ Fri Oct 18 03:44:35 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 03:44:35 2019 ] Eval epoch: 46
[ Fri Oct 18 03:47:50 2019 ] Mean test loss of 398 batches: 0.7606141108214556.
[ Fri Oct 18 03:47:50 2019 ] Top1: 80.34%
[ Fri Oct 18 03:47:50 2019 ] Top5: 95.32%
[ Fri Oct 18 03:47:50 2019 ] Training epoch: 47, LR: 0.0100
[ Fri Oct 18 04:08:48 2019 ] Mean training loss: 0.1253 (BS 32: 0.1253).
[ Fri Oct 18 04:08:48 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 04:08:48 2019 ] Eval epoch: 47
[ Fri Oct 18 04:12:03 2019 ] Mean test loss of 398 batches: 0.7522971067746081.
[ Fri Oct 18 04:12:03 2019 ] Top1: 80.46%
[ Fri Oct 18 04:12:04 2019 ] Top5: 95.28%
[ Fri Oct 18 04:12:04 2019 ] Training epoch: 48, LR: 0.0100
[ Fri Oct 18 04:33:02 2019 ] Mean training loss: 0.1285 (BS 32: 0.1285).
[ Fri Oct 18 04:33:02 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 04:33:02 2019 ] Eval epoch: 48
[ Fri Oct 18 04:36:17 2019 ] Mean test loss of 398 batches: 0.7942492714778862.
[ Fri Oct 18 04:36:18 2019 ] Top1: 79.72%
[ Fri Oct 18 04:36:18 2019 ] Top5: 95.06%
[ Fri Oct 18 04:36:18 2019 ] Training epoch: 49, LR: 0.0100
[ Fri Oct 18 04:57:17 2019 ] Mean training loss: 0.1271 (BS 32: 0.1271).
[ Fri Oct 18 04:57:17 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 04:57:17 2019 ] Eval epoch: 49
[ Fri Oct 18 05:00:32 2019 ] Mean test loss of 398 batches: 0.9735610334567688.
[ Fri Oct 18 05:00:33 2019 ] Top1: 76.89%
[ Fri Oct 18 05:00:33 2019 ] Top5: 92.87%
[ Fri Oct 18 05:00:33 2019 ] Training epoch: 50, LR: 0.0100
[ Fri Oct 18 05:21:31 2019 ] Mean training loss: 0.1247 (BS 32: 0.1247).
[ Fri Oct 18 05:21:31 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 05:21:31 2019 ] Eval epoch: 50
[ Fri Oct 18 05:24:46 2019 ] Mean test loss of 398 batches: 0.7418166076268383.
[ Fri Oct 18 05:24:46 2019 ] Top1: 80.81%
[ Fri Oct 18 05:24:47 2019 ] Top5: 95.35%
[ Fri Oct 18 05:24:47 2019 ] Training epoch: 51, LR: 0.0010
[ Fri Oct 18 05:45:45 2019 ] Mean training loss: 0.0615 (BS 32: 0.0615).
[ Fri Oct 18 05:45:45 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 05:45:45 2019 ] Eval epoch: 51
[ Fri Oct 18 05:49:00 2019 ] Mean test loss of 398 batches: 0.6678198344593671.
[ Fri Oct 18 05:49:01 2019 ] Top1: 82.82%
[ Fri Oct 18 05:49:01 2019 ] Top5: 96.03%
[ Fri Oct 18 05:49:01 2019 ] Training epoch: 52, LR: 0.0010
[ Fri Oct 18 06:09:59 2019 ] Mean training loss: 0.0430 (BS 32: 0.0430).
[ Fri Oct 18 06:09:59 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 06:09:59 2019 ] Eval epoch: 52
[ Fri Oct 18 06:13:14 2019 ] Mean test loss of 398 batches: 0.6706003864581261.
[ Fri Oct 18 06:13:14 2019 ] Top1: 82.99%
[ Fri Oct 18 06:13:14 2019 ] Top5: 96.05%
[ Fri Oct 18 06:13:14 2019 ] Training epoch: 53, LR: 0.0010
[ Fri Oct 18 06:34:13 2019 ] Mean training loss: 0.0366 (BS 32: 0.0366).
[ Fri Oct 18 06:34:13 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 06:34:13 2019 ] Eval epoch: 53
[ Fri Oct 18 06:37:28 2019 ] Mean test loss of 398 batches: 0.6650067282531729.
[ Fri Oct 18 06:37:28 2019 ] Top1: 83.21%
[ Fri Oct 18 06:37:29 2019 ] Top5: 96.09%
[ Fri Oct 18 06:37:29 2019 ] Training epoch: 54, LR: 0.0010
[ Fri Oct 18 06:58:27 2019 ] Mean training loss: 0.0339 (BS 32: 0.0339).
[ Fri Oct 18 06:58:27 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 06:58:27 2019 ] Eval epoch: 54
[ Fri Oct 18 07:01:42 2019 ] Mean test loss of 398 batches: 0.6645860161314059.
[ Fri Oct 18 07:01:43 2019 ] Top1: 83.32%
[ Fri Oct 18 07:01:43 2019 ] Top5: 96.07%
[ Fri Oct 18 07:01:43 2019 ] Training epoch: 55, LR: 0.0010
[ Fri Oct 18 07:22:41 2019 ] Mean training loss: 0.0315 (BS 32: 0.0315).
[ Fri Oct 18 07:22:41 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 07:22:41 2019 ] Eval epoch: 55
[ Fri Oct 18 07:25:56 2019 ] Mean test loss of 398 batches: 0.673115755407954.
[ Fri Oct 18 07:25:56 2019 ] Top1: 83.06%
[ Fri Oct 18 07:25:57 2019 ] Top5: 96.03%
[ Fri Oct 18 07:25:57 2019 ] Training epoch: 56, LR: 0.0010
[ Fri Oct 18 07:46:55 2019 ] Mean training loss: 0.0298 (BS 32: 0.0298).
[ Fri Oct 18 07:46:55 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 07:46:55 2019 ] Eval epoch: 56
[ Fri Oct 18 07:50:10 2019 ] Mean test loss of 398 batches: 0.6660301350244325.
[ Fri Oct 18 07:50:11 2019 ] Top1: 83.34%
[ Fri Oct 18 07:50:11 2019 ] Top5: 96.06%
[ Fri Oct 18 07:50:11 2019 ] Training epoch: 57, LR: 0.0010
[ Fri Oct 18 08:11:10 2019 ] Mean training loss: 0.0280 (BS 32: 0.0280).
[ Fri Oct 18 08:11:10 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 08:11:10 2019 ] Eval epoch: 57
[ Fri Oct 18 08:14:26 2019 ] Mean test loss of 398 batches: 0.6718065027316013.
[ Fri Oct 18 08:14:26 2019 ] Top1: 83.10%
[ Fri Oct 18 08:14:26 2019 ] Top5: 95.96%
[ Fri Oct 18 08:14:26 2019 ] Training epoch: 58, LR: 0.0010
[ Fri Oct 18 08:35:26 2019 ] Mean training loss: 0.0276 (BS 32: 0.0276).
[ Fri Oct 18 08:35:26 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 08:35:26 2019 ] Eval epoch: 58
[ Fri Oct 18 08:38:41 2019 ] Mean test loss of 398 batches: 0.6752176670423106.
[ Fri Oct 18 08:38:41 2019 ] Top1: 83.14%
[ Fri Oct 18 08:38:42 2019 ] Top5: 95.93%
[ Fri Oct 18 08:38:42 2019 ] Training epoch: 59, LR: 0.0010
[ Fri Oct 18 08:59:40 2019 ] Mean training loss: 0.0266 (BS 32: 0.0266).
[ Fri Oct 18 08:59:40 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 08:59:41 2019 ] Eval epoch: 59
[ Fri Oct 18 09:02:55 2019 ] Mean test loss of 398 batches: 0.6814049008728271.
[ Fri Oct 18 09:02:56 2019 ] Top1: 82.99%
[ Fri Oct 18 09:02:56 2019 ] Top5: 95.98%
[ Fri Oct 18 09:02:56 2019 ] Training epoch: 60, LR: 0.0010
[ Fri Oct 18 09:23:55 2019 ] Mean training loss: 0.0259 (BS 32: 0.0259).
[ Fri Oct 18 09:23:55 2019 ] Time consumption: [Data]00%, [Network]97%
[ Fri Oct 18 09:23:55 2019 ] Eval epoch: 60
[ Fri Oct 18 09:27:10 2019 ] Mean test loss of 398 batches: 0.6708043590936829.
[ Fri Oct 18 09:27:11 2019 ] Top1: 83.20%
[ Fri Oct 18 09:27:11 2019 ] Top5: 96.07%
[ Fri Oct 18 09:27:11 2019 ] Forward Batch Size: 64
[ Fri Oct 18 09:27:11 2019 ] Best accuracy: 0.833402069954241
[ Fri Oct 18 09:27:11 2019 ] Epoch Number: 56
[ Fri Oct 18 09:27:11 2019 ] Model Name: ./runs/90-85b-ntu120-xsub
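Regarding the --amp-opt-level flag mentioned above: O1 is NVIDIA apex's mixed-precision mode that keeps FP32 master weights while casting whitelisted ops to FP16. A minimal sketch of how amp is typically wired up (with a stand-in nn.Linear in place of the actual MS-G3D model; this is not the exact code in main.py) is:

```python
import torch
from torch import nn
from apex import amp  # NVIDIA apex, installed separately from PyTorch

# Stand-in model for illustration; main.py builds model.msg3d.Model instead.
model = nn.Linear(128, 120).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, nesterov=True, weight_decay=0.0005)
# opt_level='O1' patches whitelisted ops to FP16, keeps FP32 master weights.
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

x = torch.randn(64, 128).cuda()
loss = model(x).mean()
with amp.scale_loss(loss, optimizer) as scaled_loss:  # dynamic loss scaling
    scaled_loss.backward()
optimizer.step()
```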
Hi, thanks for your excellent work, but I can't reproduce the result on NTU120 X-Sub. The best accuracy I got is 82.22%. The command I ran is
My apex install is Python-only. Do you have any idea about what happened? Looking forward to your reply.
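If it helps narrow things down, one hedged way to confirm whether an apex install is Python-only (versus built with the compiled extensions) is to probe for one of the extension modules, e.g. amp_C:

```python
# Hedged probe: amp O1 works with a Python-only apex build, but the
# compiled modules (built with --cpp_ext/--cuda_ext) will be absent.
try:
    from apex import amp  # noqa: F401 -- the amp frontend is pure Python
    print('apex.amp import: OK')
except ImportError:
    print('apex is not installed')

try:
    import amp_C  # noqa: F401 -- present only with the CUDA-extension build
    print('compiled apex extensions found (not a Python-only build)')
except ImportError:
    print('Python-only apex build: fused kernels unavailable')
```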