zhoubolei / TRN-pytorch

Temporal Relation Networks
http://relation.csail.mit.edu/

Training doesn't seem to work #46

Open highway007 opened 5 years ago

highway007 commented 5 years ago

I trained on my own dataset for over 12 hours on a 1080 Ti, but the result is still Prec@1 13.333, Loss 1.98. My dataset has 7 classes and 100 video clips per class. Could anyone give me some suggestions?

highway007 commented 5 years ago

And here is the result of running:

    CUDA_VISIBLE_DEVICES=0 python main.py something RGB --arch BNInception --num_segments 8 --consensus_type TRNmultiscale --batch-size 8

categories, args.train_list, args.val_list, args.root_path ['clap', 'jogging', 'pjump', 'running', 'walking', 'wave1', 'wave2'] video_datasets/action5/train_videofolder.txt video_datasets/action5/val_videofolder.txt actions
storing name: TRN_something_RGB_BNInception_TRNmultiscale_segment8

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       8
        new_length:         1
        consensus_module:   TRNmultiscale
        dropout_ratio:      0.8
        img_feature_dim:    256

/home/unimation/Downloads/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/home/unimation/Downloads/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
Multi-Scale Temporal Relation Network Module in use
['8-frame relation', '7-frame relation', '6-frame relation', '5-frame relation', '4-frame relation', '3-frame relation', '2-frame relation']
video number:509
TSNDataSet!!!!!!!!! <class 'dataset.TSNDataSet'>
/home/unimation/.local/lib/python3.6/site-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:225
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 83 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 83 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
main.py:176: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:177: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:178: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
main.py:187: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  total_norm = clip_grad_norm(model.parameters(), args.clip_gradient)
Epoch: [0][0/64], lr: 0.10000 Time 5.951 (5.951) Data 3.288 (3.288) Loss 1.8293 (1.8293) Prec@1 37.500 (37.500) Prec@5 100.000 (100.000)
Epoch: [0][20/64], lr: 0.10000 Time 0.179 (0.453) Data 0.000 (0.157) Loss 9.1152 (6.3340) Prec@1 25.000 (22.024) Prec@5 75.000 (80.357)
Epoch: [0][40/64], lr: 0.10000 Time 0.179 (0.320) Data 0.000 (0.080) Loss 12.0552 (9.9163) Prec@1 50.000 (19.512) Prec@5 62.500 (73.171)
Epoch: [0][60/64], lr: 0.10000 Time 0.179 (0.274) Data 0.000 (0.054) Loss 11.5334 (11.0733) Prec@1 25.000 (16.803) Prec@5 87.500 (71.107)
Freezing BatchNorm2D except the first one.
Epoch: [1][0/64], lr: 0.10000 Time 3.882 (3.882) Data 3.694 (3.694) Loss 11.1950 (11.1950) Prec@1 25.000 (25.000) Prec@5 87.500 (87.500)
Epoch: [1][20/64], lr: 0.10000 Time 0.179 (0.356) Data 0.000 (0.176) Loss 12.6795 (14.5755) Prec@1 0.000 (17.857) Prec@5 50.000 (61.905)
Epoch: [1][40/64], lr: 0.10000 Time 0.179 (0.270) Data 0.000 (0.090) Loss 7.2219 (11.9801) Prec@1 50.000 (17.683) Prec@5 87.500 (66.463)
Epoch: [1][60/64], lr: 0.10000 Time 0.187 (0.241) Data 0.000 (0.061) Loss 22.3213 (12.3004) Prec@1 0.000 (18.238) Prec@5 50.000 (68.852)
Freezing BatchNorm2D except the first one.
Epoch: [2][0/64], lr: 0.10000 Time 3.947 (3.947) Data 3.747 (3.747) Loss 20.9917 (20.9917) Prec@1 12.500 (12.500) Prec@5 75.000 (75.000)
Epoch: [2][20/64], lr: 0.10000 Time 0.179 (0.360) Data 0.000 (0.178) Loss 9.8333 (18.3747) Prec@1 12.500 (14.881) Prec@5 75.000 (77.976)
Epoch: [2][40/64], lr: 0.10000 Time 0.179 (0.272) Data 0.000 (0.091) Loss 5.6862 (14.3118) Prec@1 0.000 (14.024) Prec@5 62.500 (73.476)
Epoch: [2][60/64], lr: 0.10000 Time 0.179 (0.241) Data 0.000 (0.061) Loss 24.2211 (13.2098) Prec@1 0.000 (14.139) Prec@5 62.500 (72.746)
Freezing BatchNorm2D except the first one.
Epoch: [3][0/64], lr: 0.10000 Time 3.182 (3.182) Data 2.873 (2.873) Loss 4.0872 (4.0872) Prec@1 25.000 (25.000) Prec@5 87.500 (87.500)
Epoch: [3][20/64], lr: 0.10000 Time 0.179 (0.339) Data 0.000 (0.146) Loss 5.8109 (10.8753) Prec@1 0.000 (10.119) Prec@5 37.500 (70.833)
Epoch: [3][40/64], lr: 0.10000 Time 0.179 (0.261) Data 0.000 (0.075) Loss 8.8069 (13.0286) Prec@1 12.500 (9.451) Prec@5 75.000 (69.817)
Epoch: [3][60/64], lr: 0.10000 Time 0.179 (0.234) Data 0.000 (0.050) Loss 12.9770 (11.8553) Prec@1 0.000 (11.475) Prec@5 62.500 (70.902)
Freezing BatchNorm2D except the first one.
Epoch: [4][0/64], lr: 0.10000 Time 3.673 (3.673) Data 3.471 (3.471) Loss 14.4737 (14.4737) Prec@1 0.000 (0.000) Prec@5 75.000 (75.000)
Epoch: [4][20/64], lr: 0.10000 Time 0.179 (0.347) Data 0.000 (0.165) Loss 22.3581 (10.4863) Prec@1 0.000 (13.690) Prec@5 62.500 (72.024)
Epoch: [4][40/64], lr: 0.10000 Time 0.179 (0.265) Data 0.000 (0.085) Loss 15.2673 (10.9867) Prec@1 25.000 (14.024) Prec@5 75.000 (73.171)
Epoch: [4][60/64], lr: 0.10000 Time 0.179 (0.237) Data 0.000 (0.057) Loss 3.1066 (12.0587) Prec@1 50.000 (13.525) Prec@5 87.500 (71.926)
Freezing BatchNorm2D except the first one.
main.py:224: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  input_var = torch.autograd.Variable(input, volatile=True)
main.py:225: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  target_var = torch.autograd.Variable(target, volatile=True)
main.py:234: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:235: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:236: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
Test: [0/29] Time 2.315 (2.315) Loss 2.8118 (2.8118) Prec@1 0.000 (0.000) Prec@5 100.000 (100.000)
Test: [20/29] Time 0.055 (0.173) Loss 4.3632 (11.3409) Prec@1 0.000 (25.595) Prec@5 100.000 (60.714)
Testing Results: Prec@1 19.111 Prec@5 70.667 Loss 9.98876

Best Prec@1: 0.000
Freezing BatchNorm2D except the first one.
Epoch: [5][0/64], lr: 0.10000 Time 3.592 (3.592) Data 3.380 (3.380) Loss 10.8174 (10.8174) Prec@1 0.000 (0.000) Prec@5 75.000 (75.000)
Epoch: [5][20/64], lr: 0.10000 Time 0.179 (0.344) Data 0.000 (0.161) Loss 14.8842 (10.3203) Prec@1 12.500 (14.286) Prec@5 50.000 (69.048)
Epoch: [5][40/64], lr: 0.10000 Time 0.179 (0.263) Data 0.000 (0.082) Loss 19.0013 (10.8219) Prec@1 0.000 (12.805) Prec@5 50.000 (67.378)
Epoch: [5][60/64], lr: 0.10000 Time 0.179 (0.236) Data 0.000 (0.055) Loss 11.4083 (12.3493) Prec@1 12.500 (12.295) Prec@5 87.500 (69.467)
Freezing BatchNorm2D except the first one.
Epoch: [6][0/64], lr: 0.10000 Time 2.335 (2.335) Data 2.023 (2.023) Loss 25.1099 (25.1099) Prec@1 12.500 (12.500) Prec@5 50.000 (50.000)
Epoch: [6][20/64], lr: 0.10000 Time 0.179 (0.342) Data 0.000 (0.155) Loss 22.4020 (17.8056) Prec@1 0.000 (13.095) Prec@5 62.500 (66.071)
Epoch: [6][40/64], lr: 0.10000 Time 0.179 (0.263) Data 0.000 (0.079) Loss 14.5251 (17.3345) Prec@1 0.000 (13.720) Prec@5 87.500 (68.293)
Epoch: [6][60/64], lr: 0.10000 Time 0.179 (0.236) Data 0.000 (0.053) Loss 36.3613 (17.7415) Prec@1 12.500 (14.754) Prec@5 62.500 (69.262)
Freezing BatchNorm2D except the first one.
Epoch: [7][0/64], lr: 0.10000 Time 3.706 (3.706) Data 3.509 (3.509) Loss 43.4189 (43.4189) Prec@1 0.000 (0.000) Prec@5 37.500 (37.500)
Epoch: [7][20/64], lr: 0.10000 Time 0.179 (0.349) Data 0.000 (0.167) Loss 9.8558 (24.9688) Prec@1 12.500 (10.119) Prec@5 87.500 (65.476)
Epoch: [7][40/64], lr: 0.10000 Time 0.179 (0.266) Data 0.000 (0.086) Loss 4.1695 (18.8195) Prec@1 12.500 (10.671) Prec@5 100.000 (67.378)
Epoch: [7][60/64], lr: 0.10000 Time 0.179 (0.238) Data 0.000 (0.058) Loss 8.4691 (17.4597) Prec@1 37.500 (12.500) Prec@5 75.000 (68.033)
Freezing BatchNorm2D except the first one.
Epoch: [8][0/64], lr: 0.10000 Time 3.596 (3.596) Data 3.366 (3.366) Loss 6.1200 (6.1200) Prec@1 25.000 (25.000) Prec@5 87.500 (87.500)
Epoch: [8][20/64], lr: 0.10000 Time 0.182 (0.345) Data 0.000 (0.160) Loss 20.5494 (10.1454) Prec@1 12.500 (16.071) Prec@5 50.000 (74.405)
Epoch: [8][40/64], lr: 0.10000 Time 0.180 (0.264) Data 0.000 (0.082) Loss 11.4593 (11.5590) Prec@1 12.500 (17.073) Prec@5 75.000 (74.695)
Epoch: [8][60/64], lr: 0.10000 Time 0.180 (0.237) Data 0.000 (0.055) Loss 22.3038 (11.3144) Prec@1 12.500 (17.418) Prec@5 50.000 (75.615)
Freezing BatchNorm2D except the first one.
Epoch: [9][0/64], lr: 0.10000 Time 3.711 (3.711) Data 3.517 (3.517) Loss 15.7564 (15.7564) Prec@1 0.000 (0.000) Prec@5 62.500 (62.500)
Epoch: [9][20/64], lr: 0.10000 Time 0.179 (0.349) Data 0.000 (0.168) Loss 17.1360 (12.7305) Prec@1 0.000 (15.476) Prec@5 75.000 (70.833)
Epoch: [9][40/64], lr: 0.10000 Time 0.179 (0.266) Data 0.000 (0.086) Loss 12.4337 (13.5411) Prec@1 12.500 (14.024) Prec@5 62.500 (75.610)
Epoch: [9][60/64], lr: 0.10000 Time 0.179 (0.238) Data 0.000 (0.058) Loss 17.6340 (13.2725) Prec@1 25.000 (13.525) Prec@5 100.000 (72.336)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.458 (2.458) Loss 6.9910 (6.9910) Prec@1 0.000 (0.000) Prec@5 100.000 (100.000)
Test: [20/29] Time 0.055 (0.173) Loss 27.9825 (15.3783) Prec@1 0.000 (19.643) Prec@5 0.000 (80.357)
Testing Results: Prec@1 14.667 Prec@5 75.556 Loss 16.38245

Best Prec@1: 19.111
Freezing BatchNorm2D except the first one.
Epoch: [10][0/64], lr: 0.10000 Time 2.713 (2.713) Data 2.402 (2.402) Loss 20.8716 (20.8716) Prec@1 0.000 (0.000) Prec@5 62.500 (62.500)
Epoch: [10][20/64], lr: 0.10000 Time 0.179 (0.347) Data 0.000 (0.160) Loss 19.2367 (15.6610) Prec@1 12.500 (13.690) Prec@5 50.000 (70.833)
Epoch: [10][40/64], lr: 0.10000 Time 0.179 (0.265) Data 0.000 (0.082) Loss 8.9847 (18.1285) Prec@1 25.000 (13.110) Prec@5 100.000 (70.732)
Epoch: [10][60/64], lr: 0.10000 Time 0.179 (0.237) Data 0.000 (0.055) Loss 3.7958 (17.1844) Prec@1 0.000 (13.115) Prec@5 100.000 (72.951)
Freezing BatchNorm2D except the first one.
Epoch: [11][0/64], lr: 0.10000 Time 2.381 (2.381) Data 2.069 (2.069) Loss 10.4690 (10.4690) Prec@1 12.500 (12.500) Prec@5 62.500 (62.500)
Epoch: [11][20/64], lr: 0.10000 Time 0.179 (0.347) Data 0.000 (0.159) Loss 18.7608 (13.7999) Prec@1 0.000 (17.857) Prec@5 50.000 (73.810)
Epoch: [11][40/64], lr: 0.10000 Time 0.179 (0.265) Data 0.000 (0.082) Loss 16.0519 (14.2114) Prec@1 0.000 (17.378) Prec@5 50.000 (73.476)
Epoch: [11][60/64], lr: 0.10000 Time 0.179 (0.237) Data 0.000 (0.055) Loss 9.2359 (15.2699) Prec@1 12.500 (17.213) Prec@5 62.500 (71.516)
Freezing BatchNorm2D except the first one.
Epoch: [12][0/64], lr: 0.10000 Time 3.778 (3.778) Data 3.569 (3.569) Loss 20.7168 (20.7168) Prec@1 25.000 (25.000) Prec@5 50.000 (50.000)
Epoch: [12][20/64], lr: 0.10000 Time 0.179 (0.352) Data 0.000 (0.170) Loss 4.6315 (11.5795) Prec@1 0.000 (8.929) Prec@5 100.000 (71.429)
Epoch: [12][40/64], lr: 0.10000 Time 0.179 (0.268) Data 0.000 (0.087) Loss 14.7752 (13.5830) Prec@1 12.500 (12.500) Prec@5 50.000 (69.817)
Epoch: [12][60/64], lr: 0.10000 Time 0.179 (0.239) Data 0.000 (0.059) Loss 25.2227 (14.6595) Prec@1 12.500 (14.344) Prec@5 50.000 (69.672)
Freezing BatchNorm2D except the first one.
Epoch: [13][0/64], lr: 0.10000 Time 3.928 (3.928) Data 3.738 (3.738) Loss 19.0730 (19.0730) Prec@1 0.000 (0.000) Prec@5 50.000 (50.000)
Epoch: [13][20/64], lr: 0.10000 Time 0.179 (0.359) Data 0.000 (0.178) Loss 11.6131 (13.2334) Prec@1 0.000 (12.500) Prec@5 75.000 (66.667)
Epoch: [13][40/64], lr: 0.10000 Time 0.179 (0.271) Data 0.000 (0.091) Loss 9.5359 (13.7782) Prec@1 0.000 (10.976) Prec@5 87.500 (68.293)
Epoch: [13][60/64], lr: 0.10000 Time 0.179 (0.241) Data 0.000 (0.061) Loss 13.7183 (16.3174) Prec@1 12.500 (10.451) Prec@5 62.500 (68.852)
Freezing BatchNorm2D except the first one.
Epoch: [14][0/64], lr: 0.10000 Time 3.660 (3.660) Data 3.466 (3.466) Loss 35.3793 (35.3793) Prec@1 12.500 (12.500) Prec@5 62.500 (62.500)
Epoch: [14][20/64], lr: 0.10000 Time 0.179 (0.346) Data 0.000 (0.165) Loss 17.0676 (22.2747) Prec@1 12.500 (13.095) Prec@5 87.500 (64.881)
Epoch: [14][40/64], lr: 0.10000 Time 0.179 (0.265) Data 0.000 (0.085) Loss 14.4238 (17.8209) Prec@1 12.500 (12.805) Prec@5 75.000 (70.122)
Epoch: [14][60/64], lr: 0.10000 Time 0.179 (0.237) Data 0.000 (0.057) Loss 24.2772 (16.4661) Prec@1 25.000 (15.164) Prec@5 37.500 (70.082)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.304 (2.304) Loss 5.8167 (5.8167) Prec@1 0.000 (0.000) Prec@5 100.000 (100.000)
Test: [20/29] Time 0.055 (0.164) Loss 0.4450 (13.3817) Prec@1 100.000 (4.762) Prec@5 100.000 (54.762)
Testing Results: Prec@1 13.333 Prec@5 66.222 Loss 11.53807

Best Prec@1: 19.111
... ... ...
Freezing BatchNorm2D except the first one.
Epoch: [615][0/64], lr: 0.00100 Time 3.582 (3.582) Data 3.370 (3.370) Loss 2.0563 (2.0563) Prec@1 0.000 (0.000) Prec@5 75.000 (75.000)
Epoch: [615][20/64], lr: 0.00100 Time 0.179 (0.343) Data 0.000 (0.161) Loss 2.0766 (1.9878) Prec@1 12.500 (13.095) Prec@5 50.000 (69.048)
Epoch: [615][40/64], lr: 0.00100 Time 0.179 (0.263) Data 0.000 (0.082) Loss 2.0474 (1.9905) Prec@1 0.000 (13.415) Prec@5 62.500 (67.988)
Epoch: [615][60/64], lr: 0.00100 Time 0.179 (0.236) Data 0.000 (0.055) Loss 1.9727 (1.9783) Prec@1 12.500 (14.549) Prec@5 87.500 (70.287)
Freezing BatchNorm2D except the first one.
Epoch: [616][0/64], lr: 0.00100 Time 3.793 (3.793) Data 3.603 (3.603) Loss 2.0553 (2.0553) Prec@1 12.500 (12.500) Prec@5 75.000 (75.000)
Epoch: [616][20/64], lr: 0.00100 Time 0.180 (0.354) Data 0.000 (0.173) Loss 1.8473 (1.9688) Prec@1 25.000 (14.881) Prec@5 100.000 (75.595)
Epoch: [616][40/64], lr: 0.00100 Time 0.180 (0.269) Data 0.000 (0.089) Loss 1.9023 (1.9710) Prec@1 25.000 (14.329) Prec@5 75.000 (70.732)
Epoch: [616][60/64], lr: 0.00100 Time 0.179 (0.240) Data 0.000 (0.060) Loss 1.9997 (1.9809) Prec@1 0.000 (13.934) Prec@5 62.500 (69.467)
Freezing BatchNorm2D except the first one.
Epoch: [617][0/64], lr: 0.00100 Time 3.846 (3.846) Data 3.656 (3.656) Loss 1.8284 (1.8284) Prec@1 37.500 (37.500) Prec@5 87.500 (87.500)
Epoch: [617][20/64], lr: 0.00100 Time 0.180 (0.355) Data 0.000 (0.174) Loss 1.9189 (1.9600) Prec@1 0.000 (13.690) Prec@5 87.500 (76.190)
Epoch: [617][40/64], lr: 0.00100 Time 0.179 (0.269) Data 0.000 (0.089) Loss 2.0099 (1.9707) Prec@1 0.000 (13.720) Prec@5 62.500 (70.732)
Epoch: [617][60/64], lr: 0.00100 Time 0.179 (0.240) Data 0.000 (0.060) Loss 2.0066 (1.9719) Prec@1 25.000 (14.139) Prec@5 62.500 (71.926)
Freezing BatchNorm2D except the first one.
Epoch: [618][0/64], lr: 0.00100 Time 3.691 (3.691) Data 3.481 (3.481) Loss 1.8642 (1.8642) Prec@1 25.000 (25.000) Prec@5 87.500 (87.500)
Epoch: [618][20/64], lr: 0.00100 Time 0.179 (0.348) Data 0.000 (0.166) Loss 2.0600 (1.9580) Prec@1 12.500 (13.690) Prec@5 62.500 (75.595)
Epoch: [618][40/64], lr: 0.00100 Time 0.179 (0.266) Data 0.000 (0.085) Loss 1.9871 (1.9674) Prec@1 12.500 (14.939) Prec@5 75.000 (70.427)
Epoch: [618][60/64], lr: 0.00100 Time 0.179 (0.238) Data 0.000 (0.057) Loss 1.9450 (1.9682) Prec@1 0.000 (15.369) Prec@5 87.500 (70.492)
Freezing BatchNorm2D except the first one.
Epoch: [619][0/64], lr: 0.00100 Time 3.684 (3.684) Data 3.489 (3.489) Loss 1.8888 (1.8888) Prec@1 25.000 (25.000) Prec@5 87.500 (87.500)
Epoch: [619][20/64], lr: 0.00100 Time 0.180 (0.348) Data 0.000 (0.167) Loss 1.8461 (1.9608) Prec@1 25.000 (17.262) Prec@5 87.500 (68.452)
Epoch: [619][40/64], lr: 0.00100 Time 0.179 (0.266) Data 0.000 (0.085) Loss 2.1105 (1.9742) Prec@1 0.000 (13.415) Prec@5 50.000 (68.598)
Epoch: [619][60/64], lr: 0.00100 Time 0.179 (0.238) Data 0.000 (0.057) Loss 2.0625 (1.9756) Prec@1 0.000 (12.090) Prec@5 50.000 (69.262)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.232 (2.232) Loss 2.1742 (2.1742) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/29] Time 0.055 (0.164) Loss 1.8359 (1.9653) Prec@1 0.000 (14.881) Prec@5 100.000 (84.524)
Testing Results: Prec@1 11.111 Prec@5 72.889 Loss 1.97651

Best Prec@1: 19.111
Freezing BatchNorm2D except the first one.
Epoch: [620][0/64], lr: 0.00100 Time 3.143 (3.143) Data 2.829 (2.829) Loss 1.7963 (1.7963) Prec@1 50.000 (50.000) Prec@5 100.000 (100.000)
Epoch: [620][20/64], lr: 0.00100 Time 0.179 (0.332) Data 0.000 (0.145) Loss 1.9453 (1.9964) Prec@1 25.000 (13.095) Prec@5 62.500 (72.024)
Epoch: [620][40/64], lr: 0.00100 Time 0.179 (0.258) Data 0.000 (0.074) Loss 2.0936 (1.9821) Prec@1 0.000 (14.329) Prec@5 50.000 (72.256)
Epoch: [620][60/64], lr: 0.00100 Time 0.180 (0.232) Data 0.000 (0.050) Loss 2.0416 (1.9830) Prec@1 12.500 (12.705) Prec@5 50.000 (70.902)
Freezing BatchNorm2D except the first one.
Epoch: [621][0/64], lr: 0.00100 Time 3.742 (3.742) Data 3.532 (3.532) Loss 1.9929 (1.9929) Prec@1 0.000 (0.000) Prec@5 62.500 (62.500)
Epoch: [621][20/64], lr: 0.00100 Time 0.180 (0.351) Data 0.000 (0.168) Loss 1.9375 (1.9579) Prec@1 12.500 (11.310) Prec@5 87.500 (73.214)
Epoch: [621][40/64], lr: 0.00100 Time 0.179 (0.267) Data 0.000 (0.086) Loss 1.9081 (1.9765) Prec@1 25.000 (10.061) Prec@5 87.500 (69.817)
Epoch: [621][60/64], lr: 0.00100 Time 0.179 (0.238) Data 0.000 (0.058) Loss 2.0967 (1.9735) Prec@1 0.000 (12.500) Prec@5 37.500 (70.082)
Freezing BatchNorm2D except the first one.
Epoch: [622][0/64], lr: 0.00100 Time 3.715 (3.715) Data 3.525 (3.525) Loss 1.9688 (1.9688) Prec@1 12.500 (12.500) Prec@5 62.500 (62.500)
Epoch: [622][20/64], lr: 0.00100 Time 0.180 (0.349) Data 0.000 (0.168) Loss 1.8196 (1.9501) Prec@1 25.000 (12.500) Prec@5 100.000 (76.190)
Epoch: [622][40/64], lr: 0.00100 Time 0.180 (0.267) Data 0.000 (0.086) Loss 1.9542 (1.9704) Prec@1 0.000 (12.195) Prec@5 87.500 (71.646)
Epoch: [622][60/64], lr: 0.00100 Time 0.179 (0.238) Data 0.000 (0.058) Loss 2.0329 (1.9721) Prec@1 12.500 (11.066) Prec@5 37.500 (71.516)
Freezing BatchNorm2D except the first one.
Epoch: [623][0/64], lr: 0.00100 Time 3.255 (3.255) Data 2.979 (2.979) Loss 1.9927 (1.9927) Prec@1 25.000 (25.000) Prec@5 75.000 (75.000)
Epoch: [623][20/64], lr: 0.00100 Time 0.180 (0.341) Data 0.000 (0.155) Loss 1.9465 (1.9831) Prec@1 12.500 (12.500) Prec@5 100.000 (70.833)
Epoch: [623][40/64], lr: 0.00100 Time 0.179 (0.262) Data 0.000 (0.079) Loss 1.8467 (1.9738) Prec@1 12.500 (13.415) Prec@5 100.000 (69.817)
Epoch: [623][60/64], lr: 0.00100 Time 0.179 (0.235) Data 0.000 (0.053) Loss 2.0186 (1.9865) Prec@1 0.000 (12.295) Prec@5 75.000 (70.082)
Freezing BatchNorm2D except the first one.
Epoch: [624][0/64], lr: 0.00100 Time 2.779 (2.779) Data 2.496 (2.496) Loss 2.0292 (2.0292) Prec@1 0.000 (0.000) Prec@5 75.000 (75.000)
Epoch: [624][20/64], lr: 0.00100 Time 0.179 (0.341) Data 0.000 (0.155) Loss 1.9567 (1.9580) Prec@1 12.500 (14.881) Prec@5 87.500 (70.833)
Epoch: [624][40/64], lr: 0.00100 Time 0.179 (0.262) Data 0.000 (0.079) Loss 1.9357 (1.9713) Prec@1 0.000 (13.415) Prec@5 75.000 (70.122)
Epoch: [624][60/64], lr: 0.00100 Time 0.179 (0.235) Data 0.000 (0.053) Loss 2.1558 (1.9721) Prec@1 0.000 (14.549) Prec@5 50.000 (70.902)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.553 (2.553) Loss 1.7445 (1.7445) Prec@1 0.000 (0.000) Prec@5 100.000 (100.000)
Test: [20/29] Time 0.055 (0.174) Loss 2.0708 (1.9758) Prec@1 0.000 (19.643) Prec@5 0.000 (69.643)
Testing Results: Prec@1 14.667 Prec@5 67.556 Loss 1.97754
... ... ...
Best Prec@1: 19.111
Freezing BatchNorm2D except the first one.
Epoch: [990][0/64], lr: 0.00100 Time 3.678 (3.678) Data 3.469 (3.469) Loss 1.9765 (1.9765) Prec@1 0.000 (0.000) Prec@5 87.500 (87.500)
Epoch: [990][20/64], lr: 0.00100 Time 0.179 (0.347) Data 0.000 (0.165) Loss 1.9878 (1.9750) Prec@1 12.500 (14.881) Prec@5 62.500 (70.238)
Epoch: [990][40/64], lr: 0.00100 Time 0.179 (0.265) Data 0.000 (0.085) Loss 1.8221 (1.9809) Prec@1 37.500 (14.024) Prec@5 87.500 (71.646)
Epoch: [990][60/64], lr: 0.00100 Time 0.179 (0.237) Data 0.000 (0.057) Loss 2.0065 (1.9776) Prec@1 12.500 (14.344) Prec@5 62.500 (71.516)
Freezing BatchNorm2D except the first one.
Epoch: [991][0/64], lr: 0.00100 Time 3.315 (3.315) Data 3.056 (3.056) Loss 1.9646 (1.9646) Prec@1 25.000 (25.000) Prec@5 62.500 (62.500)
Epoch: [991][20/64], lr: 0.00100 Time 0.179 (0.342) Data 0.000 (0.157) Loss 1.9891 (1.9770) Prec@1 25.000 (16.071) Prec@5 62.500 (72.024)
Epoch: [991][40/64], lr: 0.00100 Time 0.179 (0.263) Data 0.000 (0.080) Loss 1.9079 (1.9760) Prec@1 0.000 (15.244) Prec@5 75.000 (68.598)
Epoch: [991][60/64], lr: 0.00100 Time 0.179 (0.235) Data 0.000 (0.054) Loss 2.0343 (1.9725) Prec@1 0.000 (14.549) Prec@5 75.000 (68.648)
Freezing BatchNorm2D except the first one.
Epoch: [992][0/64], lr: 0.00100 Time 3.252 (3.252) Data 2.990 (2.990) Loss 1.9289 (1.9289) Prec@1 25.000 (25.000) Prec@5 75.000 (75.000)
Epoch: [992][20/64], lr: 0.00100 Time 0.179 (0.334) Data 0.000 (0.143) Loss 2.0449 (1.9629) Prec@1 0.000 (14.881) Prec@5 50.000 (75.595)
Epoch: [992][40/64], lr: 0.00100 Time 0.179 (0.259) Data 0.000 (0.073) Loss 2.0883 (1.9751) Prec@1 12.500 (14.024) Prec@5 75.000 (73.171)
Epoch: [992][60/64], lr: 0.00100 Time 0.179 (0.233) Data 0.000 (0.049) Loss 2.0192 (1.9896) Prec@1 0.000 (13.730) Prec@5 75.000 (70.492)
Freezing BatchNorm2D except the first one.
Epoch: [993][0/64], lr: 0.00100 Time 3.399 (3.399) Data 3.113 (3.113) Loss 1.8708 (1.8708) Prec@1 25.000 (25.000) Prec@5 100.000 (100.000)
Epoch: [993][20/64], lr: 0.00100 Time 0.179 (0.336) Data 0.000 (0.148) Loss 2.1559 (1.9719) Prec@1 12.500 (12.500) Prec@5 37.500 (72.619)
Epoch: [993][40/64], lr: 0.00100 Time 0.179 (0.260) Data 0.000 (0.076) Loss 2.0889 (1.9944) Prec@1 0.000 (13.110) Prec@5 50.000 (68.902)
Epoch: [993][60/64], lr: 0.00100 Time 0.179 (0.233) Data 0.000 (0.051) Loss 1.7947 (1.9888) Prec@1 37.500 (13.934) Prec@5 87.500 (70.287)
Freezing BatchNorm2D except the first one.
Epoch: [994][0/64], lr: 0.00100 Time 3.120 (3.120) Data 2.816 (2.816) Loss 1.9131 (1.9131) Prec@1 12.500 (12.500) Prec@5 100.000 (100.000)
Epoch: [994][20/64], lr: 0.00100 Time 0.179 (0.347) Data 0.000 (0.161) Loss 1.9234 (1.9921) Prec@1 25.000 (8.929) Prec@5 75.000 (66.667)
Epoch: [994][40/64], lr: 0.00100 Time 0.179 (0.265) Data 0.000 (0.082) Loss 1.8941 (1.9947) Prec@1 25.000 (8.841) Prec@5 87.500 (68.293)
Epoch: [994][60/64], lr: 0.00100 Time 0.179 (0.237) Data 0.000 (0.055) Loss 2.0472 (1.9855) Prec@1 0.000 (9.631) Prec@5 62.500 (70.082)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.141 (2.141) Loss 1.9659 (1.9659) Prec@1 0.000 (0.000) Prec@5 100.000 (100.000)
Test: [20/29] Time 0.055 (0.167) Loss 1.8060 (1.9463) Prec@1 0.000 (19.643) Prec@5 100.000 (65.476)
Testing Results: Prec@1 14.667 Prec@5 74.222 Loss 1.95851

Best Prec@1: 19.111
Freezing BatchNorm2D except the first one.
Epoch: [995][0/64], lr: 0.00100 Time 2.309 (2.309) Data 1.988 (1.988) Loss 1.9943 (1.9943) Prec@1 12.500 (12.500) Prec@5 50.000 (50.000)
Epoch: [995][20/64], lr: 0.00100 Time 0.179 (0.326) Data 0.000 (0.139) Loss 1.9527 (1.9750) Prec@1 25.000 (13.095) Prec@5 62.500 (70.238)
Epoch: [995][40/64], lr: 0.00100 Time 0.179 (0.255) Data 0.000 (0.071) Loss 2.0782 (1.9767) Prec@1 0.000 (12.195) Prec@5 50.000 (68.902)
Epoch: [995][60/64], lr: 0.00100 Time 0.179 (0.230) Data 0.000 (0.048) Loss 1.8985 (1.9825) Prec@1 12.500 (13.115) Prec@5 87.500 (69.262)
Freezing BatchNorm2D except the first one.
Epoch: [996][0/64], lr: 0.00100 Time 2.030 (2.030) Data 1.794 (1.794) Loss 1.8999 (1.8999) Prec@1 37.500 (37.500) Prec@5 62.500 (62.500)
Epoch: [996][20/64], lr: 0.00100 Time 0.179 (0.336) Data 0.000 (0.146) Loss 1.9917 (1.9856) Prec@1 0.000 (15.476) Prec@5 62.500 (67.857)
Epoch: [996][40/64], lr: 0.00100 Time 0.179 (0.259) Data 0.000 (0.075) Loss 1.9657 (1.9779) Prec@1 0.000 (11.890) Prec@5 87.500 (69.512)
Epoch: [996][60/64], lr: 0.00100 Time 0.179 (0.233) Data 0.000 (0.050) Loss 1.9769 (1.9819) Prec@1 0.000 (11.885) Prec@5 75.000 (67.828)
Freezing BatchNorm2D except the first one.
Epoch: [997][0/64], lr: 0.00100 Time 3.637 (3.637) Data 3.397 (3.397) Loss 2.0443 (2.0443) Prec@1 0.000 (0.000) Prec@5 50.000 (50.000)
Epoch: [997][20/64], lr: 0.00100 Time 0.180 (0.347) Data 0.000 (0.163) Loss 1.9619 (1.9669) Prec@1 12.500 (10.714) Prec@5 75.000 (69.643)
Epoch: [997][40/64], lr: 0.00100 Time 0.180 (0.266) Data 0.000 (0.084) Loss 2.0234 (1.9684) Prec@1 12.500 (11.280) Prec@5 75.000 (71.037)
Epoch: [997][60/64], lr: 0.00100 Time 0.180 (0.237) Data 0.000 (0.056) Loss 1.9438 (1.9724) Prec@1 25.000 (11.885) Prec@5 62.500 (69.877)
Freezing BatchNorm2D except the first one.
Epoch: [998][0/64], lr: 0.00100 Time 3.845 (3.845) Data 3.656 (3.656) Loss 2.0007 (2.0007) Prec@1 0.000 (0.000) Prec@5 87.500 (87.500)
Epoch: [998][20/64], lr: 0.00100 Time 0.179 (0.355) Data 0.000 (0.174) Loss 1.9784 (1.9884) Prec@1 12.500 (7.738) Prec@5 75.000 (67.262)
Epoch: [998][40/64], lr: 0.00100 Time 0.179 (0.269) Data 0.000 (0.089) Loss 1.9265 (1.9720) Prec@1 25.000 (11.280) Prec@5 75.000 (70.732)
Epoch: [998][60/64], lr: 0.00100 Time 0.179 (0.240) Data 0.000 (0.060) Loss 2.0152 (1.9743) Prec@1 12.500 (11.066) Prec@5 50.000 (71.107)
Freezing BatchNorm2D except the first one.
Epoch: [999][0/64], lr: 0.00100 Time 3.663 (3.663) Data 3.478 (3.478) Loss 1.9811 (1.9811) Prec@1 12.500 (12.500) Prec@5 87.500 (87.500)
Epoch: [999][20/64], lr: 0.00100 Time 0.179 (0.346) Data 0.000 (0.166) Loss 1.9657 (1.9835) Prec@1 0.000 (13.095) Prec@5 75.000 (72.619)
Epoch: [999][40/64], lr: 0.00100 Time 0.179 (0.265) Data 0.000 (0.085) Loss 1.9165 (1.9786) Prec@1 37.500 (14.634) Prec@5 75.000 (69.817)
Epoch: [999][60/64], lr: 0.00100 Time 0.179 (0.237) Data 0.000 (0.057) Loss 1.9463 (1.9687) Prec@1 12.500 (14.344) Prec@5 87.500 (70.902)
Freezing BatchNorm2D except the first one.
Test: [0/29] Time 2.532 (2.532) Loss 2.4065 (2.4065) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Test: [20/29] Time 0.055 (0.173) Loss 2.1484 (2.0385) Prec@1 0.000 (0.000) Prec@5 100.000 (69.643)
Testing Results: Prec@1 15.556 Prec@5 77.333 Loss 1.97114

Best Prec@1: 19.111
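
For context, these plateau numbers are what pure chance gives on a balanced 7-class problem, which suggests the network never learned anything beyond random guessing. A quick sanity check in plain Python, with the numbers taken from the logs above:

    import math

    num_classes = 7
    print(100.0 / num_classes)    # ~14.29, matching the ~11-15% Prec@1 above
    print(math.log(num_classes))  # ~1.9459, matching the ~1.97-1.99 loss above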


Shanmugavadivelugopal commented 5 years ago

@highway007 Hi, I'm also trying to train my own dataset. When I ran the training command

python main.py something RGB \
                     --arch BNInception --num_segments 3 \
                     --consensus_type TRN --batch-size 64

I got this error:

storing name: TRN_something_RGB_BNInception_TRN_segment3

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       3
        new_length:         1
        consensus_module:   TRN
        dropout_ratio:      0.8
        img_feature_dim:    256

/content/drive/My Drive/pretrained models/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/pretrained models/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
video number:0
Traceback (most recent call last):
  File "main.py", line 319, in <module>
    main()
  File "main.py", line 83, in main
    pin_memory=True
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 802, in __init__
    sampler = RandomSampler(dataset)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/sampler.py", line 64, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integeral value, but got num_samples=0

Can you help me solve this issue? Thanks in advance.

highway007 commented 5 years ago

@Shanmugavadivelugopal It seems PyTorch didn't find your dataset. To use your own dataset, you should arrange it in the same format as 20bn-something-something-v1 and change some filenames in the code. If you can run the original code successfully with 20bn-something-something-v1, then adapting the code should work. (By the way, fine-tuning is a better way to train on your own dataset; I'm looking into this as well.)
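
For reference, the list files this repo's loader reads are plain text, one clip per line, in the form <frame_folder> <num_frames> <label_index>, and the category file is one label per line (this matches the train_videofolder.txt and category.txt excerpts quoted later in this thread). An illustrative sketch, with made-up folder names:

    # video_datasets/something/train_videofolder.txt
    00 132 0
    01 180 0
    11 399 1

    # video_datasets/something/category.txt
    clap
    jogging
    pjump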

Malathi15 commented 5 years ago

@highway007 You trained with your own dataset; I also tried to do that. When I ran this training command:

python main.py something RGB \
                     --arch BNInception --num_segments 3 \
                     --consensus_type TRN --batch-size 2

I got this result (please check the screenshot below): Screenshot (235)

The training runs indefinitely; can you clarify the training process? Thanks.

highway007 commented 5 years ago

@Malathi15 Hi, I think your training didn't work because PyTorch cannot find '00001.jpg'. Please check your dataset folders, e.g. something-something/20bn.../02/00001.jpg. Maybe the name of the '.jpg' file or the '01' folder isn't right.
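
If mismatched zero-padding is the problem, a small script can rename the folders. This is only a sketch; it assumes your list files use two-digit folder names like 02, and root must be adjusted to your own dataset path:

    import os

    root = 'something-something/20bn-something-something-v1'
    for name in os.listdir(root):
        # rename single-digit frame folders like '2' to '02' to match the lists
        if name.isdigit() and len(name) < 2:
            os.rename(os.path.join(root, name), os.path.join(root, name.zfill(2)))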

Malathi15 commented 5 years ago

@highway007 Thank you. My dataset folder looked like something-something/20bn.../2/00001.jpg. Following your suggestion I changed it to something-something/20bn.../02/00001.jpg and it works.

After this I got a RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

storing name: TRN_something_RGB_BNInception_TRN_segment3

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       3
        new_length:         1
        consensus_module:   TRN
        dropout_ratio:      0.8
        img_feature_dim:    256

/content/drive/My Drive/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
video number:4
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:1
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 71 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 71 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
Traceback (most recent call last):
  File "main.py", line 322, in <module>
    main()
  File "main.py", line 128, in main
    train(train_loader, model, criterion, optimizer, epoch, log_training)
  File "main.py", line 171, in train
    loss = criterion(output, target_var)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py", line 904, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1970, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1295, in log_softmax
    ret = input.log_softmax(dim)
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Can you help me solve this error? Thanks.

highway007 commented 5 years ago

@Malathi15 Can I ask what you have done so far, my bro? Just putting your dataset into a folder and running main.py is not enough. You should look at datasets_video.py and process_dataset.py; then you will see what you need to do to train your own dataset, e.g. how to make the train and test lists and where to put them. Have you done that?

Malathi15 commented 5 years ago

@highway007 I'm new to action recognition. I want to recognize only one class, so I used 50 videos and extracted them into 6 folders using extract_frames.py. Each folder contains 80 to 100 frames. Then I created the train (4 folders), test (1 folder), and validation (1 folder) CSV files. My labels.csv has only one label. Then, using process_dataset.py, I generated the txt files. My datasets are in something-something/20bn-something-something-v1/ and my txt files are in video_dataset/something/

Is this correct, or did I do something wrong? Thanks

highway007 commented 5 years ago

@Malathi15 Yes, the number of classes should usually be at least 2 or 3, so I think you can try 3 labels. (What's more, your dataset is so small that overfitting will be a problem.)

Malathi15 commented 5 years ago

@highway007 Thanks for the reply :), I will increase the number of class labels.

highway007 commented 5 years ago

@Malathi15 Waiting for your good news !

Malathi15 commented 5 years ago

Hi @highway007, I have created 3 classes but I'm still getting this error:

storing name: TRN_something_RGB_BNInception_TRN_segment3

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       3
        new_length:         1
        consensus_module:   TRN
        dropout_ratio:      0.8
        img_feature_dim:    256

/content/drive/My Drive/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
video number:4
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:1
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 71 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 71 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
Traceback (most recent call last):
  File "main.py", line 324, in <module>
    main()
  File "main.py", line 128, in main
    train(train_loader, model, criterion, optimizer, epoch, log_training)
  File "main.py", line 175, in train
    prec1, prec5 = accuracy(output.data, target, topk=(1,5))
  File "main.py", line 301, in accuracy
    batch_size = target.size(1)
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Can you help me get past this error? Thanks

highway007 commented 5 years ago

@Malathi15 Can you check your code in main.py here: File "main.py", line 301, in accuracy: batch_size = target.size(1). The original code should be batch_size = target.size(0). Also, this project reports top-5 accuracy, so having only 3 classes could cause an error.

Malathi15 commented 5 years ago

@highway007 Thanks for your patient reply. I changed it to batch_size = target.size(0), then started the training and got this error:

storing name: TRN_something_RGB_BNInception_TRN_segment3

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       3
        new_length:         1
        consensus_module:   TRN
        dropout_ratio:      0.8
        img_feature_dim:    256

/content/drive/My Drive/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
video number:4
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:1
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 71 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 71 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
Traceback (most recent call last):
  File "main.py", line 324, in <module>
    main()
  File "main.py", line 128, in main
    train(train_loader, model, criterion, optimizer, epoch, log_training)
  File "main.py", line 175, in train
    prec1, prec5 = accuracy(output.data, target, topk=(1,5))
  File "main.py", line 304, in accuracy
    _, pred = output.topk(maxk, 1, True, True)
RuntimeError: invalid argument 5: k not in range for dimension at /pytorch/aten/src/THC/generic/THCTensorTopK.cu:21

Do I have to increase the number of class labels to 5? Thanks

highway007 commented 5 years ago

@Malathi15 You can try changing (1,5) to (1,3) at line 175 and line 232, or set the number of labels to more than 5. :-D
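
Instead of editing every call site, another option is to clamp k inside accuracy() itself. This is a sketch based on the standard ImageNet-style helper that TSN/TRN uses; the body in your copy of main.py may differ slightly:

    def accuracy(output, target, topk=(1, 5)):
        """Compute precision@k, clamping k to the number of classes."""
        maxk = min(max(topk), output.size(1))  # e.g. 3 classes -> maxk becomes 3
        batch_size = target.size(0)            # note: size(0), not size(1)

        _, pred = output.topk(maxk, 1, True, True)
        pred = pred.t()
        correct = pred.eq(target.view(1, -1).expand_as(pred))

        res = []
        for k in topk:
            k = min(k, maxk)
            correct_k = correct[:k].reshape(-1).float().sum(0)
            res.append(correct_k.mul_(100.0 / batch_size))
        return res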

Malathi15 commented 5 years ago

Hi @highway007, thanks for your patient reply. It works when I change topk=(1,3). But after 4 epochs I got the same output as before; please check the screenshot. Screenshot (164) Is that because of the class labels? Correct me if I'm wrong! I will increase my dataset, raise the number of labels above 5, and let you know the result. Thanks :)

highway007 commented 5 years ago

@Malathi15 Please check your test files' names; the training runs validate() every 5 epochs, and it seems it failed to find //05/00001.jpg.

Malathi15 commented 5 years ago

Hi @highway007, I have a validation folder 05 in the same place as my training folders, something-something/20bn-something-something-v1/. Do I have to move the validation folder? If so, can you tell me where to place it? Thanks

highway007 commented 5 years ago

@Malathi15 No, it is already in the right place, so this issue is odd. Can you show me train_videofolder.txt and val_videofolder.txt?

Malathi15 commented 5 years ago

@highway007 I have attached the train and validation text files below: val_videofolder.txt train_videofolder.txt. My dataset contains only 6 folders: 4 for training, 1 for testing, and 1 for validation. Thanks

highway007 commented 5 years ago

@Malathi15 Is there a 00001.jpg in 05? I can't figure it out, but I think it is a path error.

Malathi15 commented 5 years ago

@highway007 Yeah, there is a 00001.jpg in 05. Where do I have to set the path of the validation txt file?

highway007 commented 5 years ago

@Malathi15 The path of the validation list is set in datasets_video.py.

Malathi15 commented 5 years ago

@highway007 Is there any way I can solve this problem?

highway007 commented 5 years ago

@Malathi15 Try duplicating the 05 folder, renaming the copies to 06, 07, 08, and rewriting val_videofolder.txt accordingly.

Malathi15 commented 5 years ago

@highway007 I duplicated the 05 folder and ran the training; I think it's working. Can you please check the result? Has the model been trained on the dataset? Screenshot (167)

highway007 commented 5 years ago

@Malathi15 I suggest duplicating the dataset (both train and test) 100 times, and setting the batch_size bigger if you can. You can judge whether it works from the Loss and Prec@1.
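
If copying the frame folders 100 times is too much disk space, repeating the entries in the list files should have a similar effect, since the loader samples segments per list entry. A sketch, using the file name from earlier in this thread:

    # inflate a tiny dataset by listing every clip 100 times
    path = 'video_datasets/something/train_videofolder.txt'
    with open(path) as f:
        lines = f.readlines()
    with open(path, 'w') as f:
        f.writelines(lines * 100)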

Malathi15 commented 5 years ago

Thank you so much @highway007, I will increase my dataset and let you know the result :)

Shanmugavadivelugopal commented 5 years ago

@highway007 Hi, I have trained with my own dataset. Is there any way I can predict on a new video, as in the pretrained setup? Thanks in advance.

highway007 commented 5 years ago

@Shanmugavadivelugopal Put your pretrained model into model/, write a something_categories.txt in pretrain/, then run the command as shown in the README.
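
For reference, the categories file is just one label per line, in the same order as the label indices used for training. Using the class names from the log at the top of this thread as an example:

    # pretrain/something_categories.txt
    clap
    jogging
    pjump
    running
    walking
    wave1
    wave2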

Shanmugavadivelugopal commented 5 years ago

@highway007 I followed your instructions, but I got this error. Can you check it out? Screenshot (2) Thanks in advance...

highway007 commented 5 years ago

@Shanmugavadivelugopal Hi my friend, if you put your model in model/, you have to change your command accordingly. Or you can just put it in pretrain/.

Shanmugavadivelugopal commented 5 years ago

@highway007 Hi, I put the model in pretrain/, but when I tried it I got the same error. Is there anything obvious I'm missing? I'm new to this topic.

highway007 commented 5 years ago

@Shanmugavadivelugopal How did you get the pretrained model? The weights don't seem to fit the TSN model.

Shanmugavadivelugopal commented 5 years ago

@highway007 I trained with my own dataset and the model was saved in models/. Is it a problem to use my own dataset? Thanks

highway007 commented 5 years ago

@Shanmugavadivelugopal Can you show me your training command? Maybe you used a different model for training.

Shanmugavadivelugopal commented 5 years ago

Sure @highway007, I followed the same procedure: I replaced the dataset and ran the following command:

python main.py something RGB \
                     --arch BNInception --num_segments 3 \
                     --consensus_type TRN --batch-size 64

Thanks

highway007 commented 5 years ago

@Shanmugavadivelugopal The demo command is for the multi-scale TRN, but your model is a single-scale TRN. You should change it; see test_video.py, lines 74-90.

Shanmugavadivelugopal commented 5 years ago

Thanks a lot @highway007, I got the result, but with these errors:

Multi-Scale Temporal Relation Network Module in use ['8-frame relation', '7-frame relation', '6-frame relation', '5-frame relation', '4-frame relation', '3-frame relation', '2-frame relation']
Freezing BatchNorm2D except the first one.
Loading frames in sample_data/01_frames
RESULT ON sample_data/01_frames
0.438 -> shoplifting
0.301 -> robbery
0.261 -> steal
Traceback (most recent call last):
  File "test_video.py", line 145, in <module>
    print('{:.3f} -> {}'.format(probs[i], categories[idx[i]]))
IndexError: index 3 is out of bounds for dimension 0 with size 3

Can you please check this... Thanks in advance

highway007 commented 5 years ago

@Shanmugavadivelugopal You set 3 classes, right? This code prints the top-5 predictions; I think changing line 144 to range(0, 3) will work. And I think you have already got the result (top 3) you wanted. :)
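
A slightly more general fix is to let that loop adapt to the number of classes instead of hard-coding it. A sketch of what the loop around test_video.py line 144 could look like; the print statement is the one from the traceback above:

    # print the top predictions, but never more than the number of classes
    for i in range(0, min(5, len(categories))):
        print('{:.3f} -> {}'.format(probs[i], categories[idx[i]]))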

Shanmugavadivelugopal commented 5 years ago

Thanks my friend @highway007 I got the result. :)

Shanmugavadivelugopal commented 5 years ago

@highway007 Hi, sorry to disturb you again. I trained a two-class model, e.g. (shoplifting, non-shoplifting (normal)). But when I tested with my trained model, the accuracy seems to be constant: for all non-shoplifting videos the result was the same (normal 0.505, shoplifting 0.495 for all inputs). Can you please help me get rid of this issue? result

highway007 commented 5 years ago

@Shanmugavadivelugopal Maybe you can download a new video and test on it. If the result is different, it means you put test videos into the training set.

Shanmugavadivelugopal commented 5 years ago

Hi @highway007, I increased the labels to 4 but the result seems to be the same: there is no change in the class probabilities whatever video I give. The result is: Screenshot (184) The training of the model looks like this: Screenshot (183)

Throughout the training process the epoch counter always shows [0/7]. Is this an indication of a bug in the training process? Can you give me a suggestion about this? Thanks in advance

highway007 commented 5 years ago

@Shanmugavadivelugopal Sorry, I have this question as well (it's the issue I opened).

Malathi15 commented 5 years ago

Hi @highway007,

I have some clarifications below. Can you please help me with answers?

I tried with 5 classes (shoplifting, normal, stealing, robbery, burglary). For training I used 30 videos for shopping, 30 for shoplifting, 15 for stealing, 15 for robbery, and 10 for burglary. For my process, I'm using Google Colab for training with 12.72 GB RAM. I created CSVs for training, test, validation, and labels; my CSV files look like this:

This is my label.csv Screenshot (215)

This is my train.csv Screenshot (216)

This is my test.csv Screenshot (217)

This is my validation.csv Screenshot (218)

My train_videofolder.txt file looks like this

00 132 0
01 180 0
02 135 0
03 168 0
04 197 0
05 399 0
06 111 0
07 248 0
08 213 0
09 153 0
10 248 0
11 399 1
12 231 1
13 491 1
14 333 1
15 390 1
16 326 1
..... etc

val_videofolder.txt

40 460 1
41 343 1
42 378 1
43 350 1
44 618 1
45 238 0
46 114 0
47 153 0
48 093 0
49 546 0
69 834 2
78 048 3
79 036 3
80 384 3
81 078 3
87 288 4

category.txt

shoplifting
normal
stealing
robbery
burglary

This is my training code

!python3 main.py something RGB \
                     --arch BNInception --num_segments 8 \
                     --consensus_type TRNmultiscale --batch-size 16

My training looks like this

storing name: TRN_something_RGB_BNInception_TRNmultiscale_segment8

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       8
        new_length:         1
        consensus_module:   TRNmultiscale
        dropout_ratio:      0.8
        img_feature_dim:    256

/content/drive/My Drive/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
Multi-Scale Temporal Relation Network Module in use ['8-frame relation', '7-frame relation', '6-frame relation', '5-frame relation', '4-frame relation', '3-frame relation', '2-frame relation']
video number:59
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:16
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 83 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 83 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
main.py:175: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:176: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:177: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
main.py:186: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  total_norm = clip_grad_norm(model.parameters(), args.clip_gradient)
Epoch: [0][0/4], lr: 0.00100    Time 16.655 (16.655)    Data 4.173 (4.173)  Loss 1.6147 (1.6147)    Prec@1 50.000 (50.000)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [1][0/4], lr: 0.00100    Time 5.917 (5.917)  Data 4.407 (4.407)  Loss 1.6116 (1.6116)    Prec@1 18.750 (18.750)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [2][0/4], lr: 0.00100    Time 5.543 (5.543)  Data 4.102 (4.102)  Loss 1.4904 (1.4904)    Prec@1 25.000 (25.000)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [3][0/4], lr: 0.00100    Time 6.334 (6.334)  Data 4.920 (4.920)  Loss 1.4325 (1.4325)    Prec@1 25.000 (25.000)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [4][0/4], lr: 0.00100    Time 7.225 (7.225)  Data 5.824 (5.824)  Loss 1.4092 (1.4092)    Prec@1 31.250 (31.250)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
main.py:223: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  input_var = torch.autograd.Variable(input, volatile=True)
main.py:224: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  target_var = torch.autograd.Variable(target, volatile=True)
main.py:233: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:234: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:235: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
Test: [0/1] Time 1.686 (1.686)  Loss 1.5454 (1.5454)    Prec@1 31.250 (31.250)  Prec@5 100.000 (100.000)
Testing Results: Prec@1 31.250 Prec@5 100.000 Loss 1.54541

Best Prec@1: 0.000
Freezing BatchNorm2D except the first one.
Epoch: [5][0/4], lr: 0.00100    Time 5.674 (5.674)  Data 4.189 (4.189)  Loss 1.4755 (1.4755)    Prec@1 25.000 (25.000)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [6][0/4], lr: 0.00100    Time 4.719 (4.719)  Data 3.302 (3.302)  Loss 1.5275 (1.5275)    Prec@1 31.250 (31.250)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [7][0/4], lr: 0.00100    Time 4.682 (4.682)  Data 3.281 (3.281)  Loss 1.3586 (1.3586)    Prec@1 31.250 (31.250)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [8][0/4], lr: 0.00100    Time 6.715 (6.715)  Data 5.315 (5.315)  Loss 1.2957 (1.2957)    Prec@1 43.750 (43.750)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [9][0/4], lr: 0.00100    Time 3.891 (3.891)  Data 2.501 (2.501)  Loss 1.2222 (1.2222)    Prec@1 43.750 (43.750)  Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Test: [0/1] Time 1.638 (1.638)  Loss 1.5057 (1.5057)    Prec@1 31.250 (31.250)  Prec@5 100.000 (100.000)
Testing Results: Prec@1 31.250 Prec@5 100.000 Loss 1.50569

Best Prec@1: 31.250
Freezing BatchNorm2D except the first one.

After completing the training, I'm getting the same result for every input video (the probabilities and labels are always the same). This is the result I get for each and every input video:

0.328 -> normal
0.324 -> shoplifting
0.161 -> stealing
0.148 -> robbery
0.039 -> burglary

I have some clarifications: 1) Am I following the right process? 2) In the training log, what does Epoch: [5][0/4] mean? Also, in my training the [0/4] never increases till the end, but in your training I see the following:

Epoch: [993][0/64], lr: 0.00100 Time 3.399 (3.399)  Data 3.113 (3.113)  Loss 1.8708 (1.8708)    Prec@1 25.000 (25.000)  Prec@5 100.000 (100.000)
Epoch: [993][20/64], lr: 0.00100    Time 0.179 (0.336)  Data 0.000 (0.148)  Loss 2.1559 (1.9719)    Prec@1 12.500 (12.500)  Prec@5 37.500 (72.619)
Epoch: [993][40/64], lr: 0.00100    Time 0.179 (0.260)  Data 0.000 (0.076)  Loss 2.0889 (1.9944)    Prec@1 0.000 (13.110)   Prec@5 50.000 (68.902)

Also, my Prec@5 is always Prec@5 100.000 (100.000).

Is this because I'm using Colab for training? The reason I ask is that the Colab training stops at 119 steps (close to an hour of training only). I suspect this is the issue, since I couldn't continue the training for more than an hour. Is there anywhere in the code to configure the training time?

Do I need to use a 1080 Ti or AWS for continuous training of at least 12 hours?

highway007 commented 5 years ago

@Malathi15 Here are some things I can tell you:

1. The Epoch: [5][0/4] is OK; it depends on your batch_size.
2. Prec@5 measures whether the right class is within the top 5; you have 5 classes, so your Prec@5 is always 100.
3. I don't think Colab or AWS is the reason for the bad results.
4. You can change the number of training epochs to control the training time (see opts.py --epochs).

I don't know how to solve it either... sorry.
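
For what it's worth, the second number in the progress counter is just the number of batches per epoch (dataset size divided by batch size, rounded up), so a small dataset gives a small counter. A quick check against the numbers in this thread:

    import math

    print(math.ceil(59 / 16))  # 59 training videos, batch size 16 -> 4 batches, i.e. [0/4]
    print(math.ceil(509 / 8))  # 509 training videos, batch size 8 -> 64 batches, i.e. [0/64]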

Malathi15 commented 5 years ago

Hi @highway007, did you resolve this issue? Can you suggest any useful links for training/testing custom action recognition on datasets like UCF-Crime or Moments? Thanks

highway007 commented 5 years ago

@Malathi15 No, I didn't. :( You can check my starred repos, where I saved some good ones about action recognition.

Malathi15 commented 5 years ago

@highway007 Thank you :)