openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
https://openvinotoolkit.github.io/training_extensions/
Apache License 2.0

Pretrained action recognition model cannot reach the 93.44% score. Where am I going wrong? #93

Closed Xiehuaiqi closed 5 years ago

Xiehuaiqi commented 5 years ago

I want to test the model's score with the command `python3 main.py --dataset ucf101_1 --model se-resnext101-32x4d_vtn_rgbdiff -b128 --lr 1e-5 --seq 16 --st 2 --no-mean-norm --no-std-norm --no-train --no-val --test --pretrain-path ../light_model/se_resnext_101_32x4d_vtn_rgbd_ucf101_s1.pth`, but I only get a score of 84%. How can I get the 93.44% score reported for this model?

AlexanderDokuchaev commented 5 years ago

@vadimadr please take a look

vadimadr commented 5 years ago

Hi @Xiehuaiqi, I double-checked the uploaded model and got the correct score:

* Testing results: clip 0.9093  video 0.9344

Please make sure that you provide the "--no-mean-norm" and "--no-std-norm" options, because when I do not provide them I get a score similar to yours: video 0.8305
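To illustrate why these two options can matter so much: if a model was trained on frames that were only rescaled, then subtracting a mean and dividing by a standard deviation at test time feeds it values from a very different range. The sketch below is only an illustration under stated assumptions. The `preprocess` helper is hypothetical, and the ImageNet channel statistics are assumed values, not taken from this repository.

```python
import numpy as np

# Assumed ImageNet channel statistics on a 0..1 scale (hypothetical here;
# the repository may use different values or a different scale).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def preprocess(frame, mean_norm, std_norm, norm_value=255.0):
    """Rescale an HxWx3 uint8 frame and optionally normalize per channel."""
    x = frame.astype(np.float64) / norm_value
    if mean_norm:
        x = x - MEAN
    if std_norm:
        x = x / STD
    return x


frame = np.full((2, 2, 3), 128, dtype=np.uint8)  # a flat mid-gray frame

# Roughly what "--no-mean-norm --no-std-norm" would correspond to in this sketch.
plain = preprocess(frame, mean_norm=False, std_norm=False)
# What the model would see if both normalizations were left enabled.
normed = preprocess(frame, mean_norm=True, std_norm=True)

print("without normalization:", plain[0, 0])
print("with normalization:   ", normed[0, 0])
```

The two printed pixels differ in both offset and scale, which is the kind of input shift that could plausibly account for a drop in accuracy when the flags are omitted.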

Could you please share the full "output.log" file if the issue still reproduces?
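When comparing two runs, it can help to pull the numbers out of the log rather than read them by eye. The sketch below assumes progress lines of the form `Testing: [i/N] clip c (r) video v`, as they appear in the log posted later in this thread. The `parse_testing_log` helper is hypothetical, and reading the value in parentheses as the running mean of the per-batch clip accuracy is an interpretation that the sample lines happen to agree with.

```python
import re

# One progress entry: step, total, per-batch clip accuracy,
# value in parentheses, and video accuracy.
LINE_RE = re.compile(
    r"Testing:\s+\[(\d+)/(\d+)\]\s+clip\s+([\d.]+)\s+\(([\d.]+)\)\s+video\s+([\d.]+)"
)


def parse_testing_log(text):
    """Return a list of (step, total, clip, clip_running, video) tuples."""
    rows = []
    for m in LINE_RE.finditer(text):
        step, total = int(m.group(1)), int(m.group(2))
        clip, clip_running, video = (float(m.group(i)) for i in (3, 4, 5))
        rows.append((step, total, clip, clip_running, video))
    return rows


# First three entries copied from the log in this thread.
sample = """Testing: [1/296] clip 0.6406 (0.6406) video 0.5833
Testing: [2/296] clip 0.6562 (0.6484) video 0.6800
Testing: [3/296] clip 0.6953 (0.6641) video 0.7368"""

rows = parse_testing_log(sample)
# Recompute the mean of the per-batch clip accuracies seen so far.
running = sum(r[2] for r in rows) / len(rows)

print("last entry:", rows[-1])
print("recomputed running clip mean:", round(running, 4))
```

Because the pattern is matched with `finditer` and tolerates arbitrary whitespace, it also works on a log whose line breaks were lost when it was pasted.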

Xiehuaiqi commented 5 years ago

@vadimadr the output.log is below. I cannot figure out why the score is different. Thanks a lot for your help.

ARGV:
-------------------------------------------------------------------------------
['main.py', '--dataset', 'ucf101_1', '--model', 'se-resnext101-32x4d_vtn_rgbdiff', '-b128', '--lr', '1e-5', '--seq', '16', '--st', '2', '--no-mean-norm', '--no-std-norm', '--no-train', '--no-val', '--test', '--pretrain-path', '../light_model/se_resnext_101_32x4d_vtn_rgbd_ucf101_s1.pth']

CONFIG:

{'annotation_path': '../data/ucf101_01.json', 'arch': 'se-resnext101-32x4d_vtn_rgbdiff', 'batch_size': 128, 'begin_epoch': 1, 'bidirectional_lstm': False, 'checkpoint': 5, 'crop_position_in_test': 'c', 'cuda': True, 'dampening': 0.9, 'dataset': 'ucf101_1', 'dataset_config': None, 'device': device(type='cuda'), 'drop_last': True, 'encoder': 'resnet34', 'flow_path': None, 'fp16': False, 'gamma': 0.1, 'gradient_clipping': None, 'hflip': True, 'hidden_size': 512, 'initial_scale': 1.0, 'iter_size': 1, 'layer_norm': True, 'learning_rate': 1e-05, 'logger': <action_recognition.logging.TrainingLogger object at 0x7f324ca80438>, 'lr_patience': 10, 'lr_step_size': 10, 'manual_seed': 1, 'mean_dataset': 'imagenet', 'mean_norm': False, 'model': 'se-resnext101-32x4d_vtn_rgbdiff', 'model_depth': 18, 'momentum': 0.9, 'motion_path': None, 'n_classes': 101, 'n_epochs': 200, 'n_finetune_classes': None, 'n_scales': 4, 'n_test_clips': 10, 'n_threads': 4, 'n_val_clips': 3, 'nesterov': True, 'norm_value': 255, 'onnx': None, 'optimizer': 'adam', 'pretrain_path': PosixPath('../light_model/se_resnext_101_32x4d_vtn_rgbd_ucf101_s1.pth'), 'resnet_shortcut': 'B', 'resnext_cardinality': 32, 'result_path': PosixPath('results/31'), 'resume_path': None, 'resume_train': True, 'rgb_path': None, 'root_path': None, 'sample_duration': 16, 'sample_size': 224, 'scale_in_test': 1.0, 'scale_step': 0.8408964152537145, 'scales': [1.0, 0.8408964152537145, 0.7071067811865475, 0.5946035575013604], 'scheduler': 'plateau', 'softmax_in_test': False, 'split': 1, 'std_norm': False, 'sync_bn': False, 'teacher_checkpoint': None, 'teacher_model': None, 'tee': <action_recognition.utils.TeedStream object at 0x7f324ca91550>, 'temporal_stride': 2, 'test': True, 'test_subset': 'val', 'time_suffix': '25062208', 'train': False, 'try_resume': True, 'tta': False, 'val': False, 'video_format': 'frames', 'video_path': '../data/ucf101_videos/frames_data', 'weight_decay': 0.0001, 'weighted_sampling': False, 'wide_resnet_k': 2, 
'writer': <tensorboardX.writer.SummaryWriter object at 0x7f324ca913c8>}
Git branch: master
Git rev: 01743cefcd390d5e461506f7cfbe4837df70e07c
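As a side note on reading the config: the `scales` list looks like a geometric sequence, where each entry is the previous one multiplied by `scale_step`, and `scale_step` is numerically 2^(-1/4). The sketch below reproduces the logged values under that assumption. The loop is a reconstruction for illustration, not code copied from the repository.

```python
import math

# Values taken from the CONFIG dump above.
initial_scale = 1.0
n_scales = 4
scale_step = 2 ** (-1 / 4)  # assumption: the logged 0.8408964152537145 is 2^(-1/4)

# Build the sequence by repeated multiplication (assumed, not confirmed).
scales = [initial_scale]
for _ in range(1, n_scales):
    scales.append(scales[-1] * scale_step)

# The list as it appears in the log, for comparison.
logged = [1.0, 0.8408964152537145, 0.7071067811865475, 0.5946035575013604]

for ours, theirs in zip(scales, logged):
    print(ours, theirs, math.isclose(ours, theirs, rel_tol=1e-12))
```

Note that with `'test': True` and `'scale_in_test': 1.0` these multi-scale values are most likely relevant to training-time cropping only, so they are probably not related to the score difference discussed here.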

loading pretrained model ../light_model/se_resnext_101_32x4d_vtn_rgbd_ucf101_s1.pth motion_decoder.resnet.0.conv1.weight -> trainable motion_decoder.resnet.0.bn1.weight -> trainable motion_decoder.resnet.0.bn1.bias -> trainable motion_decoder.resnet.1.0.conv1.weight -> trainable motion_decoder.resnet.1.0.bn1.weight -> trainable motion_decoder.resnet.1.0.bn1.bias -> trainable motion_decoder.resnet.1.0.conv2.weight -> trainable motion_decoder.resnet.1.0.bn2.weight -> trainable motion_decoder.resnet.1.0.bn2.bias -> trainable motion_decoder.resnet.1.0.conv3.weight -> trainable motion_decoder.resnet.1.0.bn3.weight -> trainable motion_decoder.resnet.1.0.bn3.bias -> trainable motion_decoder.resnet.1.0.se_module.fc1.weight -> trainable motion_decoder.resnet.1.0.se_module.fc1.bias -> trainable motion_decoder.resnet.1.0.se_module.fc2.weight -> trainable motion_decoder.resnet.1.0.se_module.fc2.bias -> trainable motion_decoder.resnet.1.0.downsample.0.weight -> trainable motion_decoder.resnet.1.0.downsample.1.weight -> trainable motion_decoder.resnet.1.0.downsample.1.bias -> trainable motion_decoder.resnet.1.1.conv1.weight -> trainable motion_decoder.resnet.1.1.bn1.weight -> trainable motion_decoder.resnet.1.1.bn1.bias -> trainable motion_decoder.resnet.1.1.conv2.weight -> trainable motion_decoder.resnet.1.1.bn2.weight -> trainable motion_decoder.resnet.1.1.bn2.bias -> trainable motion_decoder.resnet.1.1.conv3.weight -> trainable motion_decoder.resnet.1.1.bn3.weight -> trainable motion_decoder.resnet.1.1.bn3.bias -> trainable motion_decoder.resnet.1.1.se_module.fc1.weight -> trainable motion_decoder.resnet.1.1.se_module.fc1.bias -> trainable motion_decoder.resnet.1.1.se_module.fc2.weight -> trainable motion_decoder.resnet.1.1.se_module.fc2.bias -> trainable motion_decoder.resnet.1.2.conv1.weight -> trainable motion_decoder.resnet.1.2.bn1.weight -> trainable motion_decoder.resnet.1.2.bn1.bias -> trainable motion_decoder.resnet.1.2.conv2.weight -> trainable 
motion_decoder.resnet.1.2.bn2.weight -> trainable motion_decoder.resnet.1.2.bn2.bias -> trainable motion_decoder.resnet.1.2.conv3.weight -> trainable motion_decoder.resnet.1.2.bn3.weight -> trainable motion_decoder.resnet.1.2.bn3.bias -> trainable motion_decoder.resnet.1.2.se_module.fc1.weight -> trainable motion_decoder.resnet.1.2.se_module.fc1.bias -> trainable motion_decoder.resnet.1.2.se_module.fc2.weight -> trainable motion_decoder.resnet.1.2.se_module.fc2.bias -> trainable motion_decoder.resnet.2.0.conv1.weight -> trainable motion_decoder.resnet.2.0.bn1.weight -> trainable motion_decoder.resnet.2.0.bn1.bias -> trainable motion_decoder.resnet.2.0.conv2.weight -> trainable motion_decoder.resnet.2.0.bn2.weight -> trainable motion_decoder.resnet.2.0.bn2.bias -> trainable motion_decoder.resnet.2.0.conv3.weight -> trainable motion_decoder.resnet.2.0.bn3.weight -> trainable motion_decoder.resnet.2.0.bn3.bias -> trainable motion_decoder.resnet.2.0.se_module.fc1.weight -> trainable motion_decoder.resnet.2.0.se_module.fc1.bias -> trainable motion_decoder.resnet.2.0.se_module.fc2.weight -> trainable motion_decoder.resnet.2.0.se_module.fc2.bias -> trainable motion_decoder.resnet.2.0.downsample.0.weight -> trainable motion_decoder.resnet.2.0.downsample.1.weight -> trainable motion_decoder.resnet.2.0.downsample.1.bias -> trainable motion_decoder.resnet.2.1.conv1.weight -> trainable motion_decoder.resnet.2.1.bn1.weight -> trainable motion_decoder.resnet.2.1.bn1.bias -> trainable motion_decoder.resnet.2.1.conv2.weight -> trainable motion_decoder.resnet.2.1.bn2.weight -> trainable motion_decoder.resnet.2.1.bn2.bias -> trainable motion_decoder.resnet.2.1.conv3.weight -> trainable motion_decoder.resnet.2.1.bn3.weight -> trainable motion_decoder.resnet.2.1.bn3.bias -> trainable motion_decoder.resnet.2.1.se_module.fc1.weight -> trainable motion_decoder.resnet.2.1.se_module.fc1.bias -> trainable motion_decoder.resnet.2.1.se_module.fc2.weight -> trainable 
motion_decoder.resnet.2.1.se_module.fc2.bias -> trainable motion_decoder.resnet.2.2.conv1.weight -> trainable motion_decoder.resnet.2.2.bn1.weight -> trainable motion_decoder.resnet.2.2.bn1.bias -> trainable motion_decoder.resnet.2.2.conv2.weight -> trainable motion_decoder.resnet.2.2.bn2.weight -> trainable motion_decoder.resnet.2.2.bn2.bias -> trainable motion_decoder.resnet.2.2.conv3.weight -> trainable motion_decoder.resnet.2.2.bn3.weight -> trainable motion_decoder.resnet.2.2.bn3.bias -> trainable motion_decoder.resnet.2.2.se_module.fc1.weight -> trainable motion_decoder.resnet.2.2.se_module.fc1.bias -> trainable motion_decoder.resnet.2.2.se_module.fc2.weight -> trainable motion_decoder.resnet.2.2.se_module.fc2.bias -> trainable motion_decoder.resnet.2.3.conv1.weight -> trainable motion_decoder.resnet.2.3.bn1.weight -> trainable motion_decoder.resnet.2.3.bn1.bias -> trainable motion_decoder.resnet.2.3.conv2.weight -> trainable motion_decoder.resnet.2.3.bn2.weight -> trainable motion_decoder.resnet.2.3.bn2.bias -> trainable motion_decoder.resnet.2.3.conv3.weight -> trainable motion_decoder.resnet.2.3.bn3.weight -> trainable motion_decoder.resnet.2.3.bn3.bias -> trainable motion_decoder.resnet.2.3.se_module.fc1.weight -> trainable motion_decoder.resnet.2.3.se_module.fc1.bias -> trainable motion_decoder.resnet.2.3.se_module.fc2.weight -> trainable motion_decoder.resnet.2.3.se_module.fc2.bias -> trainable motion_decoder.resnet.3.0.conv1.weight -> trainable motion_decoder.resnet.3.0.bn1.weight -> trainable motion_decoder.resnet.3.0.bn1.bias -> trainable motion_decoder.resnet.3.0.conv2.weight -> trainable motion_decoder.resnet.3.0.bn2.weight -> trainable motion_decoder.resnet.3.0.bn2.bias -> trainable motion_decoder.resnet.3.0.conv3.weight -> trainable motion_decoder.resnet.3.0.bn3.weight -> trainable motion_decoder.resnet.3.0.bn3.bias -> trainable motion_decoder.resnet.3.0.se_module.fc1.weight -> trainable motion_decoder.resnet.3.0.se_module.fc1.bias -> trainable 
motion_decoder.resnet.3.0.se_module.fc2.weight -> trainable motion_decoder.resnet.3.0.se_module.fc2.bias -> trainable motion_decoder.resnet.3.0.downsample.0.weight -> trainable motion_decoder.resnet.3.0.downsample.1.weight -> trainable motion_decoder.resnet.3.0.downsample.1.bias -> trainable motion_decoder.resnet.3.1.conv1.weight -> trainable motion_decoder.resnet.3.1.bn1.weight -> trainable motion_decoder.resnet.3.1.bn1.bias -> trainable motion_decoder.resnet.3.1.conv2.weight -> trainable motion_decoder.resnet.3.1.bn2.weight -> trainable motion_decoder.resnet.3.1.bn2.bias -> trainable motion_decoder.resnet.3.1.conv3.weight -> trainable motion_decoder.resnet.3.1.bn3.weight -> trainable motion_decoder.resnet.3.1.bn3.bias -> trainable motion_decoder.resnet.3.1.se_module.fc1.weight -> trainable motion_decoder.resnet.3.1.se_module.fc1.bias -> trainable motion_decoder.resnet.3.1.se_module.fc2.weight -> trainable motion_decoder.resnet.3.1.se_module.fc2.bias -> trainable motion_decoder.resnet.3.2.conv1.weight -> trainable motion_decoder.resnet.3.2.bn1.weight -> trainable motion_decoder.resnet.3.2.bn1.bias -> trainable motion_decoder.resnet.3.2.conv2.weight -> trainable motion_decoder.resnet.3.2.bn2.weight -> trainable motion_decoder.resnet.3.2.bn2.bias -> trainable motion_decoder.resnet.3.2.conv3.weight -> trainable motion_decoder.resnet.3.2.bn3.weight -> trainable motion_decoder.resnet.3.2.bn3.bias -> trainable motion_decoder.resnet.3.2.se_module.fc1.weight -> trainable motion_decoder.resnet.3.2.se_module.fc1.bias -> trainable motion_decoder.resnet.3.2.se_module.fc2.weight -> trainable motion_decoder.resnet.3.2.se_module.fc2.bias -> trainable motion_decoder.resnet.3.3.conv1.weight -> trainable motion_decoder.resnet.3.3.bn1.weight -> trainable motion_decoder.resnet.3.3.bn1.bias -> trainable motion_decoder.resnet.3.3.conv2.weight -> trainable motion_decoder.resnet.3.3.bn2.weight -> trainable motion_decoder.resnet.3.3.bn2.bias -> trainable 
motion_decoder.resnet.3.3.conv3.weight -> trainable motion_decoder.resnet.3.3.bn3.weight -> trainable motion_decoder.resnet.3.3.bn3.bias -> trainable motion_decoder.resnet.3.3.se_module.fc1.weight -> trainable motion_decoder.resnet.3.3.se_module.fc1.bias -> trainable motion_decoder.resnet.3.3.se_module.fc2.weight -> trainable motion_decoder.resnet.3.3.se_module.fc2.bias -> trainable motion_decoder.resnet.3.4.conv1.weight -> trainable motion_decoder.resnet.3.4.bn1.weight -> trainable motion_decoder.resnet.3.4.bn1.bias -> trainable motion_decoder.resnet.3.4.conv2.weight -> trainable motion_decoder.resnet.3.4.bn2.weight -> trainable motion_decoder.resnet.3.4.bn2.bias -> trainable motion_decoder.resnet.3.4.conv3.weight -> trainable motion_decoder.resnet.3.4.bn3.weight -> trainable motion_decoder.resnet.3.4.bn3.bias -> trainable motion_decoder.resnet.3.4.se_module.fc1.weight -> trainable motion_decoder.resnet.3.4.se_module.fc1.bias -> trainable motion_decoder.resnet.3.4.se_module.fc2.weight -> trainable motion_decoder.resnet.3.4.se_module.fc2.bias -> trainable motion_decoder.resnet.3.5.conv1.weight -> trainable motion_decoder.resnet.3.5.bn1.weight -> trainable motion_decoder.resnet.3.5.bn1.bias -> trainable motion_decoder.resnet.3.5.conv2.weight -> trainable motion_decoder.resnet.3.5.bn2.weight -> trainable motion_decoder.resnet.3.5.bn2.bias -> trainable motion_decoder.resnet.3.5.conv3.weight -> trainable motion_decoder.resnet.3.5.bn3.weight -> trainable motion_decoder.resnet.3.5.bn3.bias -> trainable motion_decoder.resnet.3.5.se_module.fc1.weight -> trainable motion_decoder.resnet.3.5.se_module.fc1.bias -> trainable motion_decoder.resnet.3.5.se_module.fc2.weight -> trainable motion_decoder.resnet.3.5.se_module.fc2.bias -> trainable motion_decoder.resnet.3.6.conv1.weight -> trainable motion_decoder.resnet.3.6.bn1.weight -> trainable motion_decoder.resnet.3.6.bn1.bias -> trainable motion_decoder.resnet.3.6.conv2.weight -> trainable motion_decoder.resnet.3.6.bn2.weight -> 
trainable motion_decoder.resnet.3.6.bn2.bias -> trainable motion_decoder.resnet.3.6.conv3.weight -> trainable motion_decoder.resnet.3.6.bn3.weight -> trainable motion_decoder.resnet.3.6.bn3.bias -> trainable motion_decoder.resnet.3.6.se_module.fc1.weight -> trainable motion_decoder.resnet.3.6.se_module.fc1.bias -> trainable motion_decoder.resnet.3.6.se_module.fc2.weight -> trainable motion_decoder.resnet.3.6.se_module.fc2.bias -> trainable motion_decoder.resnet.3.7.conv1.weight -> trainable motion_decoder.resnet.3.7.bn1.weight -> trainable motion_decoder.resnet.3.7.bn1.bias -> trainable motion_decoder.resnet.3.7.conv2.weight -> trainable motion_decoder.resnet.3.7.bn2.weight -> trainable motion_decoder.resnet.3.7.bn2.bias -> trainable motion_decoder.resnet.3.7.conv3.weight -> trainable motion_decoder.resnet.3.7.bn3.weight -> trainable motion_decoder.resnet.3.7.bn3.bias -> trainable motion_decoder.resnet.3.7.se_module.fc1.weight -> trainable motion_decoder.resnet.3.7.se_module.fc1.bias -> trainable motion_decoder.resnet.3.7.se_module.fc2.weight -> trainable motion_decoder.resnet.3.7.se_module.fc2.bias -> trainable motion_decoder.resnet.3.8.conv1.weight -> trainable motion_decoder.resnet.3.8.bn1.weight -> trainable motion_decoder.resnet.3.8.bn1.bias -> trainable motion_decoder.resnet.3.8.conv2.weight -> trainable motion_decoder.resnet.3.8.bn2.weight -> trainable motion_decoder.resnet.3.8.bn2.bias -> trainable motion_decoder.resnet.3.8.conv3.weight -> trainable motion_decoder.resnet.3.8.bn3.weight -> trainable motion_decoder.resnet.3.8.bn3.bias -> trainable motion_decoder.resnet.3.8.se_module.fc1.weight -> trainable motion_decoder.resnet.3.8.se_module.fc1.bias -> trainable motion_decoder.resnet.3.8.se_module.fc2.weight -> trainable motion_decoder.resnet.3.8.se_module.fc2.bias -> trainable motion_decoder.resnet.3.9.conv1.weight -> trainable motion_decoder.resnet.3.9.bn1.weight -> trainable motion_decoder.resnet.3.9.bn1.bias -> trainable 
motion_decoder.resnet.3.9.conv2.weight -> trainable motion_decoder.resnet.3.9.bn2.weight -> trainable motion_decoder.resnet.3.9.bn2.bias -> trainable motion_decoder.resnet.3.9.conv3.weight -> trainable motion_decoder.resnet.3.9.bn3.weight -> trainable motion_decoder.resnet.3.9.bn3.bias -> trainable motion_decoder.resnet.3.9.se_module.fc1.weight -> trainable motion_decoder.resnet.3.9.se_module.fc1.bias -> trainable motion_decoder.resnet.3.9.se_module.fc2.weight -> trainable motion_decoder.resnet.3.9.se_module.fc2.bias -> trainable motion_decoder.resnet.3.10.conv1.weight -> trainable motion_decoder.resnet.3.10.bn1.weight -> trainable motion_decoder.resnet.3.10.bn1.bias -> trainable motion_decoder.resnet.3.10.conv2.weight -> trainable motion_decoder.resnet.3.10.bn2.weight -> trainable motion_decoder.resnet.3.10.bn2.bias -> trainable motion_decoder.resnet.3.10.conv3.weight -> trainable motion_decoder.resnet.3.10.bn3.weight -> trainable motion_decoder.resnet.3.10.bn3.bias -> trainable motion_decoder.resnet.3.10.se_module.fc1.weight -> trainable motion_decoder.resnet.3.10.se_module.fc1.bias -> trainable motion_decoder.resnet.3.10.se_module.fc2.weight -> trainable motion_decoder.resnet.3.10.se_module.fc2.bias -> trainable motion_decoder.resnet.3.11.conv1.weight -> trainable motion_decoder.resnet.3.11.bn1.weight -> trainable motion_decoder.resnet.3.11.bn1.bias -> trainable motion_decoder.resnet.3.11.conv2.weight -> trainable motion_decoder.resnet.3.11.bn2.weight -> trainable motion_decoder.resnet.3.11.bn2.bias -> trainable motion_decoder.resnet.3.11.conv3.weight -> trainable motion_decoder.resnet.3.11.bn3.weight -> trainable motion_decoder.resnet.3.11.bn3.bias -> trainable motion_decoder.resnet.3.11.se_module.fc1.weight -> trainable motion_decoder.resnet.3.11.se_module.fc1.bias -> trainable motion_decoder.resnet.3.11.se_module.fc2.weight -> trainable motion_decoder.resnet.3.11.se_module.fc2.bias -> trainable motion_decoder.resnet.3.12.conv1.weight -> trainable 
motion_decoder.resnet.3.12.bn1.weight -> trainable motion_decoder.resnet.3.12.bn1.bias -> trainable motion_decoder.resnet.3.12.conv2.weight -> trainable motion_decoder.resnet.3.12.bn2.weight -> trainable motion_decoder.resnet.3.12.bn2.bias -> trainable motion_decoder.resnet.3.12.conv3.weight -> trainable motion_decoder.resnet.3.12.bn3.weight -> trainable motion_decoder.resnet.3.12.bn3.bias -> trainable motion_decoder.resnet.3.12.se_module.fc1.weight -> trainable motion_decoder.resnet.3.12.se_module.fc1.bias -> trainable motion_decoder.resnet.3.12.se_module.fc2.weight -> trainable motion_decoder.resnet.3.12.se_module.fc2.bias -> trainable motion_decoder.resnet.3.13.conv1.weight -> trainable motion_decoder.resnet.3.13.bn1.weight -> trainable motion_decoder.resnet.3.13.bn1.bias -> trainable motion_decoder.resnet.3.13.conv2.weight -> trainable motion_decoder.resnet.3.13.bn2.weight -> trainable motion_decoder.resnet.3.13.bn2.bias -> trainable motion_decoder.resnet.3.13.conv3.weight -> trainable motion_decoder.resnet.3.13.bn3.weight -> trainable motion_decoder.resnet.3.13.bn3.bias -> trainable motion_decoder.resnet.3.13.se_module.fc1.weight -> trainable motion_decoder.resnet.3.13.se_module.fc1.bias -> trainable motion_decoder.resnet.3.13.se_module.fc2.weight -> trainable motion_decoder.resnet.3.13.se_module.fc2.bias -> trainable motion_decoder.resnet.3.14.conv1.weight -> trainable motion_decoder.resnet.3.14.bn1.weight -> trainable motion_decoder.resnet.3.14.bn1.bias -> trainable motion_decoder.resnet.3.14.conv2.weight -> trainable motion_decoder.resnet.3.14.bn2.weight -> trainable motion_decoder.resnet.3.14.bn2.bias -> trainable motion_decoder.resnet.3.14.conv3.weight -> trainable motion_decoder.resnet.3.14.bn3.weight -> trainable motion_decoder.resnet.3.14.bn3.bias -> trainable motion_decoder.resnet.3.14.se_module.fc1.weight -> trainable motion_decoder.resnet.3.14.se_module.fc1.bias -> trainable motion_decoder.resnet.3.14.se_module.fc2.weight -> trainable 
motion_decoder.resnet.3.14.se_module.fc2.bias -> trainable motion_decoder.resnet.3.15.conv1.weight -> trainable motion_decoder.resnet.3.15.bn1.weight -> trainable motion_decoder.resnet.3.15.bn1.bias -> trainable motion_decoder.resnet.3.15.conv2.weight -> trainable motion_decoder.resnet.3.15.bn2.weight -> trainable motion_decoder.resnet.3.15.bn2.bias -> trainable motion_decoder.resnet.3.15.conv3.weight -> trainable motion_decoder.resnet.3.15.bn3.weight -> trainable motion_decoder.resnet.3.15.bn3.bias -> trainable motion_decoder.resnet.3.15.se_module.fc1.weight -> trainable motion_decoder.resnet.3.15.se_module.fc1.bias -> trainable motion_decoder.resnet.3.15.se_module.fc2.weight -> trainable motion_decoder.resnet.3.15.se_module.fc2.bias -> trainable motion_decoder.resnet.3.16.conv1.weight -> trainable motion_decoder.resnet.3.16.bn1.weight -> trainable motion_decoder.resnet.3.16.bn1.bias -> trainable motion_decoder.resnet.3.16.conv2.weight -> trainable motion_decoder.resnet.3.16.bn2.weight -> trainable motion_decoder.resnet.3.16.bn2.bias -> trainable motion_decoder.resnet.3.16.conv3.weight -> trainable motion_decoder.resnet.3.16.bn3.weight -> trainable motion_decoder.resnet.3.16.bn3.bias -> trainable motion_decoder.resnet.3.16.se_module.fc1.weight -> trainable motion_decoder.resnet.3.16.se_module.fc1.bias -> trainable motion_decoder.resnet.3.16.se_module.fc2.weight -> trainable motion_decoder.resnet.3.16.se_module.fc2.bias -> trainable motion_decoder.resnet.3.17.conv1.weight -> trainable motion_decoder.resnet.3.17.bn1.weight -> trainable motion_decoder.resnet.3.17.bn1.bias -> trainable motion_decoder.resnet.3.17.conv2.weight -> trainable motion_decoder.resnet.3.17.bn2.weight -> trainable motion_decoder.resnet.3.17.bn2.bias -> trainable motion_decoder.resnet.3.17.conv3.weight -> trainable motion_decoder.resnet.3.17.bn3.weight -> trainable motion_decoder.resnet.3.17.bn3.bias -> trainable motion_decoder.resnet.3.17.se_module.fc1.weight -> trainable 
motion_decoder.resnet.3.17.se_module.fc1.bias -> trainable motion_decoder.resnet.3.17.se_module.fc2.weight -> trainable motion_decoder.resnet.3.17.se_module.fc2.bias -> trainable motion_decoder.resnet.3.18.conv1.weight -> trainable motion_decoder.resnet.3.18.bn1.weight -> trainable motion_decoder.resnet.3.18.bn1.bias -> trainable motion_decoder.resnet.3.18.conv2.weight -> trainable motion_decoder.resnet.3.18.bn2.weight -> trainable motion_decoder.resnet.3.18.bn2.bias -> trainable motion_decoder.resnet.3.18.conv3.weight -> trainable motion_decoder.resnet.3.18.bn3.weight -> trainable motion_decoder.resnet.3.18.bn3.bias -> trainable motion_decoder.resnet.3.18.se_module.fc1.weight -> trainable motion_decoder.resnet.3.18.se_module.fc1.bias -> trainable motion_decoder.resnet.3.18.se_module.fc2.weight -> trainable motion_decoder.resnet.3.18.se_module.fc2.bias -> trainable motion_decoder.resnet.3.19.conv1.weight -> trainable motion_decoder.resnet.3.19.bn1.weight -> trainable motion_decoder.resnet.3.19.bn1.bias -> trainable motion_decoder.resnet.3.19.conv2.weight -> trainable motion_decoder.resnet.3.19.bn2.weight -> trainable motion_decoder.resnet.3.19.bn2.bias -> trainable motion_decoder.resnet.3.19.conv3.weight -> trainable motion_decoder.resnet.3.19.bn3.weight -> trainable motion_decoder.resnet.3.19.bn3.bias -> trainable motion_decoder.resnet.3.19.se_module.fc1.weight -> trainable motion_decoder.resnet.3.19.se_module.fc1.bias -> trainable motion_decoder.resnet.3.19.se_module.fc2.weight -> trainable motion_decoder.resnet.3.19.se_module.fc2.bias -> trainable motion_decoder.resnet.3.20.conv1.weight -> trainable motion_decoder.resnet.3.20.bn1.weight -> trainable motion_decoder.resnet.3.20.bn1.bias -> trainable motion_decoder.resnet.3.20.conv2.weight -> trainable motion_decoder.resnet.3.20.bn2.weight -> trainable motion_decoder.resnet.3.20.bn2.bias -> trainable motion_decoder.resnet.3.20.conv3.weight -> trainable motion_decoder.resnet.3.20.bn3.weight -> trainable 
motion_decoder.resnet.3.20.bn3.bias -> trainable motion_decoder.resnet.3.20.se_module.fc1.weight -> trainable motion_decoder.resnet.3.20.se_module.fc1.bias -> trainable motion_decoder.resnet.3.20.se_module.fc2.weight -> trainable motion_decoder.resnet.3.20.se_module.fc2.bias -> trainable motion_decoder.resnet.3.21.conv1.weight -> trainable motion_decoder.resnet.3.21.bn1.weight -> trainable motion_decoder.resnet.3.21.bn1.bias -> trainable motion_decoder.resnet.3.21.conv2.weight -> trainable motion_decoder.resnet.3.21.bn2.weight -> trainable motion_decoder.resnet.3.21.bn2.bias -> trainable motion_decoder.resnet.3.21.conv3.weight -> trainable motion_decoder.resnet.3.21.bn3.weight -> trainable motion_decoder.resnet.3.21.bn3.bias -> trainable motion_decoder.resnet.3.21.se_module.fc1.weight -> trainable motion_decoder.resnet.3.21.se_module.fc1.bias -> trainable motion_decoder.resnet.3.21.se_module.fc2.weight -> trainable motion_decoder.resnet.3.21.se_module.fc2.bias -> trainable motion_decoder.resnet.3.22.conv1.weight -> trainable motion_decoder.resnet.3.22.bn1.weight -> trainable motion_decoder.resnet.3.22.bn1.bias -> trainable motion_decoder.resnet.3.22.conv2.weight -> trainable motion_decoder.resnet.3.22.bn2.weight -> trainable motion_decoder.resnet.3.22.bn2.bias -> trainable motion_decoder.resnet.3.22.conv3.weight -> trainable motion_decoder.resnet.3.22.bn3.weight -> trainable motion_decoder.resnet.3.22.bn3.bias -> trainable motion_decoder.resnet.3.22.se_module.fc1.weight -> trainable motion_decoder.resnet.3.22.se_module.fc1.bias -> trainable motion_decoder.resnet.3.22.se_module.fc2.weight -> trainable motion_decoder.resnet.3.22.se_module.fc2.bias -> trainable motion_decoder.resnet.4.0.conv1.weight -> trainable motion_decoder.resnet.4.0.bn1.weight -> trainable motion_decoder.resnet.4.0.bn1.bias -> trainable motion_decoder.resnet.4.0.conv2.weight -> trainable motion_decoder.resnet.4.0.bn2.weight -> trainable motion_decoder.resnet.4.0.bn2.bias -> trainable 
motion_decoder.resnet.4.0.conv3.weight -> trainable motion_decoder.resnet.4.0.bn3.weight -> trainable motion_decoder.resnet.4.0.bn3.bias -> trainable motion_decoder.resnet.4.0.se_module.fc1.weight -> trainable motion_decoder.resnet.4.0.se_module.fc1.bias -> trainable motion_decoder.resnet.4.0.se_module.fc2.weight -> trainable motion_decoder.resnet.4.0.se_module.fc2.bias -> trainable motion_decoder.resnet.4.0.downsample.0.weight -> trainable motion_decoder.resnet.4.0.downsample.1.weight -> trainable motion_decoder.resnet.4.0.downsample.1.bias -> trainable motion_decoder.resnet.4.1.conv1.weight -> trainable motion_decoder.resnet.4.1.bn1.weight -> trainable motion_decoder.resnet.4.1.bn1.bias -> trainable motion_decoder.resnet.4.1.conv2.weight -> trainable motion_decoder.resnet.4.1.bn2.weight -> trainable motion_decoder.resnet.4.1.bn2.bias -> trainable motion_decoder.resnet.4.1.conv3.weight -> trainable motion_decoder.resnet.4.1.bn3.weight -> trainable motion_decoder.resnet.4.1.bn3.bias -> trainable motion_decoder.resnet.4.1.se_module.fc1.weight -> trainable motion_decoder.resnet.4.1.se_module.fc1.bias -> trainable motion_decoder.resnet.4.1.se_module.fc2.weight -> trainable motion_decoder.resnet.4.1.se_module.fc2.bias -> trainable motion_decoder.resnet.4.2.conv1.weight -> trainable motion_decoder.resnet.4.2.bn1.weight -> trainable motion_decoder.resnet.4.2.bn1.bias -> trainable motion_decoder.resnet.4.2.conv2.weight -> trainable motion_decoder.resnet.4.2.bn2.weight -> trainable motion_decoder.resnet.4.2.bn2.bias -> trainable motion_decoder.resnet.4.2.conv3.weight -> trainable motion_decoder.resnet.4.2.bn3.weight -> trainable motion_decoder.resnet.4.2.bn3.bias -> trainable motion_decoder.resnet.4.2.se_module.fc1.weight -> trainable motion_decoder.resnet.4.2.se_module.fc1.bias -> trainable motion_decoder.resnet.4.2.se_module.fc2.weight -> trainable motion_decoder.resnet.4.2.se_module.fc2.bias -> trainable motion_decoder.reduce_conv.weight -> trainable 
motion_decoder.reduce_conv.bias -> trainable motion_decoder.self_attention_decoder.position_encoding.enc.weight -> trainable motion_decoder.self_attention_decoder.layers.0.slf_attn.w_qs -> trainable motion_decoder.self_attention_decoder.layers.0.slf_attn.w_ks -> trainable motion_decoder.self_attention_decoder.layers.0.slf_attn.w_vs -> trainable motion_decoder.self_attention_decoder.layers.0.slf_attn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.0.slf_attn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.w_1.weight -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.w_1.bias -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.w_2.weight -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.w_2.bias -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.0.pos_ffn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.1.slf_attn.w_qs -> trainable motion_decoder.self_attention_decoder.layers.1.slf_attn.w_ks -> trainable motion_decoder.self_attention_decoder.layers.1.slf_attn.w_vs -> trainable motion_decoder.self_attention_decoder.layers.1.slf_attn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.1.slf_attn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.w_1.weight -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.w_1.bias -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.w_2.weight -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.w_2.bias -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.1.pos_ffn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.2.slf_attn.w_qs -> trainable motion_decoder.self_attention_decoder.layers.2.slf_attn.w_ks -> 
trainable motion_decoder.self_attention_decoder.layers.2.slf_attn.w_vs -> trainable motion_decoder.self_attention_decoder.layers.2.slf_attn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.2.slf_attn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.w_1.weight -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.w_1.bias -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.w_2.weight -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.w_2.bias -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.2.pos_ffn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.3.slf_attn.w_qs -> trainable motion_decoder.self_attention_decoder.layers.3.slf_attn.w_ks -> trainable motion_decoder.self_attention_decoder.layers.3.slf_attn.w_vs -> trainable motion_decoder.self_attention_decoder.layers.3.slf_attn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.3.slf_attn.layer_norm.b_2 -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.w_1.weight -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.w_1.bias -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.w_2.weight -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.w_2.bias -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.layer_norm.a_2 -> trainable motion_decoder.self_attention_decoder.layers.3.pos_ffn.layer_norm.b_2 -> trainable motion_decoder.fc.weight -> trainable motion_decoder.fc.bias -> trainable test /usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py:489: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument. 
result = self.forward(*input, **kwargs)
Testing: [1/296] clip 0.6406 (0.6406) video 0.5833
Testing: [2/296] clip 0.6562 (0.6484) video 0.6800
Testing: [3/296] clip 0.6953 (0.6641) video 0.7368
Testing: [4/296] clip 0.5781 (0.6426) video 0.6863
Testing: [5/296] clip 0.7578 (0.6656) video 0.7460
Testing: [6/296] clip 0.7344 (0.6771) video 0.7895
Testing: [7/296] clip 0.8359 (0.6998) video 0.7978
Testing: [8/296] clip 0.7344 (0.7041) video 0.8235
Testing: [9/296] clip 0.8984 (0.7257) video 0.8435
Testing: [10/296] clip 0.9531 (0.7484) video 0.8504
Testing: [11/296] clip 0.9609 (0.7678) video 0.8643
Testing: [12/296] clip 0.7656 (0.7676) video 0.8758
Testing: [13/296] clip 0.7422 (0.7656) video 0.8735
Testing: [14/296] clip 0.5469 (0.7500) video 0.8547
Testing: [15/296] clip 0.8672 (0.7578) video 0.8639
Testing: [16/296] clip 0.8984 (0.7666) video 0.8676
Testing: [17/296] clip 0.9297 (0.7762) video 0.8756
Testing: [18/296] clip 0.6250 (0.7678) video 0.8565
Testing: [19/296] clip 0.7578 (0.7673) video 0.8560
Testing: [20/296] clip 0.8438 (0.7711) video 0.8627
Testing: [21/296] clip 0.9688 (0.7805) video 0.8694
Testing: [22/296] clip 1.0000 (0.7905) video 0.8754
Testing: [23/296] clip 0.9297 (0.7965) video 0.8776
Testing: [24/296] clip 0.7422 (0.7943) video 0.8730
Testing: [25/296] clip 0.8047 (0.7947) video 0.8746
Testing: [26/296] clip 0.8906 (0.7984) video 0.8765
Testing: [27/296] clip 0.9453 (0.8038) video 0.8812
Testing: [28/296] clip 1.0000 (0.8108) video 0.8855
Testing: [29/296] clip 1.0000 (0.8173) video 0.8895
Testing: [30/296] clip 0.5234 (0.8076) video 0.8773
Testing: [31/296] clip 1.0000 (0.8138) video 0.8813
Testing: [32/296] clip 0.8281 (0.8142) video 0.8851
Testing: [33/296] clip 1.0000 (0.8198) video 0.8886
Testing: [34/296] clip 0.9609 (0.8240) video 0.8920
Testing: [35/296] clip 0.9531 (0.8277) video 0.8949
Testing: [36/296] clip 1.0000 (0.8325) video 0.8978
Testing: [37/296] clip 0.9531 (0.8357) video 0.9006
Testing: [38/296] clip 0.4922 (0.8267) video 0.8930
Testing: [39/296] clip 0.7422 (0.8245) video 0.8938
Testing: [40/296] clip 0.7969 (0.8238) video 0.8943
Testing: [41/296] clip 0.9844 (0.8277) video 0.8969
Testing: [42/296] clip 1.0000 (0.8318) video 0.8994
Testing: [43/296] clip 0.8047 (0.8312) video 0.9000
Testing: [44/296] clip 0.5000 (0.8237) video 0.8917
Testing: [45/296] clip 0.9141 (0.8257) video 0.8922
Testing: [46/296] clip 0.9141 (0.8276) video 0.8946
Testing: [47/296] clip 0.4141 (0.8188) video 0.8869
Testing: [48/296] clip 0.9219 (0.8210) video 0.8876
Testing: [49/296] clip 0.7188 (0.8189) video 0.8852
Testing: [50/296] clip 0.9375 (0.8213) video 0.8873
Testing: [51/296] clip 0.5234 (0.8154) video 0.8804
Testing: [52/296] clip 0.9844 (0.8187) video 0.8812
Testing: [53/296] clip 0.9766 (0.8216) video 0.8835
Testing: [54/296] clip 0.8984 (0.8231) video 0.8842
Testing: [55/296] clip 0.9922 (0.8261) video 0.8862
Testing: [56/296] clip 0.8672 (0.8269) video 0.8869
Testing: [57/296] clip 0.9219 (0.8285) video 0.8889
Testing: [58/296] clip 0.4609 (0.8222) video 0.8814
Testing: [59/296] clip 0.3906 (0.8149) video 0.8755
Testing: [60/296] clip 0.9375 (0.8169) video 0.8774
Testing: [61/296] clip 0.9453 (0.8190) video 0.8795
Testing: [62/296] clip 0.9922 (0.8218) video 0.8815
Testing: [63/296] clip 0.9766 (0.8243) video 0.8834
Testing: [64/296] clip 0.9844 (0.8268) video 0.8852
Testing: [65/296] clip 0.8906 (0.8278) video 0.8869
Testing: [66/296] clip 0.4922 (0.8227) video 0.8803
Testing: [67/296] clip 0.1562 (0.8127) video 0.8681
Testing: [68/296] clip 0.7266 (0.8115) video 0.8667
Testing: [69/296] clip 0.5156 (0.8072) video 0.8618
Testing: [70/296] clip 0.1250 (0.7974) video 0.8525
Testing: [71/296] clip 0.2109 (0.7892) video 0.8425
Testing: [72/296] clip 0.3281 (0.7828) video 0.8360
Testing: [73/296] clip 0.5938 (0.7802) video 0.8351
Testing: [74/296] clip 0.5781 (0.7774) video 0.8353
Testing: [75/296] clip 0.9297 (0.7795) video 0.8363
Testing: [76/296] clip 1.0000 (0.7824) video 0.8385
Testing: [77/296] clip 0.8906 (0.7838) video 0.8406
Testing: [78/296] clip 0.9531 (0.7860) video 0.8427
Testing: [79/296] clip 0.9922 (0.7886) video 0.8447
Testing: [80/296] clip 0.9844 (0.7910) video 0.8465
Testing: [81/296] clip 0.9453 (0.7929) video 0.8485
Testing: [82/296] clip 0.9922 (0.7954) video 0.8503
Testing: [83/296] clip 0.9375 (0.7971) video 0.8522
Testing: [84/296] clip 0.8047 (0.7972) video 0.8540
Testing: [85/296] clip 0.4766 (0.7934) video 0.8491
Testing: [86/296] clip 0.9375 (0.7951) video 0.8500
Testing: [87/296] clip 0.4141 (0.7907) video 0.8455
Testing: [88/296] clip 0.7969 (0.7907) video 0.8455
Testing: [89/296] clip 0.8125 (0.7910) video 0.8472
Testing: [90/296] clip 0.7109 (0.7901) video 0.8462
Testing: [91/296] clip 0.8750 (0.7910) video 0.8471
Testing: [92/296] clip 1.0000 (0.7933) video 0.8488
Testing: [93/296] clip 0.8984 (0.7944) video 0.8496
Testing: [94/296] clip 0.5000 (0.7913) video 0.8454
Testing: [95/296] clip 0.1328 (0.7844) video 0.8387
Testing: [96/296] clip 0.7422 (0.7839) video 0.8371
Testing: [97/296] clip 0.8438 (0.7846) video 0.8380
Testing: [98/296] clip 0.8906 (0.7856) video 0.8397
Testing: [99/296] clip 0.7188 (0.7850) video 0.8398
Testing: [100/296] clip 0.2734 (0.7798) video 0.8350
Testing: [101/296] clip 0.5547 (0.7776) video 0.8359
Testing: [102/296] clip 0.6094 (0.7760) video 0.8368
Testing: [103/296] clip 0.8438 (0.7766) video 0.8384
Testing: [104/296] clip 0.6875 (0.7758) video 0.8370
Testing: [105/296] clip 0.7734 (0.7757) video 0.8377
Testing: [106/296] clip 0.3516 (0.7717) video 0.8333
Testing: [107/296] clip 0.1328 (0.7658) video 0.8262
Testing: [108/296] clip 0.5547 (0.7638) video 0.8242
Testing: [109/296] clip 0.3828 (0.7603) video 0.8208
Testing: [110/296] clip 0.5312 (0.7582) video 0.8195
Testing: [111/296] clip 0.6875 (0.7576) video 0.8183
Testing: [112/296] clip 0.6328 (0.7565) video 0.8186
Testing: [113/296] clip 0.6250 (0.7553) video 0.8181
Testing: [114/296] clip 0.9062 (0.7566) video 0.8197
Testing: [115/296] clip 0.8281 (0.7573) video 0.8212 Testing: [116/296] clip 0.8281 (0.7579) video 0.8228 Testing: [117/296] clip 0.8281 (0.7585) video 0.8236 Testing: [118/296] clip 0.9141 (0.7598) video 0.8252 Testing: [119/296] clip 1.0000 (0.7618) video 0.8267 Testing: [120/296] clip 1.0000 (0.7638) video 0.8280 Testing: [121/296] clip 0.9297 (0.7652) video 0.8288 Testing: [122/296] clip 1.0000 (0.7671) video 0.8302 Testing: [123/296] clip 1.0000 (0.7690) video 0.8316 Testing: [124/296] clip 1.0000 (0.7709) video 0.8330 Testing: [125/296] clip 0.9766 (0.7725) video 0.8343 Testing: [126/296] clip 0.9922 (0.7742) video 0.8356 Testing: [127/296] clip 0.9922 (0.7760) video 0.8369 Testing: [128/296] clip 0.9531 (0.7773) video 0.8382 Testing: [129/296] clip 0.9922 (0.7790) video 0.8395 Testing: [130/296] clip 0.9609 (0.7804) video 0.8406 Testing: [131/296] clip 0.9922 (0.7820) video 0.8419 Testing: [132/296] clip 0.6953 (0.7814) video 0.8413 Testing: [133/296] clip 0.7344 (0.7810) video 0.8408 Testing: [134/296] clip 0.5703 (0.7794) video 0.8391 Testing: [135/296] clip 1.0000 (0.7811) video 0.8402 Testing: [136/296] clip 0.9219 (0.7821) video 0.8414 Testing: [137/296] clip 0.9609 (0.7834) video 0.8426 Testing: [138/296] clip 0.9141 (0.7844) video 0.8431 Testing: [139/296] clip 0.5547 (0.7827) video 0.8409 Testing: [140/296] clip 0.8906 (0.7835) video 0.8420 Testing: [141/296] clip 1.0000 (0.7850) video 0.8431 Testing: [142/296] clip 0.9844 (0.7864) video 0.8442 Testing: [143/296] clip 0.8750 (0.7870) video 0.8448 Testing: [144/296] clip 0.7578 (0.7868) video 0.8437 Testing: [145/296] clip 0.9531 (0.7880) video 0.8447 Testing: [146/296] clip 0.9453 (0.7891) video 0.8458 Testing: [147/296] clip 0.8672 (0.7896) video 0.8458 Testing: [148/296] clip 1.0000 (0.7910) video 0.8469 Testing: [149/296] clip 0.8828 (0.7916) video 0.8479 Testing: [150/296] clip 0.2969 (0.7883) video 0.8447 Testing: [151/296] clip 0.4375 (0.7860) video 0.8427 Testing: [152/296] clip 0.8203 
(0.7862) video 0.8437 Testing: [153/296] clip 0.6328 (0.7852) video 0.8432 Testing: [154/296] clip 0.6406 (0.7843) video 0.8422 Testing: [155/296] clip 0.7812 (0.7843) video 0.8422 Testing: [156/296] clip 0.9688 (0.7855) video 0.8432 Testing: [157/296] clip 0.7656 (0.7853) video 0.8437 Testing: [158/296] clip 0.9766 (0.7865) video 0.8447 Testing: [159/296] clip 0.8516 (0.7869) video 0.8447 Testing: [160/296] clip 0.5312 (0.7854) video 0.8432 Testing: [161/296] clip 0.7578 (0.7852) video 0.8437 Testing: [162/296] clip 0.9297 (0.7861) video 0.8447 Testing: [163/296] clip 0.4375 (0.7839) video 0.8418 Testing: [164/296] clip 0.0781 (0.7796) video 0.8371 Testing: [165/296] clip 0.9141 (0.7804) video 0.8375 Testing: [166/296] clip 0.8438 (0.7808) video 0.8385 Testing: [167/296] clip 0.5625 (0.7795) video 0.8367 Testing: [168/296] clip 0.8828 (0.7801) video 0.8377 Testing: [169/296] clip 0.8125 (0.7803) video 0.8377 Testing: [170/296] clip 0.7734 (0.7803) video 0.8377 Testing: [171/296] clip 0.8984 (0.7810) video 0.8387 Testing: [172/296] clip 0.3984 (0.7788) video 0.8369 Testing: [173/296] clip 0.7188 (0.7784) video 0.8369 Testing: [174/296] clip 0.9766 (0.7795) video 0.8379 Testing: [175/296] clip 1.0000 (0.7808) video 0.8388 Testing: [176/296] clip 0.8672 (0.7813) video 0.8393 Testing: [177/296] clip 0.9375 (0.7822) video 0.8402 Testing: [178/296] clip 1.0000 (0.7834) video 0.8411 Testing: [179/296] clip 0.7891 (0.7834) video 0.8416 Testing: [180/296] clip 0.5312 (0.7820) video 0.8411 Testing: [181/296] clip 0.8672 (0.7825) video 0.8415 Testing: [182/296] clip 1.0000 (0.7837) video 0.8424 Testing: [183/296] clip 0.9922 (0.7848) video 0.8433 Testing: [184/296] clip 0.9219 (0.7856) video 0.8442 Testing: [185/296] clip 0.8750 (0.7861) video 0.8450 Testing: [186/296] clip 0.9844 (0.7871) video 0.8458 Testing: [187/296] clip 0.5625 (0.7859) video 0.8445 Testing: [188/296] clip 0.9844 (0.7870) video 0.8454 Testing: [189/296] clip 0.9141 (0.7877) video 0.8458 Testing: 
[190/296] clip 0.5859 (0.7866) video 0.8457 Testing: [191/296] clip 0.5703 (0.7855) video 0.8445 Testing: [192/296] clip 0.4141 (0.7835) video 0.8429 Testing: [193/296] clip 0.9375 (0.7843) video 0.8437 Testing: [194/296] clip 0.9844 (0.7854) video 0.8445 Testing: [195/296] clip 0.9766 (0.7863) video 0.8453 Testing: [196/296] clip 0.9219 (0.7870) video 0.8457 Testing: [197/296] clip 1.0000 (0.7881) video 0.8465 Testing: [198/296] clip 0.9219 (0.7888) video 0.8473 Testing: [199/296] clip 0.9062 (0.7894) video 0.8477 Testing: [200/296] clip 0.8828 (0.7898) video 0.8484 Testing: [201/296] clip 0.9062 (0.7904) video 0.8491 Testing: [202/296] clip 0.6328 (0.7896) video 0.8499 Testing: [203/296] clip 1.0000 (0.7907) video 0.8507 Testing: [204/296] clip 0.9922 (0.7917) video 0.8514 Testing: [205/296] clip 0.4844 (0.7902) video 0.8494 Testing: [206/296] clip 0.6172 (0.7893) video 0.8483 Testing: [207/296] clip 0.7031 (0.7889) video 0.8486 Testing: [208/296] clip 0.6562 (0.7883) video 0.8475 Testing: [209/296] clip 1.0000 (0.7893) video 0.8482 Testing: [210/296] clip 1.0000 (0.7903) video 0.8489 Testing: [211/296] clip 1.0000 (0.7913) video 0.8496 Testing: [212/296] clip 0.7891 (0.7913) video 0.8496 Testing: [213/296] clip 0.6016 (0.7904) video 0.8485 Testing: [214/296] clip 0.9688 (0.7912) video 0.8492 Testing: [215/296] clip 0.5469 (0.7901) video 0.8488 Testing: [216/296] clip 0.8438 (0.7903) video 0.8488 Testing: [217/296] clip 0.7188 (0.7900) video 0.8484 Testing: [218/296] clip 0.9297 (0.7906) video 0.8491 Testing: [219/296] clip 0.9375 (0.7913) video 0.8498 Testing: [220/296] clip 0.9219 (0.7919) video 0.8504 Testing: [221/296] clip 0.8203 (0.7920) video 0.8511 Testing: [222/296] clip 0.5859 (0.7911) video 0.8504 Testing: [223/296] clip 0.4453 (0.7896) video 0.8486 Testing: [224/296] clip 0.8359 (0.7898) video 0.8493 Testing: [225/296] clip 0.9922 (0.7907) video 0.8499 Testing: [226/296] clip 0.9375 (0.7913) video 0.8506 Testing: [227/296] clip 0.8047 (0.7914) video 
0.8506 Testing: [228/296] clip 0.4375 (0.7898) video 0.8489 Testing: [229/296] clip 0.4531 (0.7883) video 0.8475 Testing: [230/296] clip 0.0703 (0.7852) video 0.8440 Testing: [231/296] clip 0.0547 (0.7821) video 0.8403 Testing: [232/296] clip 0.2812 (0.7799) video 0.8380 Testing: [233/296] clip 0.1719 (0.7773) video 0.8353 Testing: [234/296] clip 0.3750 (0.7756) video 0.8334 Testing: [235/296] clip 0.9453 (0.7763) video 0.8341 Testing: [236/296] clip 0.9375 (0.7770) video 0.8348 Testing: [237/296] clip 0.8672 (0.7774) video 0.8351 Testing: [238/296] clip 0.9297 (0.7780) video 0.8355 Testing: [239/296] clip 0.5391 (0.7770) video 0.8339 Testing: [240/296] clip 0.9141 (0.7776) video 0.8346 Testing: [241/296] clip 0.7031 (0.7773) video 0.8340 Testing: [242/296] clip 0.9531 (0.7780) video 0.8347 Testing: [243/296] clip 0.9375 (0.7786) video 0.8350 Testing: [244/296] clip 0.9453 (0.7793) video 0.8357 Testing: [245/296] clip 0.9766 (0.7801) video 0.8364 Testing: [246/296] clip 0.9922 (0.7810) video 0.8370 Testing: [247/296] clip 0.9766 (0.7818) video 0.8377 Testing: [248/296] clip 1.0000 (0.7827) video 0.8384 Testing: [249/296] clip 0.8906 (0.7831) video 0.8390 Testing: [250/296] clip 0.9453 (0.7837) video 0.8396 Testing: [251/296] clip 0.4922 (0.7826) video 0.8390 Testing: [252/296] clip 0.7344 (0.7824) video 0.8388 Testing: [253/296] clip 0.7266 (0.7822) video 0.8388 Testing: [254/296] clip 0.6719 (0.7817) video 0.8385 Testing: [255/296] clip 0.9844 (0.7825) video 0.8391 Testing: [256/296] clip 0.9609 (0.7832) video 0.8397 Testing: [257/296] clip 0.9609 (0.7839) video 0.8404 Testing: [258/296] clip 0.6562 (0.7834) video 0.8404 Testing: [259/296] clip 0.8125 (0.7835) video 0.8410 Testing: [260/296] clip 0.9453 (0.7842) video 0.8413 Testing: [261/296] clip 1.0000 (0.7850) video 0.8419 Testing: [262/296] clip 0.9062 (0.7855) video 0.8425 Testing: [263/296] clip 0.5547 (0.7846) video 0.8417 Testing: [264/296] clip 0.5078 (0.7835) video 0.8402 Testing: [265/296] clip 0.3359 
(0.7818) video 0.8387 Testing: [266/296] clip 0.5781 (0.7811) video 0.8384 Testing: [267/296] clip 0.9453 (0.7817) video 0.8390 Testing: [268/296] clip 0.9453 (0.7823) video 0.8397 Testing: [269/296] clip 0.9844 (0.7831) video 0.8403 Testing: [270/296] clip 0.8906 (0.7834) video 0.8408 Testing: [271/296] clip 0.9688 (0.7841) video 0.8414 Testing: [272/296] clip 0.8828 (0.7845) video 0.8417 Testing: [273/296] clip 0.9453 (0.7851) video 0.8423 Testing: [274/296] clip 1.0000 (0.7859) video 0.8429 Testing: [275/296] clip 0.8047 (0.7859) video 0.8429 Testing: [276/296] clip 1.0000 (0.7867) video 0.8434 Testing: [277/296] clip 0.9766 (0.7874) video 0.8440 Testing: [278/296] clip 0.9297 (0.7879) video 0.8446 Testing: [279/296] clip 0.9766 (0.7886) video 0.8451 Testing: [280/296] clip 0.8594 (0.7888) video 0.8454 Testing: [281/296] clip 0.6719 (0.7884) video 0.8454 Testing: [282/296] clip 0.9922 (0.7891) video 0.8459 Testing: [283/296] clip 0.9766 (0.7898) video 0.8465 Testing: [284/296] clip 0.9141 (0.7902) video 0.8470 Testing: [285/296] clip 0.9375 (0.7908) video 0.8475 Testing: [286/296] clip 0.9141 (0.7912) video 0.8481 Testing: [287/296] clip 0.5391 (0.7903) video 0.8475 Testing: [288/296] clip 0.3203 (0.7887) video 0.8459 Testing: [289/296] clip 0.5234 (0.7878) video 0.8448 Testing: [290/296] clip 0.6797 (0.7874) video 0.8445 Testing: [291/296] clip 0.8047 (0.7875) video 0.8451 Testing: [292/296] clip 0.8047 (0.7875) video 0.8456 Testing: [293/296] clip 0.8828 (0.7878) video 0.8461 Testing: [294/296] clip 0.9141 (0.7883) video 0.8464 Testing: [295/296] clip 0.8984 (0.7886) video 0.8469 Testing: [296/296] clip 0.9857 (0.7890) video 0.8472

vadimadr commented 5 years ago

Everything seems correct at first glance... 🤔 How do you convert/prepare the data?

Xiehuaiqi commented 5 years ago

@vadimadr First I extracted the frames from UCF101, then executed python.py ucf101_json.py to generate ucf101_01.json and used it for the data split; nothing else was changed. Also, I can get the normal score with se_resnext_101_32x4d_vtn_rgb_ucf101_s1.pth

vadimadr commented 5 years ago

I think the problem is in the preprocessing. I guess you used ffmpeg and forgot the -q option, which controls the quality of the generated frames. By default the quality is poor, resulting in images with visual artifacts.

Use the utils/preprocess_videos.py script; it will call ffmpeg with the correct options for you.
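To illustrate the -q point, here is a minimal sketch of frame extraction with an explicit quality setting. It is not the repository's preprocessing script: the helper name `build_ffmpeg_cmd`, the output pattern `%05d.jpg`, the value `2`, and the example file names are all assumptions chosen for illustration. Only standard ffmpeg options (`-i`, `-vf scale=...`, `-q:v`) are used.

```python
from pathlib import Path


def build_ffmpeg_cmd(video, out_dir, quality=2, height=None):
    """Build an ffmpeg command that dumps a video to numbered JPEG frames.

    Hypothetical helper for illustration; the real script may use other options.
    """
    cmd = ["ffmpeg", "-i", str(video)]
    if height is not None:
        # Resize to the given height; -1 lets ffmpeg keep the aspect ratio.
        cmd += ["-vf", "scale=-1:{}".format(height)]
    # -q:v sets the quality scale of the JPEG encoder.
    # Lower values mean higher quality; omitting it gives visibly worse frames.
    cmd += ["-q:v", str(quality), str(Path(out_dir) / "%05d.jpg")]
    return cmd


# Placeholder paths, not taken from the thread.
cmd = build_ffmpeg_cmd("v_Archery_g01_c01.avi", "frames/v_Archery_g01_c01",
                       quality=2, height=240)
print(" ".join(cmd))
# To actually extract frames (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting, and makes it easy to check that the quality option is really present before reprocessing the whole dataset.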

Btw, the resnext-vtn-rgb model should give 91.8%.

Xiehuaiqi commented 5 years ago

@vadimadr I checked my code, and I did forget the -q option with ffmpeg. I will try to preprocess my data again. Thank you very much!

Xiehuaiqi commented 5 years ago

@vadimadr I extracted the frames with 'python3 preprocess_videos.py -a anno/ucf101_01.json -r UCF-101 -d ucf101_jpg --video-size 240 --video-format frames' and got the score below:

minhhoangbui commented 4 years ago

@Xiehuaiqi I have the same problem here and I need to clarify your case. You reported that you got a better result with this option:

--video-format frames, which means q=4

However, according to their documentation, 4 means the lowest quality. I don't really get that.