open-mmlab / mmaction

An open-source toolbox for action understanding based on PyTorch
https://open-mmlab.github.io/
Apache License 2.0

temporal action detection #35

Open isalirezag opened 5 years ago

isalirezag commented 5 years ago

I'd like to do temporal action detection. Can you help with how to evaluate the existing methods on the THUMOS14 dataset? I know part of it should be like this: python tools/test_localizer.py configs/thumos14/ssn_thumos14_rgb_bn_inception.py ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [other task-specific arguments] but I don't know the rest. Should I download ${CHECKPOINT_FILE} from somewhere else? I downloaded ssn_thumos14_rgb_bn_inception_tag-dac9ddb0.pth and put it in the model zoo.

How should I set the other arguments?

isalirezag commented 5 years ago

Do I need to have optical flow frames for this evaluation?

isalirezag commented 5 years ago

I ran the code as python tools/test_localizer.py configs/thumos14/ssn_thumos14_rgb_bn_inception.py modelzoo/ssn_thumos14_rgb_bn_inception_tag-dac9ddb0.pth --gpus 1 --out ssn_thumos14_rgb_bn_inception_AlirezaTest.pkl --eval thumos14 and received this error:

alireza@alireza:~/Desktop/mmaction$ python tools/test_localizer.py configs/thumos14/ssn_thumos14_rgb_bn_inception.py modelzoo/ssn_thumos14_rgb_bn_inception_tag-dac9ddb0.pth --gpus 1 --out ssn_thumos14_rgb_bn_inception_AlirezaTest.pkl --eval thumos14
210 out of 1574 videos are valid.

            SSNDataset: proposal file data/thumos14/thumos14_tag_test_proposal_list.txt parsed.

unexpected key in source state_dict: fc.weight, fc.bias

missing keys in source state_dict: inception_4b_1x1_bn.num_batches_tracked, inception_5a_1x1_bn.num_batches_tracked, inception_4c_3x3_bn.num_batches_tracked, inception_5b_double_3x3_2_bn.num_batches_tracked, inception_5a_double_3x3_2_bn.num_batches_tracked, inception_4a_pool_proj_bn.num_batches_tracked, inception_3c_3x3_bn.num_batches_tracked, inception_5a_double_3x3_1_bn.num_batches_tracked, inception_4b_3x3_reduce_bn.num_batches_tracked, inception_3b_3x3_reduce_bn.num_batches_tracked, inception_3b_double_3x3_1_bn.num_batches_tracked, inception_3b_double_3x3_2_bn.num_batches_tracked, inception_3c_double_3x3_1_bn.num_batches_tracked, inception_4c_double_3x3_2_bn.num_batches_tracked, inception_4a_double_3x3_1_bn.num_batches_tracked, inception_3b_3x3_bn.num_batches_tracked, inception_4b_pool_proj_bn.num_batches_tracked, inception_4d_3x3_bn.num_batches_tracked, inception_3c_double_3x3_reduce_bn.num_batches_tracked, inception_4c_double_3x3_1_bn.num_batches_tracked, inception_4d_3x3_reduce_bn.num_batches_tracked, inception_4c_double_3x3_reduce_bn.num_batches_tracked, inception_5a_3x3_reduce_bn.num_batches_tracked, inception_4a_1x1_bn.num_batches_tracked, inception_4a_double_3x3_reduce_bn.num_batches_tracked, inception_4a_3x3_reduce_bn.num_batches_tracked, inception_3c_3x3_reduce_bn.num_batches_tracked, inception_4d_pool_proj_bn.num_batches_tracked, conv2_3x3_bn.num_batches_tracked, inception_4b_double_3x3_1_bn.num_batches_tracked, inception_3a_3x3_bn.num_batches_tracked, inception_4c_1x1_bn.num_batches_tracked, inception_3b_double_3x3_reduce_bn.num_batches_tracked, inception_4c_3x3_reduce_bn.num_batches_tracked, inception_3a_double_3x3_1_bn.num_batches_tracked, inception_4b_3x3_bn.num_batches_tracked, inception_3a_3x3_reduce_bn.num_batches_tracked, inception_5b_1x1_bn.num_batches_tracked, inception_5b_pool_proj_bn.num_batches_tracked, inception_5b_double_3x3_reduce_bn.num_batches_tracked, inception_4d_1x1_bn.num_batches_tracked, 
inception_3a_double_3x3_2_bn.num_batches_tracked, inception_3a_1x1_bn.num_batches_tracked, inception_3a_double_3x3_reduce_bn.num_batches_tracked, inception_4a_double_3x3_2_bn.num_batches_tracked, inception_4e_3x3_bn.num_batches_tracked, conv2_3x3_reduce_bn.num_batches_tracked, inception_4e_double_3x3_2_bn.num_batches_tracked, inception_3b_1x1_bn.num_batches_tracked, inception_5a_3x3_bn.num_batches_tracked, inception_5b_3x3_reduce_bn.num_batches_tracked, inception_4b_double_3x3_2_bn.num_batches_tracked, inception_4c_pool_proj_bn.num_batches_tracked, inception_5a_pool_proj_bn.num_batches_tracked, inception_3c_double_3x3_2_bn.num_batches_tracked, inception_4d_double_3x3_reduce_bn.num_batches_tracked, inception_4a_3x3_bn.num_batches_tracked, inception_5b_3x3_bn.num_batches_tracked, inception_4b_double_3x3_reduce_bn.num_batches_tracked, inception_5b_double_3x3_1_bn.num_batches_tracked, inception_4e_3x3_reduce_bn.num_batches_tracked, inception_3b_pool_proj_bn.num_batches_tracked, conv1_7x7_s2_bn.num_batches_tracked, inception_4e_double_3x3_reduce_bn.num_batches_tracked, inception_4d_double_3x3_1_bn.num_batches_tracked, inception_3a_pool_proj_bn.num_batches_tracked, inception_4e_double_3x3_1_bn.num_batches_tracked, inception_5a_double_3x3_reduce_bn.num_batches_tracked, inception_4d_double_3x3_2_bn.num_batches_tracked

[                                                  ] 0/210, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/test_localizer.py", line 178, in <module>
    main()
  File "tools/test_localizer.py", line 106, in main
    outputs = single_test(model, data_loader)
  File "tools/test_localizer.py", line 29, in single_test
    result = model(return_loss=False, **data)
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alireza/Desktop/mmaction/mmaction/models/localizers/base.py", line 32, in forward
    return self.forward_test(num_modalities, img_meta, **kwargs)
  File "/home/alireza/Desktop/mmaction/mmaction/models/localizers/SSN2D.py", line 169, in forward_test
    x = self.spatial_temporal_module(x)
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alireza/Desktop/mmaction/mmaction/models/tenons/spatial_temporal_modules/simple_spatial_module.py", line 25, in forward
    return self.op(input)
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/alireza/anaconda3/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 563, in forward
    self.padding, self.ceil_mode, self.count_include_pad)
RuntimeError: Given input size: (1024x6x7). Calculated output size: (1024x0x1). Output size is too small at /opt/conda/conda-bld/pytorch_1556653183467/work/aten/src/THCUNN/generic/SpatialAveragePooling.cu:47
alireza@alireza:~/Desktop/mmaction$ 
zhaoyue-zephyrus commented 5 years ago

@isalirezag Looking at the error, it seems like the input size is too small. Could you check that?
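To make the "input size is too small" diagnosis concrete: a pooling layer with ceil_mode off produces floor((H + 2*padding - kernel) / stride) + 1 output rows. A minimal sketch of that arithmetic (the 7x7 kernel below is an assumption for SSN's global average pool over a 224x224 crop, not a value taken from the config):

```python
def pool_out(size, kernel, stride=None, padding=0):
    """Output spatial size of a pooling layer with ceil_mode=False."""
    stride = stride or kernel  # PyTorch default: stride = kernel size
    return (size + 2 * padding - kernel) // stride + 1

# Feature map reported in the traceback: 1024 channels, 6x7 spatial.
h, w = 6, 7
kernel = 7  # assumed average-pool kernel for a 224x224 input
print(pool_out(h, kernel), pool_out(w, kernel))  # -> 0 1
```

A 6-row feature map under a 7-row kernel gives 0 output rows, which matches the "Calculated output size: (1024x0x1)" in the traceback.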

isalirezag commented 5 years ago

Hmmm, can you please tell me where I should change it? I did not change any part of the code. Should I define the input size somewhere?

zhaoyue-zephyrus commented 5 years ago

I tried running the code and there was no problem.

In addition, the input size is defined in the config: https://github.com/open-mmlab/mmaction/blob/master/configs/thumos14/ssn_thumos14_rgb_bn_inception.py#L82, so the input will be resized to 340x256 and a 224x224 crop will be obtained from it.

Maybe you have to check your input, for example: were all the frames successfully extracted?
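A quick back-of-the-envelope check (a sketch, not part of mmaction): BN-Inception downsamples its input by an overall factor of roughly 32, so a correctly resized-and-cropped 224x224 input should yield a 7x7 feature map. The 6x7 map in the traceback therefore suggests frames that are shorter than 224 px on one side, i.e. the resize to 340x256 never happened on the raw frames:

```python
# Rough estimate of BN-Inception's output feature-map side; the exact
# value depends on padding and ceil_mode at each stage, so treat the
# total_stride=32 factor as an approximation.
def feat_side(pixels, total_stride=32):
    return max(pixels // total_stride, 1)

print(feat_side(224))  # correctly cropped input -> about 7
print(feat_side(192))  # an undersized frame -> about 6, as in the error
```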

smilefish06 commented 5 years ago

I met the same problem: "RuntimeError: Given input size: (1024x6x7). Calculated output size: (1024x0x1). Output size is too small at /pytorch/aten/src/THCUNN/generic/SpatialAveragePooling.cu:47". The input comes from x = self.extract_feat(chunk.cuda()) in SSN2D.py.

Is this caused by the missing keys in the source state_dict?

zhaoyue-zephyrus commented 5 years ago

Please refer to #63 for details.