AcodeC commented 5 years ago

When I configured my code as you described, I got a TypeError:

Traceback (most recent call last):
  File "run_net.py", line 131, in <module>
    main()
  File "run_net.py", line 127, in main
    test(cfg=cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/test_net.py", line 90, in test
    model = model_builder.build_model(cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/model_builder.py", line 34, in build_model
    model = _MODEL_TYPES[cfg.MODEL.ARCH](cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/video_model_builder.py", line 146, in __init__
    self._construct_network(cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/video_model_builder.py", line 211, in _construct_network
    trans_func_name=cfg.RESNET.TRANS_FUNC,
  File "/media/acodec/media/code/slowfast/slowfast/models/resnet_helper.py", line 390, in __init__
    super(ResStage, self).__init__()
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 72, in __init__
    self._construct()
TypeError: _construct() missing 10 required positional arguments: 'dim_in', 'dim_out', 'stride', 'dim_inner', 'num_groups', 'trans_func_name', 'stride_1x1', 'inplace_relu', 'nonlocal_inds', and 'instantiation'

What is wrong? How do I get rid of this problem?

tridivb commented 5 years ago

Hi,

Thanks for using the framework. Could you please provide the following details regarding the issue:

Which model config are you using? Did you modify the paths in it? What is your system config? How should I reproduce the issue?

AcodeC commented 5 years ago

I use SLOWFAST_8x8_R50.yaml. My config is as follows:

TRAIN:
  ENABLE: False
  DATASET: videoset
  CHECKPOINT_FILE_PATH: "./checkpoints/SLOWFAST_8x8_R50.pkl"
  CHECKPOINT_TYPE: caffe2
DATA:
  PATH_TO_DATA_DIR: "/media/acodec/Data/msvd"
  PATH_PREFIX: "/media/acodec/Data/msvd"
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
  IN_FPS: 30
  OUT_FPS: 15
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
  MOMENTUM: 0.1
  WEIGHT_DECAY: 0.0
SOLVER:
  BASE_LR: 0.1
  LR_POLICY: cosine
  MAX_EPOCH: 196
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 37
  WARMUP_START_LR: 0.001
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 400
  ARCH: slowfast
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: True
  DATASET: videoset
  BATCH_SIZE: 3
DATA_LOADER:
  NUM_WORKERS: 8
  PIN_MEMORY: True
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: "/media/acodec/Data/output"

I installed PyTorch 1.2 and PySlowFast. Could you see what is wrong with my setup?

tridivb commented 5 years ago

The framework is actually built on PyTorch 1.3 (also mentioned in the prerequisites), and I believe the PySlowFast framework has the same requirement (https://github.com/facebookresearch/SlowFast/blob/master/INSTALL.md). Could you install the latest PyTorch, rebuild PySlowFast, and then try executing the code? Also, don't forget to download and set up the pre-trained weights.
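A small guard run before building the model could surface this kind of mismatch early. The following is only a sketch: the helper names `parse_version` and `meets_minimum` are hypothetical and not part of either framework, and the 1.3 minimum is simply the requirement stated above.

```python
import re


def parse_version(version):
    """Return (major, minor) from strings such as '1.3.0' or '1.2.0+cu92'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, required=(1, 3)):
    """True if the installed version is at least the required (major, minor)."""
    return parse_version(installed) >= required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        if meets_minimum(torch.__version__):
            print(f"PyTorch {torch.__version__} satisfies the 1.3 requirement")
        else:
            print(f"PyTorch {torch.__version__} is older than 1.3; please upgrade")
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "1.10" below "1.3".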

AcodeC commented 5 years ago

I did download and set up the pre-trained weights (SLOWFAST_8x8_R50.pkl), but I will try installing PyTorch 1.3. I doubt it is a PyTorch version problem, because I built the PySlowFast project successfully. I will retry now and report the result later.

tridivb commented 5 years ago

Thank you for the confirmation. If you look at the error, it is thrown by the PyTorch module. Please note that there might be differences in the methods between versions, so even if your code builds, there can be runtime errors when a method is actually called. Please let me know once you have tried PyTorch 1.3.
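One reading of the traceback, offered as an inference rather than a confirmed account of PyTorch internals, is a method-name collision: the base class `__init__` calls `self._construct()` with no arguments, while the subclass defines its own `_construct` that requires several. The sketch below reproduces that pattern with hypothetical stand-in classes (`BaseModule` and `Stage` are illustrative names, not the real `torch.nn.Module` or `ResStage`).

```python
class BaseModule:
    """Stand-in for a base class whose __init__ calls a private hook."""

    def __init__(self):
        # The base class expects a hook that takes no extra arguments.
        self._construct()

    def _construct(self):
        self.ready = True


class Stage(BaseModule):
    """Stand-in for a subclass that reuses the hook name with parameters."""

    def __init__(self, dim_in, dim_out):
        # Method lookup resolves self._construct to Stage._construct,
        # so the zero-argument call inside BaseModule.__init__ fails.
        super().__init__()
        self._construct(dim_in, dim_out)

    def _construct(self, dim_in, dim_out):
        self.dims = (dim_in, dim_out)


def try_build():
    """Return the TypeError message raised while building Stage, if any."""
    try:
        Stage(64, 256)
    except TypeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(try_build())
```

This also illustrates why a successful build proves little: nothing fails until the constructor actually runs.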

AcodeC commented 5 years ago

Thanks a lot. Just as you said, it was the PyTorch version problem. After I changed to PyTorch 1.3, it runs OK. But when I set NUM_GPUS=2, a new problem occurs:

Traceback (most recent call last):
  File "run_net.py", line 128, in <module>
    main()
  File "run_net.py", line 121, in main
    daemon=False,
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/media/acodec/media/code/slowfast/slowfast/utils/multiprocessing.py", line 50, in run
    func(cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/test_net.py", line 159, in test
    feat_arr = multi_view_test(test_loader, model, cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/test_net.py", line 60, in multi_view_test
    preds, labels, video_idx = du.all_gather([preds, labels, video_idx])
UnboundLocalError: local variable 'labels' referenced before assignment

Maybe the code in run_net.py has a problem. Have you checked that?

tridivb commented 5 years ago

Unfortunately, this is not something I can test, as I only have one GPU to work with. However, the problem is not about setting the number of GPUs but rather about variable assignment in run_net.py. Since we only want to extract the features, the labels are not returned by the dataloader. I can change this later on and push a commit. Meanwhile, feel free to correct it and test it out.
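For anyone who wants to try a local fix, one possible shape is to gather only the values that exist. This is an untested sketch, not the committed fix: `gather_outputs` is a hypothetical helper, and it assumes the gather function takes a list and returns one gathered value per entry, as the call in the traceback (line 60 of test_net.py, in `multi_view_test`) suggests.

```python
def gather_outputs(preds, video_idx, labels=None, num_gpus=1, all_gather=None):
    """Gather per-process outputs, skipping labels when the loader has none.

    `all_gather` is assumed to accept a list of values and return a list
    with one gathered value per input, in the same order.
    """
    if num_gpus <= 1 or all_gather is None:
        # Single-process run: nothing to gather.
        return preds, labels, video_idx
    if labels is None:
        # Feature extraction only: the dataloader yields no labels.
        preds, video_idx = all_gather([preds, video_idx])
    else:
        preds, labels, video_idx = all_gather([preds, labels, video_idx])
    return preds, labels, video_idx


if __name__ == "__main__":
    # Identity function used as a placeholder for a real distributed gather.
    fake_gather = lambda values: list(values)
    print(gather_outputs([0.1, 0.9], [7], num_gpus=2, all_gather=fake_gather))
```

Defaulting `labels` to `None` keeps the name bound on every code path, which is what the UnboundLocalError complains about.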

I am closing this issue as the PyTorch version problem is solved.

tridivb / slowfast_feature_extractor

TypeError: _construct() missing 10 required positional arguments: 'dim_in', 'dim_out', 'stride', 'dim_inner', 'num_groups', 'trans_func_name', 'stride_1x1', 'inplace_relu', 'nonlocal_inds', and 'instantiation' #1
