facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

Demo run_net.py code problem #252

Open dyisaev opened 4 years ago

dyisaev commented 4 years ago

Hi, yesterday I downloaded SlowFast and now I am trying to run the demo on my own video, using the following command:

python tools/run_net.py --cfg demo/AVA/SLOWFAST_32x2_R101_50_50_MYTEST.yaml

I am running it on a video file, following the instructions in "getting_started.md" (I set _C.DEMO.DATA_SOURCE="" instead of 0, and edited the yaml file). However, I run into the following error:

Traceback (most recent call last):
  File "tools/run_net.py", line 37, in <module>
    main()
  File "tools/run_net.py", line 19, in main
    cfg = load_config(args)
  File "/media/st4Tb/fbtools/slowfast/slowfast/utils/parser.py", line 78, in load_config
    cfg.merge_from_file(args.cfg_file)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/fvcore/common/config.py", line 109, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/fvcore/common/config.py", line 120, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/yacs/config.py", line 464, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/yacs/config.py", line 477, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: BN.MOMENTUM'
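The last frames of the traceback above show a strict merge: a key present in the YAML file but absent from the default config is rejected rather than added. A minimal sketch of that behaviour, using plain dictionaries instead of yacs itself (the `strict_merge` function and the sample defaults are hypothetical stand-ins, not the library's code):

```python
def strict_merge(user_cfg, defaults, path=()):
    """Merge user_cfg into defaults, rejecting keys the defaults do not define."""
    for key, value in user_cfg.items():
        full_key = ".".join(path + (key,))
        if key not in defaults:
            # Same message format as the KeyError in the traceback.
            raise KeyError("Non-existent config key: {}".format(full_key))
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections such as BN or TRAIN.
            strict_merge(value, defaults[key], path + (key,))
        else:
            defaults[key] = value
    return defaults


# Hypothetical defaults: a BN section exists but defines no MOMENTUM entry.
defaults = {"BN": {"EPSILON": 1e-5}, "TRAIN": {"ENABLE": True}}
user_cfg = {"TRAIN": {"ENABLE": False}, "BN": {"MOMENTUM": 0.1}}

try:
    strict_merge(user_cfg, defaults)
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]

print(error_message)
```

Under this reading, the error means the installed code's defaults and the demo YAML disagree about which keys exist, so either the YAML or the code version has to change.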

When I remove the entire BN section from the yaml file, I get a different error, related to tensor dimensions in the model. First come these warnings:

/media/st4Tb/fbtools/slowfast/slowfast/models/head_helper.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert out.shape[2] == 1
/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/detectron2/layers/roi_align.py:93: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

and finally the error:

Traceback (most recent call last):
  File "tools/run_net.py", line 37, in <module>
    main()
  File "tools/run_net.py", line 30, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=demo)
  File "/media/st4Tb/fbtools/slowfast/slowfast/utils/misc.py", line 282, in launch_job
    func(cfg=cfg)
  File "/media/st4Tb/fbtools/slowfast/tools/demo_net.py", line 103, in demo
    misc.log_model_info(model, cfg)
  File "/media/st4Tb/fbtools/slowfast/slowfast/utils/misc.py", line 160, in log_model_info
    get_model_stats(model, cfg, "flop", use_train_input)
  File "/media/st4Tb/fbtools/slowfast/slowfast/utils/misc.py", line 138, in get_model_stats
    count_dict, _ = model_stats_fun(model, inputs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/fvcore/nn/flop_count.py", line 55, in flop_count
    total_flop_counter, skipped_ops = get_jit_model_analysis(
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/fvcore/nn/jit_handles.py", line 98, in get_jit_model_analysis
    trace, _ = torch.jit._get_trace_graph(model, inputs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/jit/__init__.py", line 277, in _get_trace_graph
    outs = ONNXTracedModule(f, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/jit/__init__.py", line 356, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/jit/__init__.py", line 347, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 516, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/media/st4Tb/fbtools/slowfast/slowfast/models/video_model_builder.py", line 393, in forward
    x = self.head(x, bboxes)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 516, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/media/st4Tb/fbtools/slowfast/slowfast/models/head_helper.py", line 129, in forward
    x = self.act(x)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 516, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 1018, in forward
    return F.softmax(input, self.dim, _stacklevel=5)
  File "/media/st4Tb/anaconda3/envs/torchenv/lib/python3.8/site-packages/torch/nn/functional.py", line 1231, in softmax
    ret = input.softmax(dim)
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 4)
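The final IndexError says that softmax was asked to normalise over axis 4, while the allowed range [-2, 1] corresponds to a tensor with only two dimensions. A small sketch of that range rule in plain Python (the `check_dim` helper is hypothetical and only mirrors the rule stated in the error message, not PyTorch's internals):

```python
def check_dim(ndim, dim):
    """Return the non-negative axis index, or raise if dim is outside [-ndim, ndim - 1]."""
    low, high = -ndim, ndim - 1
    if not low <= dim <= high:
        raise IndexError(
            "Dimension out of range (expected to be in range of "
            "[{}, {}], but got {})".format(low, high, dim)
        )
    # Negative axes count from the end, so -1 maps to the last axis.
    return dim % ndim


# A 2-D tensor accepts axes -2..1; -1 resolves to axis 1.
valid_axis = check_dim(ndim=2, dim=-1)

try:
    check_dim(ndim=2, dim=4)
    message = None
except IndexError as exc:
    message = str(exc)

print(valid_axis)
print(message)
```

So the activation in the head was configured for a 5-D input (axis 4 is the last axis of a 5-D tensor) but received a 2-D one.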

This is my YAML file (without the BN section):

TRAIN:
  ENABLE: False
  DATASET: ava
  BATCH_SIZE: 16
  EVAL_PERIOD: 1
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
  CHECKPOINT_FILE_PATH: "/media/st4Tb/fbtools/slowfast/demo/SLOWFAST_32x2_R101_50_50_v2.1.pkl"  # path to pretrain model
  CHECKPOINT_TYPE: pytorch
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
DETECTION:
  ENABLE: True
  ALIGNED: False
AVA:
  BGR: False
  DETECTION_SCORE_THRESH: 0.8
  TEST_PREDICT_BOX_LISTS: ["person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv"]
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 5
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 101
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [2, 2]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[6, 13, 20], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
  POOL: [[[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]]]
SOLVER:
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-7
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 80
  ARCH: slowfast
  LOSS_FUNC: bce
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: False
  DATASET: ava
  BATCH_SIZE: 8
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
DEMO:
  ENABLE: True
  LABEL_FILE_PATH: "./demo/AVA/ava.names"
  DATA_SOURCE: "/media/st4Tb/fbtools/slowfast/demo/input_30sec.mp4"
  DISPLAY_WIDTH: 640
  DISPLAY_HEIGHT: 480
  DETECTRON2_OBJECT_DETECTION_MODEL_CFG: "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
  DETECTRON2_OBJECT_DETECTION_MODEL_WEIGHTS: "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl"
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

thank you in advance for your help!

lequytra commented 4 years ago

Hi @dyisaev, could you try using the SLOWFAST_32x2_R101_50_50.yaml config in configs/AVA/c2 instead? Alternatively, I think adding HEAD_ACT: sigmoid under MODEL: should also work.
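For context on the HEAD_ACT suggestion: the failing call in the traceback is a softmax, which normalises along a chosen axis, whereas a sigmoid is applied element by element and takes no axis at all. The posted config also uses LOSS_FUNC: bce, which is normally paired with per-class sigmoid outputs. A rough plain-Python sketch of the difference, with made-up scores (the numbers and function names are illustrative only, not SlowFast code):

```python
import math


def sigmoid(scores):
    """Element-wise: each class score is squashed independently of the others."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


def softmax(scores):
    """Normalises across the whole list, so the outputs sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three action classes of one detected person.
scores = [2.0, 2.0, -3.0]

sig = sigmoid(scores)
soft = softmax(scores)
print(sig)
print(soft)
```

With sigmoid, the first two classes can both score high at once, which fits multi-label action prediction; softmax forces them to share a total of 1.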

doursand commented 4 years ago

I've got the exact same error. @dyisaev, did you manage to fix it?

doursand commented 4 years ago

OK, so I think I figured this out. What I did was completely remove the SOLVER section and the BN.MOMENTUM param. Once that was done, I was able to run the demo tool successfully by simply passing the path to my modified yaml file:

python tools/run_net.py --cfg configs/Kinetics/SLOW_4x16_R50-AD-demo.yaml

TRAIN:
  ENABLE: False
  DATASET: kinetics
  BATCH_SIZE: 64
  EVAL_PERIOD: 10
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
  CHECKPOINT_TYPE: pytorch
  CHECKPOINT_FILE_PATH: './checkpoints/checkpoint_epoch_00114.pyth'
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [2, 2]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
MODEL:
  NUM_CLASSES: 3
  ARCH: slowfast
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: False
  DATASET: kinetics
  BATCH_SIZE: 64
DATA_LOADER:
  NUM_WORKERS: 8
  PIN_MEMORY: True
DEMO:
  ENABLE: True
  LABEL_FILE_PATH: "labels.csv"
  DATA_SOURCE: './training/EATING/video(51)_clip_00000002.mp4'
  DISPLAY_WIDTH: 640
  DISPLAY_HEIGHT: 480
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .
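The manual edit described in this comment (dropping the SOLVER section and BN.MOMENTUM) could also be scripted once the YAML has been loaded into a nested dictionary. A minimal sketch, assuming a hypothetical `drop_keys` helper and made-up BN values, since the original BN section is not shown in this thread:

```python
import copy


def drop_keys(cfg, dotted_keys):
    """Return a copy of cfg without the given dotted keys (missing keys are ignored)."""
    pruned = copy.deepcopy(cfg)
    for dotted in dotted_keys:
        *parents, leaf = dotted.split(".")
        node = pruned
        for name in parents:
            node = node.get(name)
            if not isinstance(node, dict):
                break  # parent section absent, nothing to remove
        else:
            node.pop(leaf, None)
    return pruned


# Hypothetical config fragment; the BN values are placeholders.
cfg = {
    "BN": {"MOMENTUM": 0.1, "WEIGHT_DECAY": 0.0},
    "SOLVER": {"MOMENTUM": 0.9},
    "NUM_GPUS": 1,
}

cleaned = drop_keys(cfg, ["BN.MOMENTUM", "SOLVER"])
print(cleaned)
```

Reading and writing the file itself would need a YAML library such as PyYAML (`yaml.safe_load` and `yaml.safe_dump`), which is left out here to keep the sketch dependency-free.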

Milkve commented 4 years ago

I ran into the same problem. @doursand, do you mean removing the BN and SOLVER sections at the same time? Your solution says to remove only BN.MOMENTUM and the whole SOLVER section, but in the modified .yaml file the entire BN section is missing?